Lead Site Reliability Engineer

Our client is seeking an experienced Lead Site Reliability Engineer to drive reliability strategy, operational excellence, and automation across global cloud infrastructure. This role is critical in ensuring platforms remain highly available, scalable, secure, and performant even during extreme traffic spikes or infrastructure failures.

As the Lead Site Reliability Engineer, you will combine deep technical expertise with leadership capability to build resilient distributed systems, lead incident response, and define reliability standards.

Key Responsibilities

Global Reliability Strategy

  • Define and implement SRE vision, principles, and governance across global and regional teams.
  • Establish enterprise SLIs, SLOs, SLAs, and error budget frameworks aligned with business impact.
  • Standardize production readiness reviews and reliability assessments.

Platform Architecture & Scalability

  • Architect and govern highly available, multi-AZ / multi-region cloud infrastructure (AWS).
  • Lead Kubernetes platform strategy and container orchestration standards.
  • Drive Infrastructure as Code adoption (Terraform preferred) across regions.
  • Design global disaster recovery (DR) and business continuity strategies.
  • Ensure resilience and elasticity during high-traffic events.

Operational Leadership

  • Own the global incident management framework (SEV classification, escalation, communication).
  • Lead major incident response and executive stakeholder updates.
  • Conduct root cause analysis (RCA) and champion blameless postmortems.
  • Drive measurable improvements in MTTD and MTTR.
  • Reduce operational toil through automation and platform engineering best practices.
  • Define enterprise observability strategy (metrics, logs, tracing).
  • Standardize monitoring frameworks and alert quality across regions.
  • Lead performance optimization initiatives (latency, throughput, resilience).
  • Improve deployment reliability using progressive delivery models (Blue/Green, Canary).

Security, Risk & Compliance

  • Enforce Least Privilege and Defense-in-Depth principles.
  • Partner with Security teams to embed DevSecOps practices.
  • Ensure compliance with global regulatory standards (SOC 2, ISO, PCI where applicable).

Key Qualifications

  • At least 5 years of experience in SRE or a related field (DevOps, Platform Engineering, or Cloud Infrastructure).
  • Proven leadership experience in global or multi-region environments.
  • Strong track record managing high-availability, mission-critical production systems.
  • Hands-on expertise with AWS and cloud-native architectures.
  • Deep knowledge of Kubernetes and container orchestration.
  • Hands-on experience with Infrastructure as Code (Terraform preferred).
  • Strong understanding of CI/CD, GitOps, and automation-first practices.
  • Experience with observability platforms (Prometheus, Grafana, ELK, Datadog, OpenTelemetry).
  • Strong networking and distributed systems knowledge.
  • Experience managing major incidents in high-pressure environments.
  • Excellent stakeholder communication skills with global and regional teams.

Due to the high volume of applications, our team will only be in touch if your application is shortlisted.

Robert Walters Recruitment (Thailand) Limited
Recruitment License No.: น. 1188 / 2551

Contract Type: Perm

Specialism: Tech & Transformation

Focus: Architecture

Industry: IT

Salary: Performance Bonus

Workplace Type: Hybrid

Experience Level: Senior Management

Location: Bangkok

Job Reference: 6NF05Z-1E344C75

Date posted: 27 February 2026

Consultant: Supapuck Siriprayoon