Company: SGI

Position: Site Reliability Engineering Lead – London

Location: Greater London

Job Description:

We’re looking for a true SRE leader with a strong software engineering background. This isn’t a DevOps “on-call only” role — you’ll need to be comfortable reading and writing production code, deeply understanding application behaviour, and working alongside developers as a technical peer.

You’ll lead and mentor the SRE team, setting direction and raising the bar for reliability across our systems. You’ll take end-to-end ownership of production, ensuring availability, performance, and effective incident response, while defining SLIs and partnering with Product on meaningful SLOs and error budgets.

In practice, that means you’ll:

Own production systems (availability, performance, incident response)
Define SLIs/SLOs and use error budgets to guide decisions
Run incident management, on-call, and blameless postmortems
Get hands-on with code (PHP, Java/.NET) to troubleshoot and improve reliability
Drive automation and reduce operational toil
Build observability that gives real insight into system health
Partner with engineers to embed reliability into the SDLC

A big part of the role is shaping culture — creating a blameless environment, improving how we respond to incidents, and driving continuous, systemic improvements. You’ll also lead on capacity planning, performance optimisation, and cost efficiency as the platform scales.

We’re looking for someone who brings strong technical leadership, communicates clearly (especially during incidents), and takes real ownership of problems through to resolution. You should be comfortable operating at scale, have deep experience with SLIs/SLOs, incident management, and observability tooling, and be at home working with Linux, databases, cloud platforms (ideally Azure), Kubernetes, and Infrastructure as Code. Just as importantly, you should enjoy tackling complex, imperfect systems — and turning them into something reliable, scalable, and well-understood.

…

Posted: March 31st, 2026
