Company: London Stock Exchange

Apply for the Senior Site Reliability Engineer role

Location: Nottingham

Job Description:

Requirements

Bachelor’s Degree or equivalent experience in Computer Science, Engineering, or a related field
5+ years of hands‑on technical experience in SRE, Platform Engineering, Infrastructure, or related roles
Strong experience with AWS (or Azure), including services such as EKS, ECS, EC2, networking, IAM, and managed services
Solid understanding of cloud security principles and experience collaborating with security teams
Strong background in Linux systems administration
Proven experience designing and operating observability platforms, including monitoring, logging, and alerting
Hands‑on experience with Datadog for metrics, logs, APM, and alerting
Strong understanding of SRE principles, including SLOs, error budgets, incident management, and reliability engineering
Experience working closely with architecture and engineering teams on system design and delivery
Experience with cloud cost optimization strategies and tooling
(Desirable) Experience supporting multi‑cloud or hybrid environments
(Desirable) Exposure to Infrastructure as Code (e.g., Terraform, CloudFormation)
(Desirable) Experience in large‑scale, complex, or regulated environments
(Desirable) Knowledge of vector databases and RAG architectures for building internal SRE knowledge assistants
(Desirable) Knowledge of Generative AI and LLM platforms (e.g., Claude, Amazon Bedrock)
Strong technical authority with the ability to influence design and operational decisions
Highly collaborative, comfortable working across architecture, engineering, security, and operations teams
Calm and methodical under pressure, especially during incidents and critical issues
Pragmatic problem‑solver who balances reliability, security, cost, and delivery speed
Clear communicator, able to explain complex technical concepts to diverse audiences

What the job involves

We are evolving our Site Reliability Engineering capabilities to strengthen reliability, observability, security, and operational excellence across our Risk Intelligence division
As a Senior SRE, you will be a senior, hands‑on technical contributor who helps shape the foundations of reliability across both new and existing platforms
You will collaborate with Architecture, Engineering, Security, and Platform teams to ensure reliability is built into systems from day one
While this is not a people‑management role, you will work closely with global teams and may occasionally be called upon for major incidents or critical issues
This position requires a highly proactive, hard‑working expert with strong leadership presence and ownership of platform reliability outcomes
We are looking for a person who is passionate about reliability engineering and who brings a continuous improvement approach to everything they do!
Lead the establishment of SRE foundations for new projects: building environments, setting up monitoring and alerting, and ensuring operational readiness from day one
Define, implement, and champion observability standards, tooling, and guidelines across metrics, logs, traces, and SLIs/SLOs
Design and evolve monitoring and alerting solutions that improve visibility, reduce toil, and strengthen system health
Continuously drive reliability improvements across our environments through incident reduction, performance tuning, and building resilient patterns
Partner with Security teams to ensure our platforms meet compliance, security, and risk‑management expectations
Influence architectural and design decisions through data‑driven cloud cost optimization and efficiency initiatives
Be a technical leader and mentor, supporting engineers, shaping engineering standards, and fostering a culture of learning and development

Posted: June 1st, 2026