Site Reliability Engineer (SRE / Observability Technical Lead)
Company: Deepstreamtech
Location: London
Posted: May 20th, 2026
Requirements
- The ideal candidate will have deep expertise in Application Performance Monitoring (APM), Infrastructure as Code (IaC), automation, and distributed tracing using OpenTelemetry
- 5+ years of experience in SRE, Observability, or DevOps roles, with leadership responsibilities
- Proven expertise with Application Performance Monitoring (APM) tools such as New Relic, Datadog, AppDynamics, or Dynatrace
- Hands‑on experience with OpenTelemetry (OTel) for distributed tracing and observability instrumentation
- Strong proficiency in Infrastructure as Code (IaC) using Terraform
- Solid understanding of cloud platforms including AWS, GCP, or Azure
- Experience with automation/configuration management tools like Ansible, Chef, or Puppet
- Deep knowledge of CI/CD pipelines and tools such as GitHub Actions, Jenkins, or Azure DevOps
- Experience managing Kubernetes and containerized environments (Docker, Helm)
- Familiarity with log aggregation and analysis platforms like ELK Stack or Splunk
- Excellent leadership, communication, and collaboration skills
What the job involves
- We are seeking an experienced Site Reliability Engineer (SRE) / Observability Technical Lead to join our team and drive the strategy and execution of observability and reliability projects for our clients
- As a lead, you will guide the design, implementation, and continuous improvement of observability solutions, ensuring system reliability, performance, and scalability while fostering best practices in SRE and DevOps
- Lead the strategic development and management of observability and reliability frameworks across the organization, ensuring alignment with business goals and technical requirements
- Design and implement monitoring and observability solutions, collaborating with engineering teams to define standards and best practices
- Manage Infrastructure as Code (IaC) initiatives using Terraform, coordinating with cloud and infrastructure teams to ensure scalable and secure deployments
- Drive automation strategies for monitoring, alerting, and logging pipelines, focusing on process improvements and operational efficiency
- Develop and maintain comprehensive observability roadmaps, including distributed tracing, logging, and metrics collection strategies
- Collaborate with product management, sales, and pre‑sales teams to provide technical expertise and support during solution design and customer engagements
- Lead cross‑functional teams to enhance CI/CD pipelines and deployment reliability, ensuring smooth integration of observability tools and practices
- Engage with vendors and strategic partners to evaluate, select, and integrate observability and monitoring solutions, ensuring alignment with organizational needs and fostering strong collaborative relationships
- Mentor and develop junior engineers and analysts, fostering a culture of reliability, observability, and operational excellence