Company: TechNET IT Recruitment Ltd

Apply for the Lead Observability Engineer role

Location: Basingstoke

Job Description:

Senior / Lead Observability & Cloud Infrastructure Engineer

We are seeking an experienced Senior / Lead Observability & Cloud Infrastructure Engineer to join a large-scale digital transformation programme. The successful candidate will play a key role in designing, implementing and enhancing observability capabilities across modern cloud‑native platforms, with a particular focus on Dynatrace.

This position requires a strong blend of hands‑on observability expertise, AWS infrastructure knowledge, and experience supporting distributed microservice‑based applications running in containerised environments.

Key Responsibilities

Lead the design, implementation and optimisation of Dynatrace monitoring solutions across complex cloud environments.
Configure and maintain dashboards, alerting frameworks and end‑to‑end observability for customer‑facing digital services.
Implement Dynatrace instrumentation and monitoring across cloud infrastructure, APIs, microservices, containers and databases.
Work closely with engineering, platform and operations teams to improve service visibility and operational resilience.
Analyse and troubleshoot performance, availability and reliability issues across distributed systems.
Support the adoption of observability best practices and drive continuous improvement initiatives.
Design and implement proactive alerting strategies to reduce incident impact and improve service reliability.
Document monitoring architectures, operational procedures and technical solutions.

Required Experience

Strong hands‑on experience implementing and administering Dynatrace in enterprise‑scale environments.
Experience deploying and configuring Dynatrace monitoring, dashboarding, alerting and integrations.
Strong AWS experience, including services such as:
- EC2
- ECS
- EKS
- Lambda
- S3
- RDS
- IAM
- VPC
- CloudFormation
Strong understanding of cloud‑native and microservice‑based architectures.
Experience working with container technologies including Docker, ECS and/or Kubernetes.
Strong troubleshooting and root cause analysis skills within distributed environments.
Experience with monitoring and observability tooling such as Dynatrace, CloudWatch and related platforms.
Knowledge of Infrastructure as Code and automation tooling including CloudFormation and/or Terraform.
Experience working within DevOps, Platform Engineering or Site Reliability Engineering environments.
Experience within large‑scale enterprise or consultancy‑led environments.
Knowledge of CI/CD pipelines and deployment automation.
Experience defining service‑level objectives (SLOs), KPIs and operational metrics.
Exposure to additional observability or APM platforms such as Datadog, AppDynamics, New Relic or Splunk.

Posted: June 6th, 2026