Location: Gloucester (Hybrid, 3 days onsite)
Security Clearance: Must be eligible for UK Developed Vetting (DV)
We’re hiring a Site Reliability Engineer to join a high-performing engineering environment delivering critical, complex systems. This role sits at the intersection of software engineering and operations, with a strong focus on automation, scalability, and system resilience.
This is an excellent opportunity for someone with a software engineering background who is looking to move into a more systems-focused, reliability-driven career path without losing their hands‑on technical edge.
As an SRE, you’ll be responsible for ensuring the reliability, availability, and performance of mission‑critical systems. You’ll apply software engineering principles to infrastructure and operations challenges, reducing manual effort through automation and improving system design.
Key Responsibilities Include:
- Supporting and maintaining live services, ensuring high availability and performance
- Automating operational processes to reduce manual intervention
- Improving monitoring, alerting, and observability across systems
- Diagnosing and resolving incidents across the full technology stack
- Working closely with engineering teams to influence system design and reliability
- Participating in an on‑call rota (project‑dependent)
- Contributing to continuous improvement of DevOps and SRE practices
What We’re Looking For
We’re interested in candidates who bring a strong engineering mindset and enjoy solving complex systems problems.
Core Experience:
- 2+ years' commercial experience in this area
- Experience working with cloud platforms (AWS, Azure, or similar)
- Strong Linux/Windows command line skills (Bash, PowerShell)
- Understanding of distributed systems, scalability, and resilience
- Experience with monitoring/observability tools (e.g. ELK stack or similar)
- Familiarity with containers and microservices (e.g. Docker)
- Experience troubleshooting across infrastructure and application layers
- Exposure to 2nd or 3rd line support environments
- Knowledge of CI/CD and deployment tooling
- Experience with infrastructure as code or configuration management tools
- Understanding of ITIL or service management practices
Additional Requirements
- Willingness to participate in on‑call support (depending on project)
If you’re a software engineer looking to broaden your impact into reliability, systems, and large‑scale infrastructure, this role offers a strong platform to do exactly that.
