Scope of service
As infrastructure engineers within the Integration GCP team, the scope of service includes:
- Must have: Design, Build, and Operate: Deliver and maintain secure, automated GCP API Management platform capabilities, supporting both API Gateway and broader integration products.
- Platform Enablement: Enable product teams to deliver API‑first services at pace, leveraging reusable patterns and robust integration tools.
- Must have: Infrastructure Automation: Develop and maintain Infrastructure as Code (IaC) solutions for provisioning and managing GCP resources, ensuring repeatability and compliance.
- Security & Compliance: Embed security best practices and controls throughout the platform lifecycle, safeguarding organisational and customer data.
- Must have: Performance & Reliability: Define, monitor, and operate against service level objectives (SLOs/SLIs), ensuring high availability, performance, and fault tolerance.
- Continuous Improvement: Drive automation, observability, and performance tuning to reduce manual effort and improve platform reliability.
- Must have: Collaboration: Work closely with architecture and feature teams to evolve the cloud roadmap and platform products, contributing to documentation and enablement.
- Mentoring & Standards: Mentor team members and uphold engineering standards, fostering a culture of continuous learning and improvement.
SRE role specifics in addition to the above
- Must have: Be a hands‑on engineer, maintaining our Infrastructure as Code and CI/CD pipeline‑based products and services by responding to change, implementing enhancements, and improving reliability and customer experience.
- Must have: Observing, investigating & fixing service issues, with an engineering attitude – resolving via code changes and implementing improvements to prevent repeat issues.
- Must have: Implementing further automation and reducing toil, by utilising existing Cloud tooling or implementing new technologies.
In-scope technologies/products
- GCP Cloud
- GCP Networking
- GCP Load Balancer
- GCP Storage
- HashiCorp Terraform
- HashiCorp Vault
- Containers
- Backstage
Skillset
- Cloud Platform Engineering: Proven experience designing, building, and operating secure, automated cloud platform capabilities, with a focus on Azure (and readiness to cross‑train in GCP if needed).
- Infrastructure as Code: Proficiency with Terraform (minimum) and modern CI/CD systems (GitHub Actions, Harness, Jenkins).
- API Management: Deep understanding of GCP API Management (Apigee) infrastructure, and API Gateway solutions. Familiarity with API design and security (REST/OpenAPI, authentication/authorisation, mTLS, certificate lifecycle).
- Networking & Security: Experience with GCP Cloud Armor, GCP Networking, and embedding secure‑by‑design controls from design to runtime.
- Containers & Orchestration: Hands‑on with Google Kubernetes Engine (GKE), containers, and service‑mesh patterns (e.g., Istio).
- Automation & Observability: Implementing actionable observability, performance tuning, and automation to reduce toil. Defining and operating against SLOs/SLIs.
- Scripting & Tooling: Scripting in Bash, PowerShell, or Python. Familiarity with HashiCorp Vault, Harness, and Backstage is desirable.
- Collaboration & Mentoring: Ability to mentor engineers, contribute to communities of practice, and uphold platform engineering standards.
- Certifications: Relevant GCP certifications are desirable.
SRE role specifics in addition to the above
- Certifications relevant to the required service (GCP)
- Strong DevOps understanding, including experience of Infrastructure as Code and CI/CD pipelines, such as Terraform and Jenkins.
- Ability to quickly understand, update and write code in languages such as Python, Groovy, Bash and PowerShell
- Strong knowledge of Infrastructure as Code and creating modular, easy-to-maintain code
- A strong understanding of Cloud security, networking and APIs
- Experience in problem-solving, able to demonstrate logical thinking and excellent troubleshooting skills
- Hands‑on with Observability Tooling (Observability as Code and SLO‑based Dynatrace Monitoring)
- Strong understanding and demonstrable use of source control practice and collaborative working as part of an engineering team
- Experience of developing and administering Kubernetes clusters in a production environment
- Strong experience in automating to remove toil
- Strong knowledge of incident management and issue resolution
- Able to demonstrate a passion to continue to learn and develop your engineering skills
This role requires hybrid working from a client site in either Leeds or Bristol.