Site Reliability Engineer

Company: Intellegens
Location: Cambridge
Job Description:

Key features

Fantastic opportunity to help the business develop and thrive

Full-time, hybrid working

The opportunity

We are seeking a Site Reliability Engineer to maintain and develop our cloud infrastructure and our monitoring systems and processes, helping to ensure the reliability and security of the service we provide to our customers. This role reports to our Head of Platform.

Main duties and responsibilities

You will be responsible for the continued development of our monitoring systems and will use them to proactively identify and communicate performance, reliability, security and cost issues. You will assist in responding to incidents and in remediating vulnerabilities in our platform. You will also identify, plan and implement secure, robust improvements to our cloud infrastructure and deployment processes, working alongside other engineers to support our product roadmap. As part of the wider product engineering team, you will advocate throughout the design process for effective monitoring to ensure the performance, stability and security of our products, in line with our commitment to ISO 27001 compliance.

What makes you our next Site Reliability Engineer?

  • A Bachelor's degree (minimum 2:1) in computer science or a related field
  • 2+ years' experience in a professional DevOps, SRE, Platform Engineering or similar role
  • Self-motivated with strong problem-solving and analytical skills
  • Experience using and configuring monitoring tools, ideally Grafana and Prometheus, to surface insights and alert on potential issues
  • Experience using and configuring cloud infrastructure (ideally GCP but Azure also desirable)
  • Experience with IaC tools (ideally Terraform)
  • Experience with Docker, Kubernetes and Helm
  • Knowledge of security and reliability best practices for cloud infrastructure and application deployments to Kubernetes
  • Experience using Python and Bash for scripting or small CLI applications
  • Experience using Git for professional software development
  • Experience responding to and investigating security or reliability incidents in a distributed cloud environment
  • The ability to communicate technical challenges and opportunities to people outside your area of expertise
  • Some familiarity with the applications in our tech stack: NGINX, Flask (Python), React (TypeScript), PostgreSQL, OpenSearch, Valkey, Keycloak
  • Knowledge of administering Linux-based systems
  • Experience using CI tools, ideally CircleCI, to manage application deployments
  • Experience applying and monitoring compliance with information security policies
  • Experience applying Agile methodologies and working in sprints

The above is not an exhaustive list, and you are required to be flexible in your approach to carrying out your duties, which may change from time to time to reflect business needs or the company’s continuous improvement.

What can we offer you?

  • A competitive financial package – salary and share options
  • 5 weeks' annual leave (pro rata) and a flexible leave policy
  • Salary sacrifice pension, with company savings being paid into the scheme
  • A collaborative work environment with neither red tape nor bureaucracy
  • Scope for career development as an early team member
  • Support and resources to develop your skills and succeed in the role
  • Hybrid working arrangements and a great team culture
  • Access to an Employee Assistance Programme (EAP), a wellbeing champion, and financial advice
  • Enhanced sickness policy
  • Regular social and team building events

Posted: April 29th, 2026