Senior Platform Engineer (Site Reliability Engineering/Cloud)

Company: Beamery
Location: London
Job Description:

Requirements

  • We are seeking a hands‑on Platform Engineer with a passion for Site Reliability Engineering (SRE) and Cloud technologies
  • Hands‑on SRE experience, including being on‑call and maintaining cloud‑based services
  • Experience managing Kubernetes production clusters at scale, including cluster upgrades, resource optimization (autoscaling, quotas), and troubleshooting complex networking or scheduling issues
  • Proven track record of managing Kubernetes lifecycles, implementing security best practices (RBAC, Network Policies), and maintaining high availability for containerised workloads
  • Experience with Infrastructure as Code (IaC) principles, with particular proficiency in Terraform
  • Expertise in building and maintaining complex applications in the cloud
  • Some operational experience with our key infrastructure components is desirable: Kafka, MongoDB, PostgreSQL, Elasticsearch, and Istio
  • Solid software engineering foundations with the ability to build and maintain production‑grade tooling. Experience with Go is preferred; Node.js is a plus
  • Pragmatic problem solver, balancing a number of factors to deliver the best possible solution

What the job involves

  • Deepening our native integrations with SAP, Workday, Microsoft, and LinkedIn to seamlessly embed our skills intelligence into the platforms where critical workforce decisions are made
  • Embedding our agentic AI to help customers plan smarter for the future—powering workforce strategies, internal mobility, and skills forecasting
  • Advancing our use of proprietary LLMs and knowledge graph technology to help organisations unlock broader talent pools, make fairer decisions, and expand access to opportunity at scale
  • The Platform team at Beamery enables engineering teams to build scalable and reliable services by supporting a foundation of tools and common services
  • These services include our managed Kubernetes clusters, observability tooling, and much more
  • We’re responsible for setting best practices around incident management and cost controls whilst empowering teams to follow a "you build it, you run it" model
  • Building and evolving Platform’s managed services and tools to enable teams to run scalable and reliable services
  • Driving continuous improvement in incident response across Platform and the wider engineering organisation, including SLOs, alerting, and dashboards
  • Working with cutting-edge technology in an agile environment
  • Designing scalable and secure solutions to complex problems
  • Providing technical leadership and mentorship to other engineers

Posted: June 1st, 2026