Lead Platform Engineer

Company: Alfa AI
Location: London
Job Description:

We’re building the platform layer that powers industrial-scale AI infrastructure.

This is the foundation supporting next-generation AI workloads across enterprise, government, research, and frontier AI environments. The challenges are complex, the scale is significant, and the engineering standards are exceptionally high.

We’re looking for a hands-on Platform Engineering Lead who can combine technical leadership with deep engineering expertise. This is a role for someone who enjoys solving difficult infrastructure problems, mentoring strong engineers, and contributing directly to production systems.

The Role

As Platform Engineering Lead, you’ll help design, build, and scale the infrastructure platform that enables high-performance AI workloads to run reliably across large-scale GPU environments.

You’ll work across Kubernetes, bare metal infrastructure, networking, distributed systems, and platform tooling, helping shape both the technical direction and engineering culture of the team.

This is a leadership role, but we’re looking for someone who still enjoys writing code and remains close to the technology.

Responsibilities

  • Lead the design and development of scalable platform infrastructure for AI workloads.
  • Build and maintain Kubernetes-based platforms running on bare metal environments.
  • Design and optimise GPU provisioning, scheduling, and deployment capabilities.
  • Develop platform services and tooling using Python, Golang, or Rust.
  • Architect and troubleshoot networking across clusters, distributed systems, and large-scale environments.
  • Collaborate with infrastructure, platform, and AI engineering teams to deliver reliable, production-grade systems.
  • Drive engineering best practices across reliability, observability, security, and performance.
  • Mentor engineers and provide technical leadership across the platform function.
  • Contribute hands‑on to architecture, implementation, and operational excellence.

What We’re Looking For

  • Strong experience in Kubernetes platform engineering.
  • Proven experience working with bare metal infrastructure or bare metal cloud environments.
  • Hands‑on experience provisioning and deploying workloads across GPU environments.
  • Deep understanding of networking concepts across clusters and distributed systems.
  • Strong software engineering skills with production‑grade development experience in Python and either Golang or Rust.
  • Excellent Linux systems knowledge.
  • Experience building and operating distributed systems at scale.
  • Previous experience leading technical initiatives, mentoring engineers, or providing technical leadership.
  • Hands‑on bare metal infrastructure experience is essential.

Ideal Background

  • Experience supporting AI, machine learning, or high-performance computing workloads.
  • Exposure to large-scale infrastructure platforms serving enterprise or research environments.
  • Strong understanding of platform reliability, observability, and automation practices.
  • Passion for solving complex infrastructure challenges and building systems that operate at scale.

Why Join?

You’ll have the opportunity to build critical infrastructure at the forefront of the AI industry, working on systems that power some of the most demanding workloads in the world.

This is a chance to shape platform architecture, influence engineering direction, and work alongside exceptional engineers tackling problems at the intersection of infrastructure, distributed systems, and artificial intelligence.

Posted: June 15th, 2026