Position Overview
A serious 0→1 build. Brand new team. Fully remote across Europe. That alone already says a lot.
Whether you’re working from a beach in Barcelona or a cottage in the Cotswolds, your location is secondary to the code you ship. This role isn’t about tracking desk hours; it’s about the impact of what you’re building. I’m looking for a Senior ML Engineer specialising in inference optimisation.
This is a chance to join a fast-growing tech unicorn building a next-generation AI platform for developers. The company originally made its name by helping engineering teams dramatically optimise cloud spend. That same mindset is now being applied to AI infrastructure, helping developers build, deploy and scale AI-powered features faster, more efficiently, and at a significantly lower cost.
You’ll be joining early in a brand new team, working on a genuine 0→1 product build where engineering quality and performance really matter. This isn’t about academic benchmarks or theoretical work; it’s about solving real production problems at scale.
Responsibilities
- Day to day, you’ll be deep in inference optimisation work using tools like vLLM, Triton, SGLang, and TensorRT.
- The focus is on pushing performance in the real world: reducing latency, improving model initialisation times, and building distributed systems that make high-performance AI both accessible and cost-efficient.
Qualifications
- You’re an expert in Python with proven experience in ML inference optimisation.
- You’ll ideally have hands-on experience tuning inference engines and working with production-scale systems using tools like vLLM, Triton, SGLang, and TensorRT.
How to Apply
If you’d like to find out more, you can apply below or email ethan.farrell@linuxrecruit.co.uk
