A pioneering research-focused engineering team is seeking Systems Research Engineers to help shape the next generation of AI-native infrastructure. The rapid evolution of large-scale AI models is transforming how modern computing systems are designed and deployed, and this team is building the infrastructure that powers intelligent systems at scale, redefining how models are trained, served, and optimised across distributed environments. The role sits at the intersection of systems research and large-scale engineering, focusing on distributed architectures for training, serving, and deploying advanced AI models.
This role offers a unique blend of hands-on engineering and forward-looking research, ideal for engineers who want to push the boundaries of distributed systems and AI infrastructure while working on real-world, high-impact platforms.
What You’ll Work On
- Build and experiment with distributed system components tailored for data-intensive and AI-driven workloads.
- Design scalable infrastructure capable of operating across diverse hardware environments including CPUs, GPUs, and accelerators.
- Develop high-performance model serving systems with a focus on efficiency, scalability, and resilience.
- Analyse system behaviour using profiling tools to uncover performance bottlenecks and optimisation opportunities.
- Improve memory usage, caching strategies, and scheduling efficiency in large-scale inference systems.
- Create solutions that enable low-latency, multi-tenant AI services in distributed environments.
- Explore and prototype new approaches to inference architecture and cluster-level orchestration.
- Translate technical innovations into tangible outcomes, including internal adoption and external publications.
- Work closely with global teams to shape long-term infrastructure direction and strategy.
What You’ll Bring
- PhD in Computer Science, Electrical Engineering, or a related discipline.
- Strong foundation in distributed systems and operating systems principles.
- Understanding of machine learning infrastructure and large-scale model serving.
- Experience with systems-level programming in C/C++.
- Proficiency in Python for experimentation and rapid prototyping.
- Familiarity with distributed algorithms and system design trade-offs.
- Experience using performance analysis and profiling tools.
- Ability to communicate complex ideas clearly and work effectively in collaborative environments.
Preferred Qualifications
- Doctoral research in distributed systems, large-scale infrastructure, or AI platforms.
- Contributions to recognised systems or machine learning conferences.
- Hands-on experience with load balancing, fault tolerance, or cluster scheduling.
- Exposure to distributed caching, state management, or high-performance cloud systems.
- Experience building or optimising large-scale AI or cloud infrastructure.
Why This Role Stands Out
- Be part of a team shaping the infrastructure behind next-generation AI systems.
- Work on problems that combine deep technical research with real-world deployment.
- Gain exposure to cutting-edge architectures in distributed computing and AI.
- Collaborate with globally recognised experts in systems and machine learning.
- Opportunity to publish, innovate, and influence future technology directions.
- Accelerate your career in one of the fastest-growing areas of technology.
If you’re a motivated and skilled professional ready for your next challenge, apply now or send your CV to
…
