Infrastructure & MLOps Engineer

Company: Deepstreamtech
Location: Bristol
Job Description:

Requirements

  • Knowledge of Python
  • Familiarity with cloud services (e.g. AWS)
  • Experience managing or developing in Linux environments
  • Understanding of CI/CD principles
  • Experience using Kubernetes (k8s)
  • (Desirable) Experience maintaining machine learning applications
  • (Desirable) Experience deploying ML orchestration tools (e.g. NV Ray, KFP, SkyPilot)
  • (Desirable) Experience managing ML accelerator hardware (e.g. using NVIDIA DCGM)
  • (Desirable) Experience with Infrastructure as Code (IaC) tools (e.g. Terraform/OpenTofu)
  • (Desirable) Experience with GitHub Actions
  • (Desirable) Experience with modern observability tooling (e.g. Prometheus)
  • (Desirable) Experience with Grafana
  • (Desirable) Knowledge of Go/Java/C++ (or similar language)

What the job involves

  • Join our dynamic Software Infrastructure team and take a pivotal role in scaling and managing our infrastructure
  • You will develop essential tools and services that empower our broader software team
  • Your contributions will improve the build, test, deployment, and productisation processes of our machine learning software components
  • Work with our High-Performance Computing (HPC) AI platforms and gain invaluable experience in distributed systems
  • The Software Infrastructure team provides critical platforms and services for software development teams across the business
  • Our responsibilities include managing the CI platform and services, build engineering, component integration, and packaging and release systems
  • We operate in squads, fostering a culture of service ownership and empowerment for our engineers
  • We focus on long-term engineering solutions and strive to eliminate toil wherever possible
  • Develop, own, and maintain tools and services to support AI research and engineering teams
  • Deploy and maintain services with Kubernetes and Docker
  • Manage our cloud infrastructure using tools such as Terraform

Posted: May 18th, 2026