Staff Cloud SRE: AI/ML Platform & GPU Compute

Deepstreamtech is looking for a Staff Site Reliability Engineer to shape the reliability of large-scale AI systems and GPU compute infrastructure. In this foundational role, you will establish reliability frameworks and operational standards to ensure the performance of cloud infrastructure.

Your responsibilities will range from defining SLOs to participating in a 24/7 on-call rotation. Ideal candidates will have strong SRE experience, particularly with GPU environments, Kubernetes, and cloud platforms such as AWS, GCP, or Azure.

Company: Deepstreamtech

Location: London

Posted: May 20th, 2026