Job Description
- Contribute to building and evolving the platform (infrastructure + reusable abstractions) that standardises data engineering workloads (batch/streaming pipelines, data processing) and traditional ML workflows (feature engineering, training, batch/real-time serving) across teams
- Implement platform-level IaC, CI/CD, and environment management to support consistent, reproducible workloads across dev/test/prod
- Build and maintain components using Python and Spark for data processing, shared datasets, and platform services
- Contribute to shared services for data and ML lifecycle management (data pipelines, experiment tracking, versioning, lineage, permissions), aligned to enterprise governance (e.g. Unity Catalog)
- Support the implementation and operation of a centralised AgentOps capability (LLM gateway, tool integration, prompt and version management)
- Contribute to agent‑specific lifecycle and safety controls (evaluation pipelines, guardrails, access control), with guidance from senior engineers
- Enhance observability across both domains:
  - Data & ML Ops: data quality, pipeline reliability, model performance
  - Agent Ops: traces, responses, evaluations, cost and behaviour monitoring
- Contribute to problem solving across platform reliability, performance, and security for data, ML, and agent workloads
- Apply security and compliance best practices (RBAC/ACLs, secure configuration, identity and access management), supporting a secure-by-default platform design
- Collaborate with Data Engineers, Data Scientists, and ML Engineers to enable adoption of platform capabilities across ASOS Tech
- Contribute to documentation, standards, and best practices across the platform
Qualifications
- Experience in Data Platforms, Data Engineering, Cloud Engineering, or ML Platform Engineering roles, with exposure to Azure
- Strong hands‑on experience with Python and Apache Spark
- Experience with Azure data platform technologies such as Azure Databricks, ADLS Gen2, and Unity Catalog
- Working knowledge of security and access management (RBAC, ACLs, identity concepts such as Entra ID)
- Familiarity with Infrastructure as Code using Terraform
- Experience with CI/CD (Azure DevOps, GitHub Actions)
- Basic understanding of Azure networking (VNets, NSGs, Private Endpoints)
- Exposure to Docker/Kubernetes in cloud environments is beneficial
- Awareness of AgentOps patterns (LLM gateways, prompt/version control, evaluation, observability) is a plus
- Good communication and collaboration skills, with a strong focus on learning and continuous improvement
Benefits
- Employee discount (hello ASOS discount!)
- Employee sample sales
- 25 days paid annual leave + an extra celebration day for a special moment
- Private medical care scheme
- Fixed Annual Payment each year in addition to your salary, as an extra thank you from us
- Opportunity for personalised learning and in-the-moment experiences that enable you to thrive and excel in your role
