Overview
Join us as an AI Ops Engineer to build and run an enterprise AI Factory within our Card Merchant Services organisation, enabling AI‑driven change across the merchant payments lifecycle.
This role focuses on acquiring, risk and fraud, and merchant servicing, delivering a secure, scalable, and well‑governed AI platform that operates effectively in a highly regulated payments environment. You will be accountable for the end‑to‑end operationalisation of AI, spanning model, prompt, and agent lifecycles; deployment and monitoring; guardrails; and cost optimisation, ensuring AI solutions are production‑ready, auditable, compliant, and scalable across merchant payment use cases.
You will also own the end‑to‑end engineering of GenAI and ML platforms, embedding governance, observability and operational resilience by design, so that teams can deploy and run AI solutions at scale with clarity, assurance and accountability.
Responsibilities
- Supporting production‑scale LLMOps / AgentOps lifecycles, including CI/CD for models, prompts and agents, versioning, structured evaluation, controlled releases, and monitoring of drift, hallucination and agent behaviour.
- Contributing to the build and operation of cloud‑based AI platforms on AWS, including services such as Amazon Bedrock and agent orchestration capabilities, applying solid Python development skills and an understanding of secure API and microservices design.
- Supporting AI platforms with embedded governance approaches, including policy‑as‑code, guardrails, alignment to model risk frameworks, and maintaining lifecycle traceability with audit‑ready evidence.
- Applying observability and reliability practices to AI platforms, including contributing to service level measures and monitoring latency, cost, quality and failure modes, supported by tools such as CloudWatch and OpenTelemetry.
- Applying an understanding of AI cost and performance considerations, including token usage, model selection and caching approaches, and balancing trade‑offs between latency, cost and output quality.
Highly Valued Skills
- Retrieval Augmented Generation (RAG) and vector database implementation, with practical experience using technologies such as OpenSearch, FAISS or similar to support scalable, production‑ready retrieval workflows.
- Data pipeline engineering, building and operating AI‑ready pipelines using AWS Glue, S3 and related services to support model training, inference and evaluation.
- Advanced observability and reliability engineering, including experience with CloudWatch, OpenTelemetry and established production resilience patterns for AI workloads in critical banking systems.
Location
This role will be based in London.
Values & Expected Behaviour
All colleagues will be expected to demonstrate the Barclays Values of Respect, Integrity, Service, Excellence and Stewardship. They will also be expected to demonstrate the Barclays Mindset – to Empower, Challenge and Drive. The four LEAD behaviours are: L – Listen and be authentic, E – Energise and inspire, A – Align across the enterprise, D – Develop others.
