Venesky‑Brown’s client, a public sector organisation in Edinburgh, is currently looking to recruit an AI Python Engineer for an initial 12‑month contract, with the option to extend, at a rate of £600/day (Outside IR35). The role will be based in Edinburgh; however, attendance at the project site will only be required on an as‑needed basis.
Responsibilities
- Build and maintain the shared Python platform library that all workloads depend on — configuration, logging/telemetry, Azure clients, model interface abstractions.
- Hold a high engineering bar across the codebase: type safety, test coverage, linting, dependency hygiene.
- Keep the library’s abstractions clean as models, transports, and workloads rotate underneath them.
- Implement and maintain Temporal‑based workflow workers for the document processing pipeline (ingestion – extraction – reasoning/rule‑assertion – deterministic mapping).
- Build the plumbing that loads and serves open‑weight models inside workers (embedded inference engine pattern), including model provenance verification and warm‑load behaviour.
- Implement per‑queue scaling, priority isolation, and burst handling.
- Develop and maintain Terraform for the platform estate across non‑prod, pre‑prod and prod environments.
- Own the GitOps deployment path (ArgoCD) and the container build/publish pipeline into the registry.
- Operate workloads on AKS — namespaces, autoscaling (KEDA), service mesh, policy and security add‑ons.
- Build telemetry, dashboards and alerting (Managed Prometheus / Grafana, App Insights) shaped for first‑line consumption.
- Implement support automation as a first‑class platform layer — self‑healing operator patterns, runbooks‑as‑code — to minimise manual operational handover. (Directly supports the RUN‑readiness risk on this programme.)
- Implement validation and verification logic so extracted/derived data meets quality standards before it leaves the pipeline.
- Integrate the platform with enterprise systems (message bus, databases, document stores) and support the AI engineers in wiring new model workloads in.
- Work to the team’s design and task discipline (low‑level design templates, tightly scoped tasks, ADO tracking).
- Document architecture, runbooks and operational guidance to support deployment and ongoing support.
Essential Skills
- Strong, demonstrable production Python — typed code (mypy/strict or equivalent), testing (pytest), linting, packaging and dependency management.
- Containers and Kubernetes in production: building images, deploying and operating workloads, debugging in‑cluster.
- Infrastructure as code — Terraform (or equivalent) with a modular, environment‑driven structure.
- CI/CD and GitOps — automated build/test/deploy pipelines; declarative deployment.
- Cloud platform engineering, ideally Azure (AKS, Service Bus, managed Postgres, Key Vault, Blob Storage, managed identity).
- Observability — metrics, logs, traces; building dashboards and alerts, not just consuming them.
- Comfort working around AI/ML workloads — integrating model‑serving runtimes, understanding inference resource behaviour — without needing to own model science.
- Experience delivering and operating services end‑to‑end, including the support and maintenance phase.
- Awareness of secure handling of sensitive data and relevant data‑protection obligations.
Desirable Skills
- Temporal.io or another durable‑execution / workflow‑orchestration framework.
- vLLM or similar LLM‑serving runtimes; familiarity with GPU workload scheduling on Kubernetes.
- KEDA, Istio (or another service mesh), ArgoCD.
- Experience supporting A/B model rollouts behind a stable interface (the reasoning queue has a model‑swap on the roadmap).
- Vector search infrastructure (e.g. Milvus / self‑hosted vector DB on Kubernetes) — a candidate roadmap component.
- Experience contributing to platform support‑automation / SRE‑style operability as a deliverable in its own right.
- Exposure to regulated / public‑sector delivery and associated governance.
If you would like to hear more about this opportunity, please get in touch.
