Company: Infoplus Technologies UK Limited

Apply for the Data Engineer role

Location:

Job Description:

Key responsibilities on this engagement
• Run the Sprint 1 architecture review of the existing UAT codebase (S3 + Glue + S3 Tables + OpenSearch + Athena) and deliver written gap findings.
• Design the metadata schema, taxonomy, and field catalogue (Light, Brain, Power).
• Tune data orchestration: Glue jobs, Athena queries, S3 Tables config, scheduling.
• Lead the deep-dive technical sessions with analysts on visualization requirements.
• Build and validate the simulation data onboarding pipeline against real data — including the 30 GB-per-run acoustic spectra dataset.
• Configure and validate the OpenSearch k-NN vector store and the Bedrock embedding pipeline.
• Author the AI/ML data export format specification and the AI onboarding pattern document.
• Co-design the API middleware blueprint with the Cloud Infrastructure Architect.
Must-have
• Principal-level hands-on data engineering on AWS (7+ years)
• Deep production experience with S3, S3 Tables, Glue, Athena, and OpenSearch (including k-NN / vector search)
• Built and shipped vector embedding workloads
• Strong metadata modelling and data taxonomy design experience for scientific or engineering domains
• Comfort working with Parquet, JSON-LD, and large binary scientific data formats (mesh, time-series, spectra)
• Python proficiency; PySpark / Glue job tuning experience

…

Posted: May 28th, 2026