Developer

Company: Queen Square Recruitment
Location: London
Job Description:

The Role

You will be part of a specialist engineering team responsible for designing, building, and optimising end-to-end financial instrument mastering pipelines. These pipelines span ingestion, normalisation, bi-temporal processing, and publication into enterprise data platforms.

You will work closely with data architects, domain experts, and QC engineers to deliver scalable, reliable, and high-performance data solutions across Azure and Microsoft Fabric ecosystems.

Key Responsibilities

  • Build and maintain PySpark-based data pipelines for financial instrument mastering across multiple data sources
  • Design and implement bi-temporal data processing models (system time + valid time) including Slice, Resolve, Coalesce, and Diff logic
  • Develop optimised Azure Cosmos DB data models, including partitioning, indexing, change feed processing, and point-read optimisation
  • Integrate external APIs for entity resolution and matching services (PermID / IAAS) with robust retry and batching mechanisms
  • Design publication pipelines to convert bi-temporal data into uni-temporal outputs and publish via Microsoft Fabric / Parquet-based lakehouse architectures
  • Implement data quality frameworks using Great Expectations to ensure accuracy and compliance
  • Build robust unit and integration tests using pytest for PySpark and Cosmos DB components
  • Support and maintain CI/CD pipelines (GitLab CI) including Python packaging, Artifactory deployment, and ARM-based infrastructure provisioning
  • Work with YAML-driven configuration for mastering rules, schemas, and environment setup
  • Monitor and troubleshoot production pipelines using Eventstream telemetry, KQL, and Datadog observability tools
  • Deliver scalable transformation logic, optimised aggregations, and high-performance data processing workflows
  • Implement data governance controls including data masking, role-based access, and compliance policies
  • Continuously tune and optimise workloads for performance, cost efficiency, and reliability

Required Skills & Experience

  • Strong experience in Python and PySpark (Spark SQL, DataFrame API, Structured Streaming)
  • Hands-on experience building large-scale ETL / streaming data pipelines
  • Experience working with Azure Cosmos DB (NoSQL) including data modelling and performance tuning
  • Strong knowledge of Azure Data Lake Storage (ADLS / OneLake / ABFS)
  • Experience implementing bi-temporal or SCD Type 2 data models
  • Strong understanding of data quality frameworks (e.g., Great Expectations)
  • Experience with CI/CD pipelines (GitLab / Azure DevOps) and automated deployments
  • Strong testing discipline using pytest, mocking, and integration testing approaches
  • Experience working with YAML/JSON configuration and infrastructure-as-code (ARM templates)
  • Strong understanding of distributed data processing and Spark-based architectures
  • Experience working with financial or time-series datasets (market data, reference data, risk data preferred)
  • Strong communication skills and ability to work with cross-functional stakeholders

Desirable Experience

  • Microsoft Fabric (Notebooks, Eventstream, Lakehouses, Spark Job Definitions)
  • Financial instrument/reference data (ISIN, CUSIP, LEI, PermID)
  • Entity resolution / matching systems and enrichment APIs
  • Delta Lake and Change Data Feed (CDF)
  • Cosmos DB performance optimisation (RU tuning, bulk operations, concurrency)
  • Jinja2 templating or code generation approaches
  • SonarQube or similar code quality tooling
  • Monorepo development with modern Python packaging tools (uv / Hatchling)
  • Power BI / semantic modelling experience
  • Knowledge of regulatory and compliance standards (GDPR, SOX)

Technology Stack

Python 3.11+, PySpark 3.5, Spark SQL

Azure Cosmos DB, ADLS, OneLake, Delta Lake, Parquet

Microsoft Fabric (Eventstream, Notebooks, Lakehouse)

Great Expectations, LSEG Data Validation frameworks

GitLab CI/CD, JFrog Artifactory, ARM Templates

Datadog, Eventstream, KQL monitoring

Azure Key Vault, Azure CLI, Fabric APIs

Why Join

  • Work on a global financial markets transformation programme
  • Hands-on with next-generation Azure + Fabric data platforms
  • Exposure to bi-temporal modelling and financial instrument mastering systems
  • High-impact engineering role with modern cloud and streaming architecture
  • Opportunity to work with leading domain and technical experts in a regulated environment

Posted: May 28th, 2026