Company: EPAM Systems, Inc.

Apply for the Lead Data Engineer (Databricks, PySpark)

Location: London

Job Description:

We’re looking for a Lead Data Engineer (Databricks, PySpark) to join our team in London, UK in a hybrid working mode.

In this role, you will help shape and deliver next-generation data platforms. You will be hands-on in developing, implementing and optimizing scalable ETL workflows and data pipelines, leveraging the full capabilities of Databricks and modern cloud technologies. You will play a key part in the transition to a robust Lakehouse architecture, working closely with cross-functional teams in an agile environment.

This position is ideal for a data engineering leader who enjoys solving complex challenges, mentoring others and working at the forefront of Databricks technology. Experience with any major cloud provider is welcome, but a strong focus on Databricks is essential.

Responsibilities:

- Design, develop and maintain production-grade data applications, reusable frameworks and scalable data pipelines using Databricks, PySpark and Python/Scala
- Lead the architectural design and modernization of data platforms to a Lakehouse architecture leveraging Databricks-native technologies such as Delta Lake and Unity Catalog
- Drive advanced Spark performance tuning including handling data skew, optimizing Catalyst optimizer/query execution plans and managing cluster compute and memory efficiency for high-volume workloads
- Champion modern software engineering practices within the data ecosystem including CI/CD pipelines, Infrastructure as Code (IaC), rigorous code reviews, automated testing and version control
- Implement secure, scalable and highly available data solutions leveraging integrations between Databricks and major cloud services (AWS, Azure or GCP)
- Architect and support AI-driven data solutions including integrating Large Language Models (LLMs), building Agentic workflows and operationalizing GenAI or machine learning models within Databricks pipelines
- Act as a Technical Lead in an agile environment, collaborating with architects and product owners to decompose complex business requirements into actionable technical strategies, Epics and User Stories
- Mentor and upskill engineers, fostering a culture of engineering excellence, continuous learning and technical innovation
- Serve as a key technical liaison, effectively translating and communicating complex architectural decisions, data concepts and system capabilities to both technical and non-technical stakeholders

Requirements:

- Bachelor’s or Master’s degree in Computer Science, Software Engineering or a related field
- Deep, hands-on proficiency in PySpark with proven ability to tackle advanced performance tuning, data skew handling, memory management and Catalyst optimizer troubleshooting
- Extensive experience building production workloads on Databricks including knowledge of Databricks Workflows, Delta Lake and Unity Catalog for governance and security
- Demonstrable experience designing and migrating to Lakehouse architectures utilizing open table formats such as Delta Lake or Apache Iceberg
- Strong hands-on experience integrating Databricks with native cloud services on AWS, Azure or GCP
- Advanced programming skills in Python (Scala is a plus) with strong understanding of object-oriented and functional programming principles
- Proven track record of applying software engineering standards to data pipelines including CI/CD, Infrastructure as Code (e.g. Terraform), version control (Git) and rigorous code reviews
- Solid background in implementing automated testing frameworks and data quality validation within pipelines
- Proven experience as a Senior or Lead Engineer capable of driving technical strategy, making architectural decisions and decomposing complex solutions into Agile Epics and User Stories
- Strong ability to articulate complex technical concepts and trade-offs clearly to both technical peers and non-technical stakeholders
- Advantageous: Official Databricks certifications (e.g. Certified Data Engineer Professional, Spark Developer)
- Highly desirable: Hands-on experience or strong interest in AI and Agentic workflows including operationalizing LLMs, using frameworks like LangChain or LlamaIndex or leveraging Databricks ML/MosaicML for GenAI applications

We offer:

- EPAM Employee Stock Purchase Plan (ESPP)
- Protection benefits including life assurance, income protection and critical illness cover
- Private medical insurance and dental care
- Employee Assistance Program
- Competitive group pension plan
- Cyclescheme, Techscheme and season ticket loans
- Various perks such as free Wednesday lunch in-office, on-site massages and regular social events
- Learning and development opportunities including in-house training and coaching, professional certifications, over 22,000 courses on LinkedIn Learning Solutions and much more
- If otherwise eligible, participation in the discretionary annual bonus program
- If otherwise eligible and hired into a qualifying level, participation in the discretionary Long-Term Incentive (LTI) Program

All benefits and perks are subject to certain eligibility requirements.

Posted: May 21st, 2026