Senior Data Engineer Kafka and Hadoop Expert (Python)

Company: Smartedge Solutions
Location: Manchester
Job Description:

Overview

Smartedge’s client is looking for a Senior Data Engineer and Kafka/Hadoop Expert (Python) for a contract role in Sheffield, UK (Hybrid).

Job Summary:

  • Design and build Kafka-based streaming applications (Kafka Streams/ksqlDB) in Scala/Python for transformation, enrichment, and routing.
  • Implement end-to-end streaming pipelines (producers, stream processors, and consumers) with strong data quality, idempotency, and DLQ patterns; a minimal sketch follows this summary.
  • Model topics, schemas, and contracts (Avro/Protobuf/JSON) and maintain backward and forward compatibility.
  • Develop batch/stream interoperability with Spark/Structured Streaming for aggregation, feature generation, and storage in Parquet/ORC.
  • Integrate processed data into analytics/observability platforms (e.g., Splunk) for dashboards, alerting, and proactive insights.
  • Build automated validation, replay, and backfill mechanisms to ensure reliability and SLA adherence.
  • Apply observability to the pipelines themselves (metrics, traces, structured logs) and tune performance and cost.
  • Collaborate with platform/infra teams handling Kafka administration (brokers, security, ops) while owning application-side streaming logic.
  • Ensure security and compliance for application data paths (authn/z, encryption in transit/at rest, secret management).
  • Document data flows, schemas, and runbooks for streaming services.
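
Since several of these duties center on the consume-process-produce loop with dead-letter-queue (DLQ) handling, a minimal Python sketch may help calibrate expectations. It uses the confluent-kafka client; the broker address, topic names, consumer group, and the process() helper are all illustrative assumptions, not the client's actual setup:

```python
# Minimal consume-process-produce loop with a dead-letter queue (DLQ).
# Sketch only: topic names, config values, and process() are hypothetical.
import json

from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumption: local broker
    "group.id": "enrichment-app",            # hypothetical consumer group
    "enable.auto.commit": False,             # commit only after a safe hand-off
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})

consumer.subscribe(["orders.raw"])           # hypothetical input topic

def process(payload: dict) -> dict:
    """Placeholder for the transformation/enrichment logic."""
    payload["enriched"] = True
    return payload

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            raise RuntimeError(msg.error())
        try:
            out = process(json.loads(msg.value()))
            # Keyed on the source key so downstream consumers can dedupe;
            # idempotency relies on a stable key + at-least-once delivery.
            producer.produce("orders.enriched", key=msg.key(),
                             value=json.dumps(out).encode())
        except Exception as exc:
            # Poison messages go to a DLQ topic instead of blocking the stream.
            producer.produce("orders.dlq", key=msg.key(), value=msg.value(),
                             headers={"error": str(exc)})
        producer.flush()                      # per-message flush: simple, not fast
        consumer.commit(message=msg)          # commit only after the produce
finally:
    consumer.close()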

Responsibilities

  • Lead Kafka application development using Kafka Streams/ksqlDB, selecting appropriate producer/consumer patterns, partitioning/serialization strategies, and exactly-once/at-least-once semantics.
  • Design, develop, and maintain batch/stream interoperability solutions with Spark, Structured Streaming, Parquet, and ORC for feature generation and data storage (see the Structured Streaming sketch after this list).
  • Implement observability across pipelines, generating metrics, traces, and structured logs for dashboards, alerting, and proactive insights.
  • Develop automated validation, replay, and backfill mechanisms to guarantee reliability and SLA compliance (a replay sketch also follows this list).
  • Maintain data quality and reliability by enforcing idempotent processing, DLQs, replay/backfill strategies, lineage, and SLA‑aware designs.
  • Ensure compliance with security standards, including AuthN/Z, TLS/SASL, encryption in transit and at rest, and secret management.
  • Collaborate closely with Kafka platform and infrastructure teams, while retaining ownership of application‑side streaming logic.
  • Document data flows, schemas, and runbooks for all streaming services.
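
For the Spark/Structured Streaming responsibility above, a representative (and purely illustrative) PySpark job might read from Kafka, parse JSON events, and land partitioned Parquet. The paths, topic, and event schema below are assumptions:

```python
# Sketch of a Kafka -> Parquet pipeline with PySpark Structured Streaming.
# Requires the spark-sql-kafka connector on the classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("orders-to-parquet").getOrCreate()

schema = StructType([                      # hypothetical event schema
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "orders.enriched")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "/data/orders")               # assumed output location
    .option("checkpointLocation", "/chk/orders")  # required for recovery/replay
    .partitionBy("status")                        # partitioning for batch readers
    .start()
)
query.awaitTermination()
```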
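
And for the replay/backfill item, one common pattern is translating a timestamp into per-partition offsets and re-consuming from there. A sketch with confluent-kafka; the topic, group, and timestamp are assumed values:

```python
# Replay sketch: rewind a consumer group to a timestamp for backfill.
from confluent_kafka import Consumer, TopicPartition

REPLAY_FROM_MS = 1_700_000_000_000        # assumed epoch-millis replay point

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "orders-backfill",        # hypothetical replay group
    "enable.auto.commit": False,
})

metadata = consumer.list_topics("orders.enriched", timeout=10)
partitions = [
    TopicPartition("orders.enriched", p, REPLAY_FROM_MS)
    for p in metadata.topics["orders.enriched"].partitions
]

# Translate the timestamp into concrete offsets, then start reading there.
offsets = consumer.offsets_for_times(partitions, timeout=10)
consumer.assign(offsets)
# ...then poll() as usual, feeding events back through the pipeline.
```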

Qualifications

  • Proficiency in Scala and/or Python for streaming application development.
  • Experience with testing frameworks and CI/CD practices for stream processors.
  • Strong knowledge of schema management (Avro, Protobuf, JSON), schema registry usage, and compatibility strategies (see the Avro sketch after this list).
  • Expertise in stream and batch processing with Spark (including Structured Streaming), Parquet/ORC, partitioning/bucketing, and performance tuning.
  • Solid background in data quality and reliability, including idempotency, DLQ patterns, replay/backfill, lineage tracking, and SLA‑aware designs.
  • Hands-on experience with observability tools (metrics, tracing, and structured logging) for stream applications (an instrumentation sketch follows this list).
  • Familiarity with security and compliance requirements for streaming platforms (AuthN/Z, TLS/SASL, encryption, secret management).
  • Strong communication and documentation skills to work effectively with platform and admin teams.
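
As a reference point for the schema-management qualification, registry-backed Avro serialization in Python typically looks like the sketch below. The registry URL, topic, and the Order schema are invented for illustration:

```python
# Sketch: registry-backed Avro serialization with confluent-kafka.
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import MessageField, SerializationContext

SCHEMA = """
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "status", "type": "string", "default": "NEW"}
  ]
}
"""

registry = SchemaRegistryClient({"url": "http://localhost:8081"})
serializer = AvroSerializer(registry, SCHEMA)

# Adding a field with a default keeps the schema backward compatible:
# consumers on the new schema can still read records written with the old one.
payload = serializer(
    {"order_id": "42", "status": "SHIPPED"},
    SerializationContext("orders.enriched", MessageField.VALUE),
)
```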
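
Likewise, for the observability qualification, a bare-bones instrumentation pattern (Prometheus metrics plus JSON-structured logs) might look like this; metric names, the scrape port, and the log fields are assumptions:

```python
# Sketch: minimal metrics + structured logs for a stream processor.
import json
import logging
import time

from prometheus_client import Counter, Histogram, start_http_server

PROCESSED = Counter("events_processed_total", "Events processed", ["outcome"])
LATENCY = Histogram("event_processing_seconds", "Per-event processing time")

log = logging.getLogger("stream")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def handle(event: dict) -> None:
    start = time.perf_counter()
    try:
        ...                                   # processing logic goes here
        PROCESSED.labels(outcome="ok").inc()
    except Exception:
        PROCESSED.labels(outcome="error").inc()
        raise
    finally:
        LATENCY.observe(time.perf_counter() - start)
        # One JSON object per line: easy for Splunk-style platforms to index.
        log.info(json.dumps({"event": "processed", "key": event.get("id")}))

start_http_server(9100)                       # Prometheus scrape endpoint
```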

Posted: April 10th, 2026