Data Engineer

Company: Gazelle Global
Location: Sheffield
Job Description:

We are hiring an experienced Data Engineer to join our team in Sheffield, United Kingdom. The ideal candidate will have strong experience in Scala and/or Python for streaming applications, along with familiarity with testing frameworks and CI for stream processors.

Your responsibilities:

  • Design and build Kafka-based streaming applications (Kafka Streams/ksqlDB) in Scala/Python for transformation, enrichment, and routing.
  • Implement end-to-end streaming pipelines: producers, stream processors, and consumers with strong data quality, idempotency, and DLQ patterns.
  • Model topics, schemas, and contracts (Avro/Protobuf/JSON) and maintain backward/forward compatibility.
  • Develop batch/stream interoperability: Spark/Structured Streaming jobs for aggregation, feature generation, and storage in Parquet/ORC.
  • Integrate processed data into analytics/observability platforms (e.g., Splunk) for dashboards, alerting, and proactive insights.
  • Build automated validation, replay, and backfill mechanisms to ensure reliability and SLA adherence.
  • Apply observability to the pipelines themselves (metrics, traces, structured logs) and tune performance/cost.
  • Collaborate with platform/infra teams who handle Kafka admin (brokers, security, ops) while owning application-side streaming logic.
  • Ensure security and compliance for application data paths (authentication/authorization, encryption in transit and at rest, secret management).
  • Document data flows, schemas, and runbooks for streaming services.
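For candidates less familiar with the idempotency and dead-letter-queue (DLQ) patterns named in the responsibilities above, the following is a minimal, library-free sketch of the idea. The `process_stream` function, the record shape, and the in-memory lists are hypothetical stand-ins for real Kafka consumers, producers, and DLQ topics, not a production implementation.

```python
import json

def process_stream(records, handler, seen_keys, dlq):
    """Process each record at most once per key; route failures to a DLQ."""
    out = []
    for rec in records:
        key = rec["key"]
        if key in seen_keys:          # idempotency: skip already-processed keys
            continue
        try:
            out.append(handler(rec))
            seen_keys.add(key)
        except Exception as exc:      # poison message: dead-letter it with context
            dlq.append({"record": rec, "error": str(exc)})
    return out

# Usage: one duplicate delivery and one malformed ("poison") record.
records = [
    {"key": "a", "value": '{"amount": 10}'},
    {"key": "a", "value": '{"amount": 10}'},   # duplicate delivery
    {"key": "b", "value": "not-json"},         # poison message
]
seen, dlq = set(), []
results = process_stream(records, lambda r: json.loads(r["value"]), seen, dlq)
print(results)   # one parsed record: [{'amount': 10}]
print(dlq)       # one dead-lettered record (key "b")
```

In a real pipeline the `seen_keys` state would live in a changelog-backed store and the DLQ would be a separate Kafka topic, but the control flow is the same.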

Your profile:

Essential skills/knowledge/experience:

  • Kafka application development: Kafka Streams/ksqlDB, producer/consumer patterns, partitioning/serialization, exactly-once/at-least-once semantics.
  • Languages: Strong in Scala and/or Python for streaming apps; familiarity with testing frameworks and CI for stream processors.
  • Schema management: Avro/Protobuf/JSON, schema registry usage, compatibility strategies.
  • Stream/batch processing: Spark (including Structured Streaming), Parquet/ORC, partitioning/bucketing, performance tuning.
  • Data quality and reliability: Idempotent processing, DLQs, replay/backfill, lineage, and SLA-aware designs.
  • Observability: Metrics/tracing/logging for stream apps; integration with downstream dashboards/alerts.
  • Security/compliance: AuthN/Z in clients, TLS/SASL usage, secret management in code/services.
  • Collaboration: Work closely with Kafka platform/admin teams while focusing on application-layer streaming logic; strong communication and documentation.
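To illustrate the "compatibility strategies" item above: backward compatibility means consumers on the new schema can still read data written with the old one. The sketch below is a deliberately simplified, assumed model (schemas as dicts mapping field name to a `(type, has_default)` pair); real Avro/Protobuf registry checks are considerably more nuanced.

```python
def backward_compatible(old, new):
    """A new reader can decode old data only if every field the new schema
    requires either existed in the old schema (with the same type) or
    carries a default value."""
    for name, (ftype, has_default) in new.items():
        if name not in old:
            if not has_default:
                return False          # new required field, no default: old data unreadable
        elif old[name][0] != ftype:
            return False              # type change: old data unreadable
    return True

old = {"id": ("string", False), "amount": ("int", False)}
ok  = {"id": ("string", False), "amount": ("int", False), "currency": ("string", True)}
bad = {"id": ("string", False), "amount": ("int", False), "currency": ("string", False)}
print(backward_compatible(old, ok))   # True: added field has a default
print(backward_compatible(old, bad))  # False: new required field lacks a default
```

Forward compatibility is the mirror check (old readers against new data), which is why adding optional fields with defaults is the usual evolution strategy.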


Posted: April 10th, 2026