Staff Backend Engineer, Dubbing

Company: Synthesia
Location: London
Job Description:

About the role

You will work on the engineering systems powering Synthesia’s dubbing product, the multi-step pipeline that transforms existing videos into new-language versions while preserving lip sync, voice quality, timing, and overall video integrity.

Your role centers on the core challenge: building a production system that orchestrates complex, long-running jobs (often taking tens of minutes to hours) with reliability, observability, and quality at every stage. You’ll ensure that localized videos are indistinguishable from originals, working across transcription, speaker identification, translation, voice synthesis, and video rendering.

You will be responsible for designing and evolving systems that handle:

  • End-to-end pipeline orchestration for long-running, multi-stage jobs
  • Quality layers across transcription accuracy, speaker diarization, lip-sync rendering, translation, voice cloning, and TTS
  • Integration of ML-driven components (providers and open-source models) into production workflows
  • Video and audio complexity (normalization, chunking, encoding, vocal separation, retiming)
  • Evaluation frameworks that prove measurable improvements in output quality

You will own projects that span multiple systems and domains, such as:

  • Building robustness layers (retries, idempotency, failure recovery) for long-running pipelines
  • Designing persistence and state management to ensure consistent voice outputs across regenerations
  • Improving how video and audio data is processed, cached, and reused
  • Integrating new transcription, translation, voice synthesis, and video rendering providers
  • Building evaluation harnesses around each pipeline stage to measure quality reliably

You will evaluate your work through system performance, user experience metrics, and observability, using tracing and debugging tools to identify bottlenecks and continuously improve reliability.

You will collaborate closely with product, frontend, and ML/R&D teams, ensuring backend systems support both current product needs and future innovation in video localization.

What we’re looking for

Must-haves

  • Strong production backend engineering fundamentals (design, reliability, performance, maintainability)
  • Experience building and operating async, batch, or long-running workflow systems (jobs, retries, failure modes, observability)
  • Comfort operating in ambiguity and making trade-off decisions (quality vs cost vs speed)
  • Enough ML literacy to integrate, evaluate, and iterate on models and third‑party providers (not necessarily an MLE)
  • A product mindset focused on solving user-facing problems from a backend perspective

Nice-to-haves

  • Video, audio, or media pipeline experience (codecs, frame rates, and the practical realities of ffmpeg-like tooling)
  • Experience shipping systems that integrate ML outputs into product-facing workflows
  • Experience building evaluation frameworks for quality (both offline testing and production monitoring)
  • Experience with observability tools (e.g., Datadog), workflow systems (e.g., Temporal), or recommendation/evaluation systems
  • Willingness to step outside your comfort zone—including jumping into frontend code to debug end-to-end flows

Benefits and Location

Our preference is for this role to be based either in‑office or remote in the following locations: UK, Germany, Switzerland or Ireland. We may also be able to support remote workers in other locations across Europe subject to compliance and right-to-work checks.

This role is offered as full-time employment only, usually through OysterHR or a local entity; contractor arrangements are not possible.

Everyone at Synthesia gets 25 days of leave plus local holidays.

Posted: June 15th, 2026