Research Engineer - Contextual Bandits & RL

Company: algo1

Location: London

Posted: April 17th, 2026

About Us

We are a VC-backed startup focused on hyper-personalisation, currently in stealth. Drawing on the latest advances in recommender systems, we combine transformers and graph learning with decision‑making models to build the most engaging customer experiences for in‑store retail.

Our mission is to change retail forever through hyper‑personalised experiences that are both simple and beautiful.

About the Role - Offline Contextual Bandits and RL for Hyper-personalisation

We are looking for a Research Engineer to build decision‑making models for in‑store hyper‑personalisation, with an initial focus on learning from logged human interaction data in an offline setting. You will work closely with domain experts and engineers to develop contextual bandit and reinforcement learning approaches that can support both single‑step decisions and multi‑step customer journeys, with the potential to enable online learning over time.

Key Responsibilities

Essential Qualifications

Desired Skills (Bonus Points)

What We Offer

If you’re excited by the idea of shaping the future of retail and eager to make a real impact from day one, we’d love to hear from you.
