Company: Gravity Engineering Services Pvt Ltd.

Apply for the Lead Applied Scientist role

Location: London

Job Description:

About the Role

As an Applied Scientist at Prolific, you will design and prototype AI/ML methods that improve data quality, scale human judgement, and strengthen AI evaluation workflows. The role focuses on applied problems such as quality modelling, judgement aggregation, evaluation design, LLM-assisted review, and reliability testing for AI systems. It suits someone with sound scientific judgement, strong applied ML skills, and a pragmatic approach to methods that work in real customer and product environments. This position is distinct from both pure research and production ML engineering: you will turn ambiguous challenges into clear methodologies, benchmarks, models, and prototypes for product and engineering teams to adopt.

What You’ll Be Doing

Prototype AI/ML methods to improve human data quality, judgement aggregation, and AI evaluation workflows.
Design experiments, benchmarks, and reliability tests to measure the effectiveness of new methods in enhancing quality, efficiency, or customer outcomes.
Apply classical ML, statistics, LLMs, and agentic techniques where they offer practical value.
Utilise modern AI tools to accelerate prototyping, experimentation, and iteration.
Collaborate closely with product and engineering teams to translate scientific methods into scalable platform capabilities.
Communicate technical assumptions, trade-offs, and recommendations clearly to both technical and non-technical teams.

What We Are Looking For

PhD or MSc in Computer Science, Mathematics, Statistics, Machine Learning, or a related field.
3+ years of experience in applied ML, AI research, or data science with a proven track record of real-world impact.
Experience with human-in-the-loop AI systems, including RLHF, annotation pipelines, data quality modelling, judgement aggregation, benchmarks, or AI evaluation.
Fluency with modern LLM and agentic techniques, such as Retrieval-Augmented Generation (RAG), LLM-as-judge, multi-agent workflows, synthetic data generation, and automated quality review.
Strong Python skills and the ability to rapidly build, test, and iterate on working prototypes.
Sound judgement in selecting the appropriate method for a problem, whether simple statistics, classical ML, LLMs, or agentic approaches.
Ability to translate ambiguous product or customer problems into clear hypotheses, experiments, metrics, and reusable methodologies.
Strong cross-functional communication and experience partnering with product and engineering teams.


Posted: June 15th, 2026