About the Role
As an Applied Scientist at Prolific, you will design and prototype AI/ML methods that improve data quality, scale human judgement, and strengthen AI evaluation workflows. The role focuses on applied problems such as quality modelling, judgement aggregation, evaluation design, LLM-assisted review, and reliability testing for AI systems. It suits someone with sound scientific judgement, strong applied ML skills, and a pragmatic preference for methods that work in real customer and product environments. This is neither a pure research role nor a production ML engineering role: you will turn ambiguous challenges into clear methodologies, benchmarks, models, and prototypes that product and engineering teams can adopt.
What You’ll Be Doing
- Prototype AI/ML methods to improve human data quality, judgement aggregation, and AI evaluation workflows.
- Design experiments, benchmarks, and reliability tests to measure whether new methods improve quality, efficiency, or customer outcomes.
- Apply classical ML, statistics, LLMs, and agentic techniques where they offer practical value.
- Utilise modern AI tools to accelerate prototyping, experimentation, and iteration.
- Collaborate closely with product and engineering teams to translate scientific methods into scalable platform capabilities.
- Communicate technical assumptions, trade-offs, and recommendations clearly to both technical and non-technical teams.
What We Are Looking For
- PhD or MSc in Computer Science, Mathematics, Statistics, Machine Learning, or a related field.
- 3+ years of experience in applied ML, AI research, or data science with a proven track record of real-world impact.
- Experience with human-in-the-loop AI systems, including RLHF, annotation pipelines, data quality modelling, judgement aggregation, benchmarks, or AI evaluation.
- Fluency with modern LLM and agentic techniques, such as Retrieval-Augmented Generation (RAG), LLM-as-judge, multi-agent workflows, synthetic data generation, and automated quality review.
- Strong Python skills and the ability to rapidly build, test, and iterate on working prototypes.
- Sound judgement in selecting the appropriate method for a problem, whether simple statistics, classical ML, LLMs, or agentic approaches.
- Ability to translate ambiguous product or customer problems into clear hypotheses, experiments, metrics, and reusable methodologies.
- Strong cross-functional communication and experience partnering with product and engineering teams.
