Data Scientist

Company: CoreWeave
Location: Greater London
Job Description:

CoreWeave is the essential cloud for AI, delivering technology, tools, and teams that enable innovators to build and scale AI with confidence. Founded in 2017 and publicly traded (Nasdaq: CRWV) in March 2025, CoreWeave is trusted by leading AI labs, startups, and global enterprises.

What You’ll Do

The Monolith Data Science team is building a layered reliability platform that shifts CoreWeave from reactive troubleshooting to proactive reliability engineering. The platform spans telemetry ingestion, feature engineering, anomaly detection, failure prediction, distributed straggler detection, and agentic root cause analysis. You will partner closely with Fleet, Infrastructure, and AI Platform teams to improve cluster reliability, increase effective Model FLOPs Utilization (MFU), reduce MTTR, and protect uptime and revenue.

About The Role

As a Data Scientist, you will develop advanced statistical models and machine learning methodologies to optimize GPU utilization, workload scheduling, and infrastructure efficiency. You will design experiments, analyze large‑scale system telemetry data, and prototype predictive and optimization algorithms that directly inform production systems. This role blends research rigor with real‑world impact, turning complex infrastructure data into measurable improvements in performance and cost. You will collaborate cross‑functionally to translate research insights into deployable solutions.

Who You Are

  • MS or PhD in Computer Science, Statistics, Applied Mathematics, Machine Learning, or related quantitative field
  • 8+ years (or equivalent research experience) applying statistical modeling or machine learning to large‑scale datasets
  • Strong proficiency in Python and scientific computing libraries (NumPy, pandas, SciPy, scikit‑learn, PyTorch or TensorFlow)
  • Demonstrated experience designing and analyzing controlled experiments (A/B testing, causal inference, hypothesis testing)
  • Experience working with distributed data systems (Spark, Ray, Dask, or similar)
  • Proficiency in SQL and working with large‑scale structured datasets
  • Experience building and validating predictive models in production or research environments
  • Strong understanding of optimization techniques (linear programming, convex optimization, stochastic optimization, or reinforcement learning)
  • Experience with time‑series data and performance telemetry
  • Ability to translate research findings into production‑ready prototypes

Preferred

  • PhD with published research in systems optimization, distributed computing, ML systems, or performance modeling
  • Experience with GPU workloads, distributed training, or AI infrastructure
  • Familiarity with Kubernetes, containerized workloads, or cloud‑native systems
  • Experience developing reinforcement learning or adaptive scheduling systems
  • Background in capacity planning, forecasting, or resource allocation modeling
  • Contributions to open‑source ML or systems projects

What We Offer

  • Family‑level Medical Insurance
  • Family‑level Dental Insurance
  • Generous Pension Contribution
  • Life Assurance at 4× Salary
  • Critical Illness Cover
  • Employee Assistance Programme
  • Tuition Reimbursement
  • Work culture focused on innovative disruption

Workplace

While we prioritize a hybrid work environment, remote work may be considered for candidates located more than 30 miles from an office, based on role requirements for specialized skill sets. New hires will be invited to attend onboarding at one of our hubs within their first month.

EEO Statement

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.


Posted: April 17th, 2026