Job Details
We are seeking a Data Scientist to play a central role in transforming raw data into structured, reliable, and scalable datasets within our Data Hub environment. In this role, you will design and maintain data-processing pipelines, collaborate closely with domain experts and labeling teams, and contribute to machine learning initiatives that drive value across the organization.
This position offers the opportunity to work at the intersection of data engineering, domain knowledge, and machine learning—supporting innovative solutions while shaping our evolving data ecosystem.
Key Responsibilities
- Develop systematic data‑transformation modules to gather, clean, validate, and structure raw datasets.
- Collaborate closely with Subject Matter Experts (SMEs) and the labeling team, building a solid understanding of domain knowledge and requirements.
- Support SMEs and the labeling team by providing tools, solutions, and continuous technical feedback to improve annotation workflows.
- Build scalable, reusable data‑processing solutions and maintain version control using GitLab.
- Troubleshoot and diagnose data issues, proactively flagging inconsistencies or risks.
- Maintain a deep understanding of the Data Hub technology stack and data schemas, staying current with emerging technologies, models, and research.
- Contribute to broader initiatives related to data integration, feature design, and machine learning.
- Design and run experiments, iterate based on results, and document and share learnings with the team.
- Use pre‑trained ML models, and train or fine‑tune models on internal datasets with proper evaluation and validation.
- Communicate regularly with team members, the team lead, and cross‑functional partners in production and technology to ensure alignment and transparency.
Performance Metrics
- Deliver high‑quality work on time.
- Quickly learn and apply new tools, technologies, and concepts.
- Demonstrate a strong understanding of Data Hub systems and workflows.
Qualifications & Skills
- Background in data science or a related field (Master’s degree preferred).
- Proficient in at least one programming language—ideally Python—with experience in common ML libraries.
- Experience with hybrid ML workflows (traditional ML, LLMs, embeddings, ontologies, knowledge graphs).
- Comfortable working with relational, NoSQL, and graph databases.
- Strong data‑processing skills, including cleaning, filtering, and feature extraction.
- Clear communicator with strong collaboration and presentation skills.
Benefits
- Competitive salary commensurate with experience.
- Highly attractive bonus scheme.
- Hybrid model and flexible working with up to 2 days at home.
- Initial 22 days annual leave with future increases, complemented by a flexible holiday buying and selling program.
- Company pension with generous employer contribution.
- Unmind wellbeing app, which puts you in control of your mental health.
- A flexible benefits platform with numerous discount schemes – gym membership, restaurants, cinema tickets, and much more.
- Regular social club events and spontaneous reward events throughout the year.
- Cycle purchase scheme.
- Flexible private medical and dental care programmes.
- Visa sponsorship and comprehensive relocation packages.
- Bank Holiday Swap – our holiday swap program allows you to exchange a bank holiday for another day of your choice.
- Relaxed dress code policy.
We value diversity and are committed to equal employment opportunities for all professionals.