About Cint
Cint is a pioneer in research technology (ResTech). Our customers use the Cint platform to post questions and get answers from real people to build business strategies, confidently publish research, accurately measure the impact of digital advertising, and more. The Cint platform is built on a programmatic marketplace, which is the world’s largest, with nearly 300 million respondents in over 150 countries who consent to sharing their opinions, motivations, and behaviours.
Job Description
As a Data Scientist at Cint, you will play a pivotal role in developing next‑generation AI solutions that power our product portfolio. Collaborating closely with Product and Engineering teams, you will bridge the gap between traditional research data and synthetic intelligence. You will focus on the research, validation, and delivery of models—including Large Language Models (LLMs)—that augment high‑quality human signals across the Cint Exchange. This role involves advanced data mining, robust data validation, and the development of sophisticated statistical and machine learning methodologies.
Responsibilities
- Contribute to the research, discovery, and development of machine learning models – specifically focused on synthetic row generation, open‑ended text generation, and data augmentation.
- Execute statistical tests and experiments to validate LLM performance and synthetic modeling hypotheses.
- Develop logic for on‑demand and dynamic boosting capabilities, collaborating with Engineering to integrate these models into Cint Exchange fielding workflows.
- Design and refine sophisticated profiling taxonomies, leveraging large‑scale datasets to create syndicated audiences.
- Manage technical workflows and development cycles with guidance.
- Collaborate with Product and Engineering teams to support integration.
- Create clear, effective prototypes and deliverables that explain and defend complex Generative AI concepts to both technical and non‑technical audiences.
Qualifications Required
- 2–4 years of experience in a Data Science capacity, including delivering end‑to‑end data science solutions.
- A Master’s degree (or equivalent) in Statistics, Data Science, or a related quantitative field.
- Deep understanding of Generative AI and LLMs, particularly for applications in text generation and data synthesis.
- Advanced knowledge of statistical techniques: hypothesis testing, sampling theory, experimental design, and causal inference.
- Strong knowledge of a variety of ML techniques (e.g., clustering, regression, neural networks) and their real‑world trade‑offs.
- Expert proficiency in Python (DS/ML stack) and experience with frameworks used for LLM development and fine‑tuning.
- Advanced SQL skills and experience working with large‑scale databases.
- Ability to research and adopt new methods.
Essential Qualities
- Highly accountable self‑starter and quick learner, consistently motivated to deliver high‑quality, impactful results.
- Strong data‑driven mindset with the ability to translate abstract business requests into actionable AI initiatives and solutions.
- Excellent written and verbal communication skills, with the ability to communicate technical findings clearly.
Nice to Have
- Direct experience with Synthetic Data Generation techniques and the evaluation of synthetic data quality/utility.
- Experience with Prompt Engineering, RAG (Retrieval‑Augmented Generation), or fine‑tuning open‑source LLMs for open‑end generation.
- Experience with probabilistic modeling or advanced profiling techniques.
- Familiarity with online market research or survey exchange platforms.
- Experience using Databricks, Spark, or PySpark for large‑scale workflows.
