Overview
Time commitment: ~3–4 days per week, variable by phase. Duration: to be confirmed. Start: as soon as possible.
OCH's global research team conducts large-scale, multi-country survey research and has developed a growing library of quantitative datasets and segmentation outputs across geographies. We are looking for an experienced quantitative analyst to join the team and contribute across a range of analytical work — from foundational data preparation and exploration through to advanced statistical modelling.
The core analytical focus of the role centres on two interconnected workstreams: the rigorous development of survey-based clustering and segmentation models, and the design of a classification framework that allows new respondents to be assigned to existing segments efficiently and reliably. Beyond this, the analyst will also handle day-to-day data management tasks including dataset cleaning, variable harmonisation, and exploratory cross-tabulation work. The role sits within the research methods function and involves close collaboration with OCH's Head of Data & Research Methods.
Responsibilities
- Data cleaning & preparation: Clean, recode, and structure incoming survey datasets, including applying advanced data quality checks and filters, raking and weighting, and handling missing data
- Conduct foundational data exploration including frequency distributions, cross-tabulations, and basic descriptive analyses, primarily in SPSS
- Work fluently across survey data formats, principally SPSS (.sav) and R-native formats
- Cluster analysis & segmentation: Conduct advanced cluster analysis on complex, multi-country survey datasets, working closely with the Head of Data & Research Methods on analytical decisions and final segmentation outputs
- Evaluate and compare clustering approaches (e.g. k-means, hierarchical, latent class analysis, and others as appropriate) with a view to producing segments that are statistically robust, meaningful, and cross-nationally comparable
- Manage the specific methodological challenges of complex survey data: dealing with mixed variable types (nominal, ordinal, continuous) and handling translated or culturally non-equivalent items
- Iteratively test and refine cluster solutions, systematically varying parameters and documenting the impact of each decision on outputs
- Classification model development: Using existing, labelled segmentation outputs as a training base, design and fit an appropriate classification model (supervised machine learning with a train/test split) to enable assignment of new respondents to established segments
- Evaluate candidate classification approaches (e.g. random forest, logistic regression, LDA, gradient boosting, or others) and select the most appropriate given the data structure, segment separability, and intended use
- Assess model performance rigorously using appropriate validation strategies (e.g. cross-validation, held-out test sets, confusion matrices, precision/recall)
- Iterate on model specifications, documenting all variations and intermediary outputs
- 'Golden questions' identification: Identify the minimum set of survey questions that are most predictive of segment membership, i.e. those that would need to be included in future quantitative research instruments to allow reliable classification
- Apply appropriate variable importance and feature selection techniques to identify and rank candidate questions, and validate their predictive power
- Produce clear recommendations on the golden question set, including supporting evidence and sensitivity analyses
- Classification / calculator tool: Design and implement a practical classification tool or calculator that can be applied to future survey datasets to assign respondents to segments based on the golden question set
- Ensure the tool is well-documented, reproducible, and usable by the Head of Data & Research Methods without requiring re-running of the full modelling pipeline
- Methodological documentation: Maintain detailed records of all analytical iterations, including variations in parameters, model specifications, and the rationale behind decisions taken
- Document all intermediary outputs in a structured and retrievable format
- Produce final methodological documents for each workstream — written to a standard that would allow a qualified analyst to understand, reproduce, and build upon the work
- Flag methodological uncertainties or trade-offs explicitly, rather than presenting a single opaque output
Required Expertise & Experience
- Solid, demonstrable experience (typically 4–7 years) working with quantitative survey or polling data (or equivalent) in an analytical capacity
- Fluency with SPSS for data cleaning, cross-tabulation, and exploratory data analysis, including confident management of variable and value labels, codebooks, and data transformations
- Advanced proficiency in cluster analysis methods, with hands-on experience selecting and comparing approaches on real survey datasets
- Proven experience fitting and validating classification models using labelled training data
- Advanced R proficiency — all modelling and classification work is expected to be conducted in R, with clean, documented, reproducible scripts
- A rigorous, structured approach to analytical work with a strong documentation habit
Key Skills & Attributes
- Statistically rigorous and methodologically confident, with the seniority to take end-to-end ownership of complex analytical problems
- Detail-oriented and systematic, with a natural inclination to document decisions and iterations thoroughly
- Comfortable working autonomously and at depth on a focused analytical brief
- Able to communicate methodological choices clearly in writing, for a technically informed audience
- Self-directed, structured, and reliable in managing their own workflow