Nathaniel Forde
Nathaniel Forde - Experienced Data Scientist / Statistician
Profile
Expert Statistician/Data Scientist with 10+ years' experience delivering ML products, from probabilistic modelling to causal inference, in high-growth tech and regulated industries. Skilled in leading teams, scaling data platforms, and translating R&D into business impact. Open-source contributor (PyMC, Bambi, CausalPy, PyMC-Marketing) experienced in experimentation strategy, MLOps, and customer-facing analytics.
Core Skills
- ML / Statistics: Probabilistic AI, Causal inference, Bayesian methods, Time-series, Survival analysis, Factor analysis, Deep Learning, Survey Analysis, Marketing Analytics
- MLOps & Cloud: AWS (SageMaker, Lambda, ECS, S3), ML pipelines, CI/CD, Docker, Spark, Statsig, Snowflake-ML
- Data Tools: scikit-learn, PyMC, Pandas, dbt, PyTorch, Bambi, CausalPy, statsmodels
- Leadership: Team growth (0→9), cross-functional influence, experimentation strategy, mentoring
Experience
Staff Data Scientist
— PandaDocs, Dublin
August 2026 – Present
- Leading causal inference and experimentation strategy across the product-led growth initiative.
Principal Data Scientist
— PyMC Labs (Part-time Consulting)
2024 – Present
- Advising on applied Bayesian statistics, causal inference, and probabilistic machine learning for enterprise projects.
- Delivered multiple $1M+ client engagements and policy recommendations across retail sectors and markets.
Staff Data Scientist
— Personio (HR SaaS), Dublin
April 2024 – August 2026
- Scaled the data science org from 0→9 scientists, establishing hiring processes, best practices, and technical roadmaps.
- Designed and built the Proactive Insights and Engagement Survey products, driving product stickiness and upsell opportunities.
- Designed and rolled out an in-house experimentation framework; A/B testing adoption increased 3×, and test velocity improved 40% YoY after Statsig integration.
- Led long-term forecasting and marketing mix modelling (MMM) of revenue and costs, improving planning accuracy by 5% and directly impacting company OKRs.
- Mentored junior data scientists and set standards for documentation, modelling, and productionization of ML workflows. Served as acting manager during a paternity-leave cover period.
Senior Data Scientist
— Personio (HR SaaS), Dublin
Nov 2021 – April 2024
- Built Personio's first end-to-end ML deployment pipeline (Python/AWS), now serving 10+ live models.
- Developed survival models (time-to-failure) used for reliability prioritization; reduced error by 5%.
- Performed root-cause analysis with vector autoregressive (VAR) models to diagnose UX latency issues, resulting in a 35% improvement in customer-facing performance metrics.
- Led experimentation with LLMs and NLP pipelines for customer feedback analysis, clustering customer sentiments at scale via latent space embeddings.
Data Scientist (R&D)
— CarTrawler (E-commerce), Dublin
Nov 2019 – 2021
- Designed dynamic pricing and demand forecasting models, improving margin by 4% and reducing loss ratios by 8% YoY.
- Built customer segmentation based on price sensitivity that fed into real-time pricing strategies.
- Developed interlocking claim-propensity and loss-ratio forecasting models for insurance products.
- Published work on Discrete Choice modelling for consumer preference.
Lead Insights Analyst
— Paddy Power/Betfair (Gaming), Dublin
Feb 2018 – 2019
- Created NLP models to predict sentiment in customer chat data during peak usage periods such as Cheltenham.
- Built propensity models for Responsible Gambling interventions, influencing €5M in budget allocation.
- Partnered with industry think tank on predictive models for problem gambling; findings shaped regulatory strategy.
- Led and supervised junior analysts; responsibilities included annual planning and performance reviews.
Software Developer
— Marsh & McLennan Innovation Centre (Insurance), Dublin
Mar 2017 – 2018
- Developed web apps (Go/Postgres/React) to deliver MCMC risk prediction models (earthquakes, floods, fires) to underwriters.
Data Analyst
— Marsh & McLennan Innovation Centre (Insurance), Dublin
Mar 2014 – 2017
- Built statistical metrics and dashboards, improving data quality by 15% across 20+ global business units.
- Developed a CRUD app (Shiny) for customer QA and spearheaded the company's first data quality framework.
Selected Publications & Open Source
Bayesian Vector AutoRegressive Models of Irish GDP
— PyMC, Open Source
Dec 2022
Developed hierarchical Bayesian VAR and forecasting models.
Justifying Instruments in IV Designs
— CausalPy, Open Source
Jun 2024
Core documentation on instrumental variable methods.
Education
MSc Logic
— University of Amsterdam (ILLC), Netherlands
2011 – 2013
- Research thesis: models of causal inference and dependency relations.
MPhil (Research) Philosophy
— Trinity College Dublin
2009 – 2011
- Modal logics & identity; awarded IRCHSS Scholarship.
BA International
— UCD / Sorbonne Université, Dublin & Paris
2005 – 2009
Leadership & Community
PyCon & PyData Speaker
— PyCon Ireland & PyData Berlin
2023 – Present
- Frequent mentor and speaker at PyCon events, Bayesian meetup groups, and internal data science guilds.
Hiring Strategy and People Management
— Personio, Dublin
2021 – 2026
- Co-led Personio's data science hiring strategy: designed technical interview frameworks, improving candidate signal and DEI balance. Provided people-management support and paternity-leave cover, including performance management and planning.
Open Source Contributor
— PyMC ecosystem
2021 – Present
- Contributor to open-source probabilistic programming and causal inference libraries with 100k+ total users.