Missing Data Imputation in PyMC

probability
pymc
missing_data
imputation
Author

Nathaniel Forde

Published

February 10, 2023

Missing Data Imputation and Employee Survey Data

In this project I demonstrate imputation of missing data using both the standard frequentist approach, full information maximum likelihood, and a more nuanced Bayesian approach based on chained equations.

The project culminated in a publication in the official PyMC documentation, which can be found online here or downloaded as a notebook here.

The notebook demonstrates these techniques applied to employee satisfaction data. In particular, we show how Bayesian hierarchical methods can be used to help predict missing data values across various teams within an organisation, based on the observed values and the characteristics of the team dynamics which drove the observed data.
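To give a feel for the hierarchical idea, here is a minimal sketch in plain NumPy. It is not the notebook's PyMC model: it swaps in a simple empirical-Bayes shrinkage estimate as a stand-in for full Bayesian inference, and the data, team count, and variable names are all made up for illustration. Each team's mean is pulled towards the grand mean in proportion to how little data the team has, and missing scores are filled with that partially pooled team mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 4 teams of 30 employees with a continuous satisfaction score.
n_teams, n_per_team = 4, 30
team = np.repeat(np.arange(n_teams), n_per_team)
team_effect = rng.normal(0.0, 0.5, size=n_teams)  # assumed team-level deviations
score = 3.5 + team_effect[team] + rng.normal(0.0, 1.0, size=team.size)

# Remove roughly 20% of the scores completely at random.
missing = rng.random(team.size) < 0.2
observed = np.where(missing, np.nan, score)

# Grand mean and per-team summaries, computed from observed values only.
grand_mean = np.nanmean(observed)
team_mean = np.array([np.nanmean(observed[team == j]) for j in range(n_teams)])
n_obs = np.array([np.sum(~missing[team == j]) for j in range(n_teams)])

# Within-team variance (averaged over teams) and a floored between-team variance.
sigma2 = np.mean([np.nanvar(observed[team == j], ddof=1) for j in range(n_teams)])
tau2 = max(np.var(team_mean, ddof=1) - sigma2 / n_obs.mean(), 1e-6)

# Shrinkage weight: near 1 trusts the team mean, near 0 falls back to the grand mean.
weight = tau2 / (tau2 + sigma2 / n_obs)
pooled_mean = grand_mean + weight * (team_mean - grand_mean)

# Impute each missing score with its team's partially pooled mean.
imputed = np.where(missing, pooled_mean[team], observed)

print("deviation from grand mean by team:", np.round(pooled_mean - grand_mean, 3))
```

A full Bayesian treatment, as in the notebook, would instead place priors on the team-level parameters and draw the missing values from their posterior predictive distribution, so that the uncertainty of each imputed value is carried through rather than replaced by a single point estimate.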

Deviations from the Grand Mean by Team