A Choose-Your-Own-Adventure in Bayesian Consumer Choice Modeling
Who am I?
Code or it didn’t Happen
The worked examples used here can be found here
My Website
Consumer choice is everywhere — from cereal to cars to climate systems. Business success hinges on understanding these choices: pricing, product design, market segmentation.
Classic models rest on unrealistic assumptions, and pricing experiments can be costly.
Use better models with PyMC-Marketing and simulate pricing interventions safely.
Inference: What is the most plausible world given the data?
\[ p(\theta_{w_{i}} | Y) = \dfrac{p(\theta_{w_{i}})p(Y | \theta_{w_{i}})}{\sum_{j=1}^{N} p(\theta_{w_j})p(Y | \theta_{w_j}) }\]
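A minimal sketch of the update above for a finite set of candidate worlds. The priors and likelihood values are made-up numbers for illustration, and `posterior_over_worlds` is a hypothetical helper, not part of any library.

```python
def posterior_over_worlds(priors, likelihoods):
    """Bayes' rule over a finite set of candidate worlds.

    priors[i] is p(theta_w_i); likelihoods[i] is p(Y | theta_w_i).
    """
    joint = [p * lik for p, lik in zip(priors, likelihoods)]
    evidence = sum(joint)  # the denominator: sum over all worlds
    return [j / evidence for j in joint]


# Hypothetical inputs: three worlds with equal prior weight
priors = [1 / 3, 1 / 3, 1 / 3]
likelihoods = [0.10, 0.30, 0.60]  # illustrative values only

posterior = posterior_over_worlds(priors, likelihoods)
print(posterior)
```

With equal priors the posterior is just the normalized likelihood, so the world that makes the data most probable is also the most plausible one.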
Counterfactual Inference: What plausibly happens in nearby worlds?
\(\mathbf{\theta_{w_{1}}} \rightsquigarrow\)
\(\mathbf{\theta_{w_{2}}} \rightsquigarrow\)
\(\mathbf{\theta_{w_{3}}} \rightsquigarrow\)
\(f(\alpha_{w_1}, \beta_{w_1}^{0}, \beta_{w_1}^{1}) \rightsquigarrow\)
\(f(\alpha_{w_2}, \beta_{w_2}^{0}, \beta_{w_2}^{1}) \rightsquigarrow\)
\(f(\alpha_{w_3}, \beta_{w_3}^{0}, \beta_{w_3}^{1}) \rightsquigarrow\)
The Consumer’s Dilemma: What drives us to choose in the face of such vast possibility?
“What drives human choice?”
Each purchase tells a story of preference…
The utility function forms the cornerstone of choice modeling:
\[\color{red}U_{ij} = \color{blue}\alpha_{ij} + \color{blue}\beta_{ij}^{1} \color{black}\cdot X_{ij}^{1} + \color{blue} \beta_{ij}^{2} \color{black}\cdot X_{ij}^{2} \]
\[P_{ij} = \frac{\exp(\color{red}U_{ij})}{\sum_{k=1}^{J} \exp(\color{red}U_{ik})} \]
Where:
Do you value the company of others? Do you fear it? What about the average cave dweller?
\[ u(\text{Light shaft + Silence}) - u(\text{Glowing Fire + Conversational Echoes}) > 0?\]
Choice Scenarios specified with attributes and choice outcomes for each discrete alternative
\[ \begin{split} \begin{pmatrix} u_{gc} \\ u_{gr} \\ u_{ec} \\ u_{er} \\ u_{hp} \\ \end{pmatrix} = \begin{pmatrix} gc_{ic} & gc_{oc} \\ gr_{ic} & gr_{oc} \\ ec_{ic} & ec_{oc} \\ er_{ic} & er_{oc} \\ hp_{ic} & hp_{oc} \\ \end{pmatrix} \begin{pmatrix} \color{blue}\beta_{ic} \\ \color{blue}\beta_{oc} \\ \end{pmatrix} \end{split} \]
The probability of choosing alternative \(j\) follows the elegant logistic form:
\[\frac{\exp(\color{red}U_{ij})}{\sum_{k=1}^{J} \exp(\color{red}U_{ik})} = P_{ij} \Rightarrow s_{j}(\color{blue}\theta_{w})=P(u_{j}>u_{k};\ \forall k \neq j)\]
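As a rough illustration of the two steps above (attributes times coefficients gives utilities, softmax gives choice probabilities), here is a small sketch in plain Python. The cost figures and the two coefficients are invented for the example and are not estimates from the heating data.

```python
import math


def softmax(utilities):
    """Map a list of utilities to choice probabilities."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (installation cost, operating cost) for each alternative
attributes = {
    "gc": (8.0, 1.5),
    "gr": (9.0, 1.5),
    "ec": (8.0, 4.5),
    "er": (9.5, 4.5),
    "hp": (10.0, 2.3),
}
beta_ic, beta_oc = -0.2, -0.5  # assumed coefficients, for illustration only

# Linear utility: one row of the attribute matrix times the beta vector
utilities = {alt: beta_ic * ic + beta_oc * oc for alt, (ic, oc) in attributes.items()}
probs = dict(zip(utilities, softmax(list(utilities.values()))))

for alt, p in probs.items():
    print(f"{alt}: utility={utilities[alt]:.2f}, probability={p:.3f}")
```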
utility_formulas = [
    "gc ~ ic_gc + oc_gc | income + rooms + agehed",
    "gr ~ ic_gr + oc_gr | income + rooms + agehed",
    "ec ~ ic_ec + oc_ec | income + rooms + agehed",
    "er ~ ic_er + oc_er | income + rooms + agehed",
    "hp ~ ic_hp + oc_hp | income + rooms + agehed",
]
mnl = MNLogit(df, utility_formulas, "depvar", covariates=["ic", "oc"])
mnl.sample()
The Multinomial Logit builds the Independence of Irrelevant Alternatives (IIA) property into its preference calculations.
\[\dfrac{P_{j}}{P_{i}} = \dfrac{ \dfrac{e^{U_{j}}}{\sum_{k=1}^{n}e^{U_{k}}}}{\dfrac{e^{U_{i}}}{\sum_{k=1}^{n}e^{U_{k}}}} = \dfrac{e^{U_{j}}}{e^{U_{i}}} = e^{U_{j} - U_{i}}\]
Key Take-away: The Model Ignores Market Structure
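The IIA property can be checked numerically. In this sketch, with made-up utilities for three alternatives, the odds between two alternatives stay the same whether or not the third is available, which is what "ignoring market structure" means in practice.

```python
import math


def mnl_shares(utilities):
    """Multinomial logit shares from a dict of alternative -> utility."""
    exps = {alt: math.exp(u) for alt, u in utilities.items()}
    total = sum(exps.values())
    return {alt: e / total for alt, e in exps.items()}


# Hypothetical utilities, for illustration only
utilities = {"gc": 1.0, "gr": 0.5, "ec": 0.2}

full = mnl_shares(utilities)
# Remove "ec" from the choice set and recompute the shares
reduced = mnl_shares({alt: u for alt, u in utilities.items() if alt != "ec"})

ratio_full = full["gc"] / full["gr"]
ratio_reduced = reduced["gc"] / reduced["gr"]

print(ratio_full, ratio_reduced, math.exp(utilities["gc"] - utilities["gr"]))
```

Because the ratio depends only on the two utilities involved, the share freed up by removing an alternative is redistributed in proportion to the remaining shares, however similar or dissimilar the products are.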
new_policy_df = df.copy()
new_policy_df[["ic_ec", "ic_er"]] = new_policy_df[["ic_ec", "ic_er"]] * 1.5
## Posterior Predictive Forecast under counterfactual setting
idata_new_policy = mnl.apply_intervention(new_choice_df=new_policy_df)
## Compare Old and New Policy Settings
change_df = mnl.calculate_share_change(mnl.idata, mnl.intervention_idata)
change_df
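To make the idea of a share-change comparison concrete, here is a sketch in plain Python. It is not the library's implementation: the "posterior draws" are simulated stand-ins, the cost figures are invented, and `expected_shares` is a hypothetical helper. It averages multinomial logit shares over the draws under the baseline and under the 1.5x installation-cost intervention, then takes the difference.

```python
import math
import random


def mnl_shares(utilities):
    """Multinomial logit shares from a dict of alternative -> utility."""
    exps = {alt: math.exp(u) for alt, u in utilities.items()}
    total = sum(exps.values())
    return {alt: e / total for alt, e in exps.items()}


random.seed(0)
# Stand-in for posterior draws of (beta_ic, beta_oc); values are assumptions
draws = [(random.gauss(-0.2, 0.05), random.gauss(-0.5, 0.1)) for _ in range(500)]

# Hypothetical (installation cost, operating cost) per alternative
baseline = {
    "gc": (8.0, 1.5),
    "gr": (9.0, 1.5),
    "ec": (8.0, 4.5),
    "er": (9.5, 4.5),
    "hp": (10.0, 2.3),
}
# Intervention: raise installation cost of the electric systems by 50%
intervention = {
    alt: (ic * 1.5 if alt in ("ec", "er") else ic, oc)
    for alt, (ic, oc) in baseline.items()
}


def expected_shares(attrs):
    """Average the predicted shares over all parameter draws."""
    totals = dict.fromkeys(attrs, 0.0)
    for b_ic, b_oc in draws:
        utilities = {alt: b_ic * ic + b_oc * oc for alt, (ic, oc) in attrs.items()}
        for alt, share in mnl_shares(utilities).items():
            totals[alt] += share
    return {alt: t / len(draws) for alt, t in totals.items()}


old = expected_shares(baseline)
new = expected_shares(intervention)
change = {alt: new[alt] - old[alt] for alt in old}
print(change)
```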
\(U = Y + W\)
\(P(i) \text{ when } i \in Alts\)
\(P(\text{choose nest } B_{k}) \cdot P(\text{choose } i \mid i \in B_{k})\)
\(P(\text{choose nest } B_{k}) = \dfrac{e^{W_{k} + \lambda_{k}I_{k}}}{\sum_{l=1}^{K} e^{W_{l} + \lambda_{l}I_{l}}}\)
\(P(\text{choose } i \mid i \in B_{k}) = \dfrac{e^{Y_{i} / \lambda_{k}}}{\sum_{j \in B_{k}} e^{Y_{j} / \lambda_{k}}}\)
\(I_{k} = \ln \sum_{j \in B_{k}} e^{Y_{j} / \lambda_{k}} \text{ and } \lambda_{k} \sim \text{Beta}(1, 1)\)
The log-sum component allows for the utility of any alternatives within a nest to “bubble up” and influence the attractiveness of the overall nest.
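The two-stage calculation can be sketched directly from the formulas above. This is an illustrative implementation, not the PyMC-Marketing one: the alternative-level utilities Y and the lambda values are invented, the nest-level utility W defaults to zero, and `nested_logit_shares` is a hypothetical helper.

```python
import math


def nested_logit_shares(Y, nests, lambdas, W=None):
    """Choice shares under a nested logit.

    Y: alternative -> alternative-level utility
    nests: nest name -> list of alternatives in that nest
    lambdas: nest name -> lambda in (0, 1]
    W: nest name -> nest-level utility (defaults to 0 for every nest)
    """
    if W is None:
        W = {k: 0.0 for k in nests}
    # Inclusive value I_k: the log-sum of scaled utilities within each nest
    inclusive = {
        k: math.log(sum(math.exp(Y[j] / lambdas[k]) for j in alts))
        for k, alts in nests.items()
    }
    nest_weight = {k: math.exp(W[k] + lambdas[k] * inclusive[k]) for k in nests}
    denom = sum(nest_weight.values())

    shares = {}
    for k, alts in nests.items():
        p_nest = nest_weight[k] / denom  # P(choose nest k)
        within = sum(math.exp(Y[j] / lambdas[k]) for j in alts)
        for j in alts:
            # P(choose j | nest k) times P(choose nest k)
            shares[j] = p_nest * math.exp(Y[j] / lambdas[k]) / within
    return shares


nests = {"central": ["gc", "ec"], "room": ["hp", "gr", "er"]}
Y = {"gc": 1.0, "ec": 0.2, "hp": -0.3, "gr": 0.5, "er": 0.1}  # made-up utilities
lambdas = {"central": 0.5, "room": 0.8}  # assumed values

shares = nested_logit_shares(Y, nests, lambdas)
print(shares)
```

Setting every lambda to 1 collapses the model back to the plain multinomial logit, which is a handy sanity check on any implementation.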
utility_formulas = [
"gc ~ ic_gc + oc_gc | income + rooms ",
"ec ~ ic_ec + oc_ec | income + rooms ",
"gr ~ ic_gr + oc_gr | income + rooms ",
"er ~ ic_er + oc_er | income + rooms ",
"hp ~ ic_hp + oc_hp | income + rooms ",
]
nesting_structure = {"central": ["gc", "ec"], "room": ["hp", "gr", "er"]}
nstL_1 = NestedLogit(
    df,
    utility_formulas,
    "depvar",
    covariates=["ic", "oc"],
    nesting_structure=nesting_structure,
    model_config={
        "alphas_": Prior("Normal", mu=0, sigma=5, dims="alts"),
        "betas": Prior("Normal", mu=0, sigma=1, dims="alt_covariates"),
        "betas_fixed_": Prior("Normal", mu=0, sigma=1, dims="fixed_covariates"),
        "lambdas_nests": Prior("Beta", alpha=2, beta=2, dims="nests"),
    },
)
nstL_1
The relative importance of product attributes implied by our observed data
The relative importance of installation costs versus operating costs can suggest where to apply a new pricing strategy.
new_policy_df = df.copy()
new_policy_df[["ic_ec", "ic_er"]] = new_policy_df[["ic_ec", "ic_er"]] * 1.5
idata_new_policy_1 = nstL_1.apply_intervention(new_choice_df=new_policy_df)
change_df_1 = nstL_1.calculate_share_change(nstL_1.idata, nstL_1.intervention_idata)
change_df_1
Nested Logit allows for patterns of Non-Proportional Substitution under counterfactual settings
\[ \begin{split} w &= \{ \alpha, \beta^{1}, \beta^{2}, X^{1}, X^{2} \} \\ \Rightarrow w^{*} &= \{ \alpha, \beta^{*}, \beta^{2}, X^{*}, X^{2} \} \end{split} \]
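import numpy as np
import pymc as pm

# Intervene on the fitted model: fix the covariate X1 to 1 for every row and
# the coefficient beta1 to 0.5, dropping variables left unconnected to the
# observed data.
with pm.do(
    model,
    {"X1": np.ones(len(df)), "beta1": 0.5},
    prune_vars=True,
) as counterfactual_model:
    # Push the existing posterior draws through the modified model
    idata_trt = pm.sample_posterior_predictive(idata, var_names=["like", "p"])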
Causal Inference with the Do-Operator modifies world-state and data alike.