Using a customer data set (10,000 customers), an analyst in a bank built a churn
model with logistic regression. The analyst first produced a frequency table for the
variable Churn and a descriptive statistics table for some variables. The analyst then
fine-tuned the logistic regression model.
• Could you explain what data and variables have been included in the logistic
regression model?
• Which of those input variables are categorical variables?
• What has the analyst done to fine-tune the model, and why?
• Could you explain the final model (e.g. which variables are significant, the
relationship between the input variables and churn, and the odds ratios)? Do you
think anything else could be tried to improve the model?
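
As a reminder for the odds-ratio question: in logistic regression, the odds ratio of an input variable is the exponential of its coefficient. The sketch below illustrates that conversion only. The variable names and coefficient values are hypothetical placeholders, not output from the analyst's model.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Convert a logistic regression coefficient (a log-odds change) to an odds ratio."""
    return math.exp(coefficient)


# Hypothetical coefficients, made up purely for illustration;
# they are NOT taken from the analyst's churn model.
example_coefficients = {"variable_a": 0.50, "variable_b": -0.50}

for name, beta in example_coefficients.items():
    ratio = odds_ratio(beta)
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name}: coefficient={beta:+.2f}, odds ratio={ratio:.3f} "
          f"({direction} the odds of churn per one-unit increase)")
```

An odds ratio above 1 means the odds of churn increase as the variable increases, holding the other inputs fixed. A ratio below 1 means the odds decrease, and a ratio of exactly 1 means no association.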