
LOGISTIC REGRESSION: BINARY & MULTINOMIAL
An illustrated tutorial and introduction to binary and multinomial logistic regression, with examples in SPSS, SAS, and Stata. Suitable for introductory graduate-level study.
The 2016 edition is a major update to the 2014 edition.
The full content is now available from Statistical Associates Publishers.
Below is the table of contents.
LOGISTIC REGRESSION
Table of Contents

Overview
  Data examples
Key Terms and Concepts
  Binary, binomial, and multinomial logistic regression
  The logistic model
  The logistic equation
  Logits and link functions
  Saving predicted probabilities
  The dependent variable
    The dependent reference default in binary logistic regression
    The dependent reference default in multinomial logistic regression
Factors: Declaring
  Overview
  SPSS
  SAS
  Stata
Factors: Reference levels
  Overview
  SPSS
  SAS
  Stata
Covariates
  Overview
  SPSS
  SAS
  Stata
Interaction Terms
  Overview
  SPSS
  SAS
  Stata
Estimation
  Overview
  Complex samples
  Maximum likelihood estimation (ML)
  Weighted least squares estimation (WLS)
  Ordinary least squares estimation (OLS)
Residuals
  Overview
  Residuals vs. distance, influence, and leverage
  Types of residuals and influence
  Covariance pattern vs. individual observation residuals
  Which type of residual to use?
  Residuals and model misspecification
A basic binary logistic regression model in SPSS
  Example
  SPSS input
  SPSS output
    Parameter estimates and odds ratios
    Omnibus tests of model coefficients
    Model summary
    Classification table
    Classification plot
    Probabilities of group membership
    Hosmer-Lemeshow test of goodness of fit
    Residual analysis
    Checking for outliers
A basic binary logistic regression model in SAS
  Example
  SAS input
  Reconciling SAS and SPSS output
  SAS output
    Parameter estimates
    Odds ratio estimates
    Global null hypothesis tests
    Model fit statistics
    The classification table
    The association of predicted probabilities and observed responses table
    Hosmer and Lemeshow test of goodness of fit
    Residual analysis
A basic binary logistic regression model in STATA
  Overview and example
  Data setup
  Stata input
  Stata output
    Parameter estimates
    Odds ratios
    Likelihood ratio test of the model
    Model fit statistics
    The classification table
    Classification plot
    Measures of association
    Hosmer-Lemeshow test
    Residual analysis
Probability (marginal) analysis for binary logistic regression
  Overview
  How probabilities are conditional on covariate values
  Example
  Stata
    The margins and mchange commands
    The binary logistic model and its odds ratios
    Adjusted predictions at the means (APM)
    Marginal effects at means (MEM)
    Problems with APM and MEM measures
    Adjusted predictions at representative values (APR)
    Marginal effects at representative values (MER)
    Average marginal effects (AME)
    Average adjusted predictions (AAP)
    Command summary
    Interactions in probability analysis
    Comparison of odds-ratio and probabilistic interpretations
    Common options for marginal analysis in Stata
  SAS
    SAS input
    SAS output
  SPSS
A basic multinomial logistic regression model in SPSS
  Example
  Model
  SPSS statistical output
    Step summary
    Model fitting information table
    Goodness of fit tests
    Likelihood ratio tests
    Parameter estimates
    Pseudo R-square
    Classification table
    Observed and expected frequencies
    Asymptotic correlation matrix
  Residual analysis in multinomial logistic regression
A basic multinomial logistic regression model in SAS
  Example
  SAS syntax
  SAS statistical output
    Overview
    Model fit
    Goodness of fit tests
    Parameter estimates
    Pseudo R-Square
    Classification table
    Observed and predicted functions and residuals
    Correlation matrix of estimates
A basic multinomial logistic regression model in STATA
  Example
  Stata data setup
  Stata syntax
  Stata statistical output
    Overview
    Model fit
    AIC and BIC
    Pseudo R-square
    Goodness of fit test
    Likelihood ratio tests
    Parameter estimates
    Odds ratios/relative risk ratios
    Classification table
    Observed and expected frequencies
    Asymptotic correlation matrix
Probability (marginal) analysis for multinomial regression
  Overview
  Stata
    Example
    Average marginal effects (AME) model for multinomial regression
    The mchange command
  SAS
  SPSS
ROC curve analysis
  Overview
  Comparing models
  Example
  SPSS
    Comparing models
    Optimal classification cutting points
  SAS
    Overview
    Comparing Models
    Optimal classification cutting points
  Stata
    Overview
    Comparing Models
    Optimal classification cutting points
Conditional logistic regression for matched pairs
  Overview
  Example
  Data setup
  Conditional logistic regression in SPSS
    Overview
    SPSS input
    SPSS output
  Conditional logistic regression in SAS
    Overview
    SAS input
    SAS output
  Conditional logistic regression in Stata
    Overview
    Stata input
    Stata output
More about parameter estimates and odds ratios
  For binary logistic regression
    Example 1
    Example 2
  For multinomial logistic regression
    Example 1
    Example 2
  Coefficient significance and correlation significance may differ
  Reporting odds ratios
  Comparing the change in odds for different values of X
  Odds ratios: Summary
  Effect size
  Confidence interval on the odds ratio
  Warning: very high or very low odds ratios
  Comparing the change in odds when interaction terms are in the model
  Probabilities, logits, and odds ratios
    Probabilities
    Relative risk ratios (RRR)
More about significance tests
  Overview
  Significance of the model
    SPSS
    SAS
    Stata
  Significance of parameter effects
    SPSS
    SAS
    Stata
  Bootstrapped significance
    What is bootstrapped significance?
    Bootstrapping vs. jackknifing
    SPSS
    SAS
    Stata
More about effect size measures
  Overview
  Effect size for the model
    Pseudo R-squared
    Classification tables
    Terms associated with classification tables
    The c statistic
    Information theory measures of model fit
  Effect size for parameters
    Odds ratios
    Unstandardized logistic coefficients
    Standardized logistic coefficients
Stepwise logistic regression
  Overview
  Forward selection vs. backward elimination
  Cross-validation
  Rao's efficient score as a variable entry criterion for forward selection
  Score statistic
  Which step is the best model?
Contrast Analysis
  Repeated contrasts
  Indicator contrasts
  Contrasts and ordinality
Assumptions
  Data level
  Meaningful coding
  Proper specification of the model
  Independence of irrelevant alternatives
  Error terms are assumed to be independent (independent sampling)
  Low error in the explanatory variables
  Linearity
  Additivity
  Absence of perfect separation
  Absence of perfect multicollinearity
  Absence of high multicollinearity
  Centered variables
  No outliers
  Sample size
  Sampling adequacy
  Expected dispersion
Frequently Asked Questions
  How should logistic regression results be reported?
    Example
  Why not just use regression with dichotomous dependents?
  How does OLS regression compare to logistic regression?
  What does "controlling for other variables" mean in logistic regression?
  Why is there no R² or percent of variance explained in logistic regression?
  Do regression weights change if variables are added or dropped from the logistic equation?
  When is discriminant analysis preferred over logistic regression?
  What is the SPSS syntax for logistic regression?
  What is the Stata syntax for the logistic command?
  What is the Stata syntax for the margins command used for probability analysis?
  What is the Stata syntax for the mchange command used for probability analysis?
  Apart from indicator coding, what are the other types of contrasts?
  Will SPSS's binary logistic regression procedure handle my categorical variables automatically?
  Can I handle missing cases the same in logistic regression as in OLS regression?
  Explain the error message I am getting about unexpected singularities in the Hessian matrix.
  Explain the error message I am getting in SPSS about cells with zero frequencies.
  Is multicollinearity a problem for logistic regression the way it is for multiple linear regression?
  What is the logistic equivalent to the VIF test for multicollinearity in OLS regression? Can odds ratios be used?
  How are interaction effects handled in logistic regression?
  Does Bayesian logistic regression exist?
  Does stepwise logistic regression exist, as it does for OLS regression?
  What are the stepwise options in multinomial logistic regression in SPSS?
  May I use the multinomial logistic option when my dependent variable is binary?
  What is nonparametric logistic regression and how is it more nonlinear?
  How many independent variables can I have?
  What is the logistic regression equation if an independent variable is categorical?
  How are logit coefficients compared across groups formed by a categorical variable?
  How do I compute confidence intervals for unstandardized logit (effect) coefficients?
Acknowledgments
Bibliography

Page count: 314