Volume II is devoted to generalized linear mixed models for binary, categorical, count, and survival outcomes. The second volume has seven chapters also organized into four parts. The first three parts in volume II cover models for categorical responses, including binary, ordinal, and nominal (a new chapter); models for count data; and models for survival data, including discrete-time and continuous-time (a new chapter) survival responses. The fourth and final part in volume II describes models with nested and crossed random effects with an emphasis on binary outcomes.
The book has extensive applications of generalized linear mixed models performed in Stata. Rabe-Hesketh and Skrondal developed gllamm, a Stata program that can fit many latent-variable models, of which the generalized linear mixed model is a special case. As of version 10, Stata contains the xtmixed, xtmelogit, and xtmepoisson commands for fitting multilevel models, in addition to other xt commands for fitting standard random-intercept models. The types of models fit by these commands sometimes overlap; when this happens, the authors highlight the differences in syntax, data organization, and output for the two (or more) commands that can be used to fit the same model. The authors also point out the relative strengths and weaknesses of each command when used to fit the same model, based on considerations such as computational speed, accuracy, available predictions, and available postestimation statistics.
In summary, this book is the most complete, up-to-date depiction of Stata’s capacity for fitting generalized linear mixed models. The authors provide an ideal introduction for Stata users wishing to learn about this powerful data analysis tool.
List of Tables
List of Figures
V Models for categorical responses
10. DICHOTOMOUS OR BINARY RESPONSES
Introduction
Single-level logit and probit regression models for dichotomous responses
Generalized linear model formulation
Latent-response formulation
Logistic regression
Probit regression
Which treatment is best for toenail infection?
Longitudinal data structure
Proportions and fitted population-averaged or marginal probabilities
Random-intercept logistic regression
Model specification
Reduced-form specification
Two-stage formulation
Estimation of random-intercept logistic models
Using xtlogit
Using xtmelogit
Using gllamm
Subject-specific or conditional vs. population-averaged or marginal relationships
Measures of dependence and heterogeneity
Conditional or residual intraclass correlation of the latent responses
Median odds ratio
Measures of association for observed responses at median fixed part of the model
Inference for random-intercept logistic models
Tests and confidence intervals for odds ratios
Tests of variance components
Maximum likelihood estimation
Adaptive quadrature
Some speed and accuracy considerations
Advice for speeding up estimation in gllamm
Assigning values to random effects
Maximum “likelihood” estimation
Empirical Bayes prediction
Empirical Bayes modal prediction
Different kinds of predicted probabilities
Predicted population-averaged or marginal probabilities
Predicted subject-specific probabilities
Predictions for hypothetical subjects: Conditional probabilities
Predictions for the subjects in the sample: Posterior mean probabilities
Other approaches to clustered dichotomous data
Conditional logistic regression
Generalized estimating equations (GEE)
Summary and further reading
Exercises
11. ORDINAL RESPONSES
Introduction
Single-level cumulative models for ordinal responses
Generalized linear model formulation
Latent-response formulation
Proportional odds
Identification
Are antipsychotic drugs effective for patients with schizophrenia?
Longitudinal data structure and graphs
Longitudinal data structure
Plotting cumulative proportions
Plotting cumulative sample logits and transforming the time scale
A single-level proportional odds model
Model specification
Estimation using Stata
A random-intercept proportional odds model
Model specification
Estimation using Stata
Measures of dependence and heterogeneity
Residual intraclass correlation of latent responses
Median odds ratio
A random-coefficient proportional odds model
Model specification
Estimation using gllamm
Different kinds of predicted probabilities
Predicted population-averaged or marginal probabilities
Predicted subject-specific probabilities: Posterior mean
Do experts differ in their grading of student essays?
A random-intercept probit model with grader bias
Model specification
Estimation using gllamm
Including grader-specific measurement error variances
Model specification
Estimation using gllamm
Including grader-specific thresholds
Model specification
Estimation using gllamm
Other link functions
Cumulative complementary log-log model
Continuation-ratio logit model
Adjacent-category logit model
Baseline-category logit and stereotype models
Summary and further reading
Exercises
12. NOMINAL RESPONSES AND DISCRETE CHOICE
Introduction
Single-level models for nominal responses
Multinomial logit models
Conditional logit models
Classical conditional logit models
Conditional logit models also including covariates that vary only over units
Independence from irrelevant alternatives
Utility-maximization formulation
Does marketing affect choice of yogurt?
Single-level conditional logit models
Conditional logit models with alternative-specific intercepts
Multilevel conditional logit models
Preference heterogeneity: Brand-specific random intercepts
Response heterogeneity: Marketing variables with random coefficients
Preference and response heterogeneity
Estimation using gllamm
Estimation using mixlogit
Prediction of random effects and response probabilities
Summary and further reading
Exercises
VI Models for counts
13. COUNTS
Introduction
What are counts?
Counts versus proportions
Counts as aggregated event-history data
Single-level Poisson models for counts
Did the German health-care reform reduce the number of doctor visits?
Longitudinal data structure
Single-level Poisson regression
Model specification
Estimation using Stata
Random-intercept Poisson regression
Model specification
Measures of dependence and heterogeneity
Estimation using Stata
Using xtpoisson
Using xtmepoisson
Using gllamm
Random-coefficient Poisson regression
Model specification
Estimation using Stata
Using xtmepoisson
Using gllamm
Interpretation of estimates
Overdispersion in single-level models
Normally distributed random intercept
Negative binomial models
Mean dispersion or NB2
Constant dispersion or NB1
Quasilikelihood
Level-1 overdispersion in two-level models
Other approaches to two-level count data
Conditional Poisson regression
Conditional negative binomial regression
Generalized estimating equations
Marginal and conditional effects when responses are MAR
Which Scottish counties have a high risk of lip cancer?
Standardized mortality ratios
Random-intercept Poisson regression
Model specification
Estimation using gllamm
Prediction of standardized mortality ratios
Nonparametric maximum likelihood estimation
Specification
Estimation using gllamm
Prediction
Summary and further reading
Exercises
VII Models for survival or duration data
Introduction to models for survival or duration data (part VII)
14. DISCRETE-TIME SURVIVAL
Introduction
Single-level models for discrete-time survival data
Discrete-time hazard and discrete-time survival
Data expansion for discrete-time survival analysis
Estimation via regression models for dichotomous responses
Including covariates
Time-constant covariates
Time-varying covariates
Multiple absorbing events and competing risks
Handling left-truncated data
How does birth history affect child mortality?
Data expansion
Proportional hazards and interval-censoring
Complementary log-log models
A random-intercept complementary log-log model
Model specification
Estimation using Stata
Population-averaged or marginal vs. subject-specific or conditional survival probabilities
Summary and further reading
Exercises
15. CONTINUOUS-TIME SURVIVAL
Introduction
What makes marriages fail?
Hazards and survival
Proportional hazards models
Piecewise exponential model
Cox regression model
Poisson regression with smooth baseline hazard
Accelerated failure-time models
Log-normal model
Time-varying covariates
Does nitrate reduce the risk of angina pectoris?
Marginal modeling
Cox regression
Poisson regression with smooth baseline hazard
Multilevel proportional hazards models
Cox regression with gamma shared frailty
Poisson regression with normal random intercepts
Poisson regression with normal random intercept and random coefficient
Multilevel accelerated failure-time models
Log-normal model with gamma shared frailty
Log-normal model with log-normal shared frailty
A fixed-effects approach
Cox regression with subject-specific baseline hazards
Different approaches to recurrent-event data
Total time
Counting process
Gap time
Summary and further reading
Exercises
VIII Models with nested and crossed random effects
16. MODELS WITH NESTED AND CROSSED RANDOM EFFECTS
Introduction
Did the Guatemalan immunization campaign work?
A three-level random-intercept logistic regression model
Model specification
Measures of dependence and heterogeneity
Types of residual intraclass correlations of the latent responses
Types of median odds ratios
Three-stage formulation
Estimation of three-level random-intercept logistic regression models
Using gllamm
Using xtmelogit
A three-level random-coefficient logistic regression model
Estimation of three-level random-coefficient logistic regression models
Using gllamm
Using xtmelogit
Prediction of random effects
Empirical Bayes prediction
Empirical Bayes modal prediction
Different kinds of predicted probabilities
Predicted population-averaged or marginal probabilities: New clusters
Predicted median or conditional probabilities
Predicted posterior mean probabilities: Existing clusters
Do salamanders from different populations mate successfully?
Crossed random-effects logistic regression
Summary and further reading
Exercises
A. SYNTAX FOR GLLAMM, EQ, AND GLLAPRED: THE BARE ESSENTIALS
B. SYNTAX FOR GLLAMM
C. SYNTAX FOR GLLAPRED
D. SYNTAX FOR GLLASIM
References