Regression Models for Categorical Dependent Variables Using Stata, 2nd Edition - by J. Scott Long and Jeremy Freese
Although regression models for categorical dependent variables are
common, very few texts explain how to interpret these models.
Regression Models for Categorical Dependent Variables Using Stata,
Second Edition, fills this void, showing how to fit and interpret
regression models for categorical data with Stata; it includes some
commands written by the authors. Hypothesis testing and goodness-of-fit
statistics are also discussed.
The book begins with an excellent introduction to
Stata and then provides a general treatment of estimation, testing,
fit, and interpretation in this class of models. Binary, ordinal,
nominal, and count outcomes are covered in detail in separate chapters.
The final chapter discusses how to fit and interpret models with
special characteristics, such as ordinal and nominal independent
variables, interaction, and nonlinear terms. One appendix discusses the
syntax of the author-written commands, and a second gives details of
the datasets used by the authors in the book.
This book is filled with concrete examples. Because
all the examples, datasets, and author-written commands are available
from the authors' web site, readers can easily replicate the examples
using Stata. This book is ideal for students or applied researchers who
want to know how to fit this type of model and understand its output.
The second edition is nearly 50% longer than the
previous edition. It has a new chapter on estimating and interpreting
models for nominal outcomes with alternative-specific data, including
the multinomial-probit model, the rank-ordered-logit model, and
conditional logistic regression. New sections on the estimation and
interpretation of the stereotype-logistic model and zero-truncated
count models have also been added. Many of the interpretation
techniques have been updated to include both point and interval
estimates.
Table of Contents
Preface
Part I General Information
1 Introduction
- 1.1 What is this book about?
- 1.2 Which models are considered?
- 1.3 Whom is this book for?
- 1.4 How is the book organized?
- 1.5 What software do you need?
- 1.5.1 Updating Stata 9
- 1.5.2 Installing SPost
- Installing SPost using search
- Installing SPost using net install
- 1.5.3 What if commands do not work?
- 1.5.4 Uninstalling SPost
- 1.5.5 Using spex to load data and run examples
- 1.5.6 More files available on the web site
- 1.6 Where can I learn more about the models?
2 Introduction to Stata
- 2.1 The Stata interface
- Changing the scrollback buffer size
- Changing the display of variable names in the Variables window
- 2.2 Abbreviations
- 2.3 How to get help
- 2.3.1 Online help
- 2.3.2 Manuals
- 2.3.3 Other resources
- 2.4 The working directory
- 2.5 Stata file types
- 2.6 Saving output to log files
- Options
- 2.6.1 Closing a log file
- 2.6.2 Viewing a log file
- 2.6.3 Converting from SMCL to plain text or PostScript
- 2.7 Using and saving datasets
- 2.7.1 Data in Stata format
- 2.7.2 Data in other formats
- 2.7.3 Entering data by hand
- 2.8 Size limitations on datasets*
- 2.9 Do-files
- 2.9.1 Adding comments
- 2.9.2 Long lines
- 2.9.3 Stopping a do-file while it is running
- 2.9.4 Creating do-files
- Using Stata's Do-file Editor
- Using other editors to create do-files
- 2.9.5 A recommended structure for do-files
- 2.10 Using Stata for serious data analysis
- 2.11 Syntax of Stata commands
- 2.11.1 Commands
- 2.11.2 Variable lists
- 2.11.3 if and in qualifiers
- Examples of if qualifier
- 2.11.4 Options
- 2.12 Managing data
- 2.12.1 Looking at your data
- 2.12.2 Getting information about variables
- 2.12.3 Missing values
- 2.12.4 Selecting observations
- 2.12.5 Selecting variables
- 2.13 Creating new variables
- 2.13.1 generate command
- 2.13.2 replace command
- 2.13.3 recode command
- 2.13.4 Common transformations for RHS variables
- Breaking a categorical variable into a set of binary variables
- More examples of creating binary variables
- Nonlinear transformations
- Interaction terms
- 2.14 Labeling variables and values
- 2.14.1 Variable labels
- 2.14.2 Value labels
- 2.14.3 notes command
- 2.15 Global and local macros
- 2.16 Graphics
- 2.16.1 graph command
- 2.16.2 Displaying previously drawn graphs
- 2.16.3 Printing graphs
- 2.16.4 Combining graphs
- 2.17 A brief tutorial
- A batch version
3 Estimation, testing, fit, and interpretation
- 3.1 Estimation
- 3.1.1 Stata's output for ML estimation
- 3.1.2 ML and sample size
- 3.1.3 Problems in obtaining ML estimates
- 3.1.4 Syntax of estimation commands
- Variable lists
- Specifying the estimation sample
- Options
- 3.1.5 Reading the output
- Header
- Estimates and standard errors
- Confidence intervals
- 3.1.6 Storing estimation results
- 3.1.7 Reformatting output with estimates table
- 3.1.8 Reformatting output with estout
- 3.1.9 Alternative output with listcoef
- Options for types of coefficients
- Options for mlogit, mprobit, and slogit
- Other options
- Standardized coefficients
- Factor and percent change
- 3.2 Postestimation analysis
- 3.3 Testing
- 3.3.1 Wald tests
- The accumulate option
- 3.3.2 LR tests
- Avoiding invalid LR tests
- 3.4 estat command
- 3.5 Measures of fit
- Syntax of fitstat
- Options
- Models and measures
- Example of fitstat
- Methods and formulas for fitstat
- 3.6 Interpretation
- 3.6.1 Approaches to interpretation
- 3.6.2 Predictions using predict
- 3.6.3 Overview of prvalue, prchange, prtab, and prgen
- Specifying the levels of variables
- Options controlling output
- 3.6.4 Syntax for prvalue
- Options
- Options for confidence intervals
- Options used for bootstrapped confidence intervals
- 3.6.5 Syntax for prchange
- Options
- 3.6.6 Syntax for prtab
- Options
- 3.6.7 Syntax for prgen
- Options
- Options for confidence intervals and marginals
- Variables generated
- 3.6.8 Computing marginal effects using mfx
- 3.7 Confidence intervals for prediction
- 3.8 Next steps
Part II Models for Specific Kinds of Outcomes
4 Models for binary outcomes
- 4.1 The statistical model
- 4.1.1 A latent-variable model
- 4.1.2 A nonlinear probability model
- 4.2 Estimation using logit and probit
- Variable lists
- Specifying the estimation sample
- Weights
- Options
- Example
- 4.2.1 Observations predicted perfectly
- 4.3 Hypothesis testing with test and lrtest
- 4.3.1 Testing individual coefficients
- One- and two-tailed tests
- Testing single coefficients using test
- Testing single coefficients using lrtest
- 4.3.2 Testing multiple coefficients
- Testing multiple coefficients using test
- Testing multiple coefficients using lrtest
- 4.3.3 Comparing LR and Wald tests
- 4.4 Residuals and influence using predict
- 4.4.1 Residuals
- Example
- 4.4.2 Influential cases
- 4.4.3 Least likely observations
- Syntax
- Options
- Options for controlling the list of values
- 4.5 Measuring fit
- 4.5.1 Scalar measures of fit using fitstat
- 4.5.2 Hosmer–Lemeshow statistic
- 4.6 Interpretation using predicted values
- 4.6.1 Predicted probabilities with predict
- 4.6.2 Individual predicted probabilities with prvalue
- 4.6.3 Tables of predicted probabilities with prtab
- 4.6.4 Graphing predicted probabilities with prgen
- 4.6.5 Plotting confidence intervals
- 4.6.6 Changes in predicted probabilities
- Marginal change
- Discrete change
- 4.7 Interpretation using odds ratios with listcoef
- Multiplicative coefficients
- Effect of the base probability
- Percent change in the odds
- 4.8 Other commands for binary outcomes
5 Models for ordinal outcomes
- 5.1 The statistical model
- 5.1.1 A latent-variable model
- 5.1.2 A nonlinear probability model
- 5.2 Estimation using ologit and oprobit
- Variable lists
- Specifying the estimation sample
- Weights
- Options
- 5.2.1 Example of attitudes toward working mothers
- 5.2.2 Predicting perfectly
- 5.3 Hypothesis testing with test and lrtest
- 5.3.1 Testing individual coefficients
- 5.3.2 Testing multiple coefficients
- 5.4 Scalar measures of fit using fitstat
- 5.5 Converting to a different parameterization*
- 5.6 The parallel regression assumption
- 5.7 Residuals and outliers using predict
- 5.8 Interpretation
- 5.8.1 Marginal change in "y"
- 5.8.2 Predicted probabilities
- 5.8.3 Predicted probabilities with predict
- 5.8.4 Individual predicted probabilities with prvalue
- 5.8.5 Tables of predicted probabilities with prtab
- 5.8.6 Graphing predicted probabilities with prgen
- 5.8.7 Changes in predicted probabilities
- Marginal change with prchange
- Marginal change with mfx
- Discrete change with prchange
- Confidence intervals for discrete changes
- Computing discrete change for a 10-year increase in age
- 5.8.8 Odds ratios using listcoef
- 5.9 Less-common models for ordinal outcomes
- 5.9.1 Generalized ordered logit model
- 5.9.2 The stereotype model
- 5.9.3 The continuation ratio model
6 Models for nominal outcomes with case-specific data
- 6.1 The multinomial logit model
- 6.1.1 Formal statement of the model
- 6.2 Estimation using mlogit
- Variable lists
- Specifying the estimation sample
- Weights
- Options
- 6.2.1 Example of occupational attainment
- 6.2.2 Using different base categories
- 6.2.3 Predicting perfectly
- 6.3 Hypothesis testing of coefficients
- 6.3.1 mlogtest for tests of the MNLM
- Options
- 6.3.2 Testing the effects of the independent variables
- A likelihood-ratio test
- A Wald test
- Testing multiple independent variables
- 6.3.3 Tests for combining alternatives
- A Wald test for combining alternatives
- Using test [category]*
- An LR test for combining alternatives
- Using constraint with lrtest*
- 6.4 Independence of irrelevant alternatives
- Hausman test of IIA
- Small–Hsiao test of IIA
- Conclusions regarding tests of IIA
- 6.5 Measures of fit
- 6.6 Interpretation
- 6.6.1 Predicted probabilities
- 6.6.2 Predicted probabilities with predict
- Using predict to compare mlogit and ologit
- 6.6.3 Predicted probabilities and discrete change with prvalue
- 6.6.4 Tables of predicted probabilities with prtab
- 6.6.5 Graphing predicted probabilities with prgen
- Plotting probabilities for one outcome and two groups
- Graphing probabilities for all outcomes for one group
- 6.6.6 Changes in predicted probabilities
- Computing marginal and discrete change with prchange
- Marginal change with mfx compute
- 6.6.7 Plotting discrete changes with prchange and mlogview
- 6.6.8 Odds ratios using listcoef and mlogview
- Listing odds ratios with listcoef
- Plotting odds ratios
- 6.6.9 Using mlogplot*
- 6.6.10 Plotting estimates from matrices with mlogplot*
- Options for using matrices with mlogplot
- Global macros and matrices used by mlogplot
- Example
- 6.7 Multinomial probit model with IIA
- 6.8 Stereotype logistic regression
- 6.8.1 Formal statement of the one-dimensional SLM
- 6.8.2 Fitting the SLM with slogit
- Options
- Example
- 6.8.3 Interpretation using predicted probabilities
- 6.8.4 Interpretation using odds ratios
- 6.8.5 Distinguishability and the φ parameters
- 6.8.6 Ordinality in the one-dimensional SLM
- Higher-dimension SLM
7 Models for nominal outcomes with alternative-specific data
- 7.1 Alternative-specific data organization
- 7.1.1 Syntax for case2alt
- 7.2 The conditional logit model
- 7.2.1 Fitting the conditional logit model
- Example of the clogit model
- 7.2.2 Interpreting odds ratios from clogit
- 7.2.3 Interpreting probabilities from clogit
- Using predict
- Using asprvalue
- 7.2.4 Fitting the multinomial logit model using clogit
- Setting up the data with case2alt
- Fitting multinomial logit with clogit
- 7.2.5 Using clogit with case- and alternative-specific variables
- Example of a mixed model
- Interpretation of odds ratios using listcoef
- Interpretation of predicted probabilities using asprvalue
- Allow the effects of alternative-specific variables to vary over the alternatives
- 7.3 Alternative-specific multinomial probit
- 7.3.1 The model
- 7.3.2 Informal explanation of estimation by simulation
- 7.3.3 Alternative-based data with uncorrelated errors
- Options
- Examples
- 7.3.4 Alternative-based data with correlated errors
- 7.4 The structural covariance matrix
- 7.4.1 Interpretation using probabilities
- Using predict
- Using asprvalue
- 7.4.2 Identification, discrete change, and marginal effects
- 7.4.3 Testing for IIA
- 7.4.4 Adding case-specific data
- 7.5 Rank-ordered logistic regression
- 7.5.1 Fitting the rank-ordered logit model
- Options
- Example of the rank-ordered logit model
- 7.5.2 Interpreting results from rologit
- Interpretation using odds ratios
- Interpretation using predicted probabilities
- 7.6 Conclusions
8 Models for count outcomes
- 8.1 The Poisson distribution
- 8.1.1 Fitting the Poisson distribution with the poisson command
- 8.1.2 Computing predicted probabilities with prcounts
- Syntax
- Options
- Variables generated
- 8.1.3 Comparing observed and predicted counts with prcounts
- 8.2 The Poisson regression model
- 8.2.1 Estimating the PRM with poisson
- Variable lists
- Specifying the estimation sample
- Weights
- Options
- 8.2.2 Example of fitting the PRM
- 8.2.3 Interpretation using the rate, μ
- Factor change in E(y|x)
- Percent change in E(y|x)
- Example of factor and percent change
- Marginal change in E(y|x)
- Example of marginal change using prchange
- Example of marginal change using mfx
- Discrete change in E(y|x)
- Example of discrete change using prchange
- Example of discrete change with confidence intervals
- 8.2.4 Interpretation using predicted probabilities
- Example of predicted probabilities using prvalue
- Example of predicted probabilities using prgen
- Example of predicted probabilities using prcounts
- 8.2.5 Exposure time*
- 8.3 The negative binomial regression model
- 8.3.1 Fitting the NBRM with nbreg
- NB1 and NB2 variance functions
- 8.3.2 Example of fitting the NBRM
- Comparing the PRM and NBRM using estimates table
- 8.3.3 Testing for overdispersion
- 8.3.4 Interpretation using the rate μ
- 8.3.5 Interpretation using predicted probabilities
- 8.4 Models for truncated counts
- 8.4.1 Fitting zero-truncated models
- 8.4.2 Example of fitting zero-truncated models
- 8.4.3 Interpretation of parameters
- 8.4.4 Interpretation using predicted probabilities and rates
- 8.4.5 Computing predicted rates and probabilities in the estimation sample
- 8.5 The hurdle regression model*
- 8.5.1 In-sample predictions for the hurdle model
- 8.5.2 Predictions for user-specified values
- 8.6 Zero-inflated count models
- 8.6.1 Fitting zero-inflated models with zinb and zip
- Variable lists
- Options
- 8.6.2 Example of fitting the ZIP and ZINB models
- 8.6.3 Interpretation of coefficients
- 8.6.4 Interpretation of predicted probabilities
- Predicted probabilities with prvalue
- Confidence intervals with prvalue
- Predicted probabilities with prgen
- 8.7 Comparisons among count models
- 8.7.1 Comparing mean probabilities
- 8.7.2 Tests to compare count models
- LR tests of α
- Vuong test of nonnested models
- 8.8 Using countfit to compare count models
9 More topics
- 9.1 Ordinal and nominal independent variables
- 9.1.1 Coding a categorical independent variable as a set of dummy variables
- 9.1.2 Estimation and interpretation with categorical independent variables
- 9.1.3 Tests with categorical independent variables
- Testing the effect of membership in one category versus the reference category
- Testing the effect of membership in two nonreference categories
- Testing that a categorical independent variable has no effect
- Testing whether treating an ordinal variable as interval loses information
- 9.1.4 Discrete change for categorical independent variables
- Computing discrete change with prchange
- Computing discrete change with prvalue
- 9.2 Interactions
- 9.2.1 Computing sex differences in predictions with interactions
- 9.2.2 Computing sex differences in discrete change with interactions
- 9.3 Nonlinear models
- 9.3.1 Adding nonlinearities to linear predictors
- 9.3.2 Discrete change in nonlinear models
- 9.4 Using praccum and forvalues to plot predictions
- Options
- 9.4.1 Example using age and age-squared
- 9.4.2 Using forvalues with praccum
- 9.4.3 Using praccum for graphing a transformed variable
- 9.4.4 Using praccum to graph interactions
- 9.4.5 Using forvalues with prvalue to create tables
- 9.4.6 A more advanced example*
- 9.4.7 Using forvalues to create tables with other commands
- 9.5 Extending SPost to other estimation commands
- 9.6 Using Stata more efficiently
- 9.6.1 profile.do
- 9.6.2 Changing screen fonts and window preferences
- 9.6.3 Using ado-files for changing directories
- 9.6.4 me.hlp file
- 9.7 Conclusions
A Syntax for SPost commands
- A.1 asprvalue
- Syntax
- Description
- Options
- Examples
- A.2 brant
- Syntax
- Description
- Options
- Examples
- Saved results
- A.3 case2alt
- Syntax
- Description
- Options
- Examples
- A.4 countfit
- Syntax
- Description
- Options for specifying the model
- Options to select the models to fit
- Options to label and save results
- Options to control what is printed
- Example
- A.5 fitstat
- Syntax
- Description
- Options
- Examples
- Saved results
- A.6 leastlikely
- Syntax
- Description
- Options
- Options for listing
- Examples
- A.7 listcoef
- Syntax
- Description
- Options
- Options for nominal outcomes
- Examples
- Saved results
- A.8 misschk
- Syntax
- Options
- Examples
- A.9 mlogplot
- Syntax
- Description
- Options
- Examples
- A.10 mlogtest
- Syntax
- Description
- Options
- Examples
- Saved results
- Acknowledgment
- A.11 mlogview
- Syntax
- Description
- Dialog box controls
- A.12 Overview of prchange, prgen, prtab, and prvalue
- Syntax
- Examples
- A.13 praccum
- Syntax
- Description
- Options
- Examples
- Variables generated
- A.14 prchange
- Syntax
- Description
- Options
- Examples
- A.15 prcounts
- Syntax
- Description
- Options
- Variables generated
- Examples
- A.16 prgen
- Syntax
- Description
- Options
- Options for confidence intervals and marginals
- Examples
- Variables generated
- A.17 prtab
- Syntax
- Description
- Options
- Examples
- A.18 prvalue
- Syntax
- Description
- Options
- Options for confidence intervals
- Options used for bootstrapped confidence intervals
- Examples
- Saved results
B Description of datasets
- B.1 binlfp2
- B.2 couart2
- B.3 gsskidvalue2
- B.4 nomocc2
- B.5 ordwarm2
- B.6 science2
- B.7 travel2
- B.8 wlsrnk
References
Author index
Subject index

