Microeconometrics Using Stata, Volume I: Cross-Sectional and Panel Regression Methods

Any applied economic researcher using Stata and anyone teaching or studying microeconometrics will benefit from Cameron and Trivedi’s two volumes. They are an invaluable reference for the theory and intuition behind microeconometric methods using Stata. Those familiar with Cameron and Trivedi’s Microeconometrics: Methods and Applications will find the same rigor. Those familiar with the previous edition of Microeconometrics Using Stata will find the same explanation of Stata commands, their interpretation, and their connection with microeconometric theory, as well as an introduction to computational concepts that should be part of any researcher’s toolbox.

 

This new edition covers all the new Stata developments relevant to microeconometrics that have appeared since the last edition in 2010. It also covers the most recent microeconometric methods that have been contributed by the Stata community but have not yet made it into official Stata. For example, readers will find entirely new chapters on treatment effects, duration models, spatial autoregressive models, lasso, and Bayesian analysis.

 

The first volume introduces foundational microeconometric methods, including linear and nonlinear methods for cross-sectional data and linear panel data with and without endogeneity as well as overviews of hypothesis and model-specification tests. Beyond this, it teaches bootstrap and simulation methods, quantile regression, finite mixture models, and nonparametric regression. It also includes an introduction to basic Stata concepts and programming and to Mata for matrix programming and basic optimization.

 

The second volume builds on methods introduced in the first volume and walks readers through a wide range of more advanced methods useful in economic research. It starts with an introduction to nonlinear optimization methods and then delves into binary outcome methods with and without endogeneity; tobit and selection model estimates with and without endogeneity; choice model estimation; count data with and without endogeneity for conditional means and count data for conditional quantiles; survival data; nonlinear panel-data methods with and without endogeneity; exogenous and endogenous treatment effects; spatial data modeling; semiparametric regression; lasso for prediction and inference; and Bayesian econometrics.

 

This is just a brief overview of the contents of the book, but it exemplifies the breadth and ambition of the two volumes. In sum, it is an essential book for any applied researcher and for advanced microeconometrics courses.

List of tables
List of figures

Preface to the Second Edition (PDF)

 

STATA BASICS

 

Interactive use
Documentation
Command syntax and operators
Do-files and log files
Scalars and matrices
Using results from Stata commands
Global and local macros
Looping commands
Mata and Python in Stata
Some useful commands
Template do-file
Community-contributed commands
Additional resources
Exercises

 

DATA MANAGEMENT AND GRAPHICS

 

Introduction
Types of data
Inputting data
Data management
Manipulating datasets
Graphical display of data
Additional resources
Exercises

 

LINEAR REGRESSION BASICS

 

Introduction
Data and data summary
Transformation of data before regression
Linear regression
Basic regression analysis
Specification analysis
Specification tests
Sampling weights
OLS using Mata
Additional resources
Exercises

 

LINEAR REGRESSION EXTENSIONS

 

Introduction
In-sample prediction
Out-of-sample prediction
Predictive margins
Marginal effects
Regression decomposition analysis
Shapley decomposition of relative regressor importance
Differences-in-differences estimators
Additional resources
Exercises

 

SIMULATION

 

Introduction
Pseudorandom-number generators
Distribution of the sample mean
Pseudorandom-number generators: Further details
Computing integrals
Simulation for regression: Introduction
Additional resources
Exercises

 

LINEAR REGRESSION WITH CORRELATED ERRORS

 

Introduction
Generalized least-squares and FGLS regression
Modeling heteroskedastic data
OLS for clustered data
FGLS estimators for clustered data
Fixed-effects estimator for clustered data
Linear mixed models for clustered data
Systems of linear regressions
Survey data: Weighting, clustering, and stratification
Additional resources
Exercises

 

LINEAR INSTRUMENTAL-VARIABLES

 

Introduction
Simultaneous equations model
Instrumental-variables estimation
Instrumental-variables example
Weak instruments
Diagnostics and tests for weak instruments
Inference with weak instruments
Finite sample inference with weak instruments
Other estimators
Three-stage least-squares systems estimation
Additional resources
Exercises

 

LINEAR PANEL-DATA MODELS: BASICS

 

Introduction
Panel-data methods overview
Summary of panel data
Pooled or population-averaged estimators
Fixed-effects or within estimator
Between estimator
Random-effects estimator
Comparison of estimators
First-difference estimator
Panel-data management
Additional resources
Exercises

 

LINEAR PANEL-DATA MODELS: EXTENSIONS

 

Introduction
Panel IV estimation
Hausman–Taylor estimator
Arellano–Bond estimator
Long panels
Additional resources
Exercises

 

INTRODUCTION TO NONLINEAR REGRESSION

 

Introduction
Binary outcome models
Probit model
MEs and coefficient interpretation
Logit model
Nonlinear least squares
Other nonlinear estimators
Additional resources
Exercises

 

TESTS OF HYPOTHESES AND MODEL SPECIFICATION

 

Introduction
Critical values and p-values
Wald tests and confidence intervals
Likelihood-ratio tests
Lagrange multiplier test (or score test)
Multiple testing
Test size and power
The power onemean command for multiple regression
Specification tests
Permutation tests and randomization tests
Additional resources
Exercises

 

BOOTSTRAP METHODS

 

Introduction
Bootstrap methods
Bootstrap pairs using the vce(bootstrap) option
Bootstrap pairs using the bootstrap command
Percentile-t bootstraps with asymptotic refinement
Wild bootstrap with asymptotic refinement
Bootstrap pairs using bsample and simulate
Alternative resampling schemes
The jackknife
Additional resources
Exercises

 

NONLINEAR REGRESSION METHODS

 

Introduction
Nonlinear example: Doctor visits
Nonlinear regression methods
Different estimates of the VCE
Prediction
Predictive margins
Marginal effects
Model diagnostics
Clustered data
Additional resources
Exercises

 

FLEXIBLE REGRESSION: FINITE MIXTURES AND NONPARAMETRIC

 

Introduction
Models based on finite mixtures
FMM example: Earnings of doctors
Global polynomials
Regression splines
Nonparametric regression
Partially parametric regression
Additional resources
Exercises

 

QUANTILE REGRESSION

 

Introduction
Conditional quantile regression
CQR for medical expenditures data
CQR for generated heteroskedastic data
Quantile treatment effects for a binary treatment
Additional resources
Exercises

 

PROGRAMMING IN STATA

 

Stata matrix commands
Programs
Program debugging
Additional resources

 

MATA

 

How to run Mata
Mata matrix commands
Programming in Mata
Additional resources

 

OPTIMIZATION IN MATA

 

Mata moptimize() function
Mata optimize() function
Additional resources

 

Glossary of abbreviations
References
List of tables
List of figures

 

 

NONLINEAR OPTIMIZATION METHODS

 

Introduction
Newton–Raphson method
Gradient methods
Overview of ml, moptimize(), and optimize()
The ml command: lf method
Checking the program
The ml command: lf0–lf2, d0–d2, and gf0 methods
Nonlinear instrumental-variables (GMM) example
Additional resources
Exercises

 

BINARY OUTCOME MODELS

 

Introduction
Some parametric models
Estimation
Example
Goodness of fit and prediction
Marginal effects
Clustered data
Additional models
Endogenous regressors
Grouped and aggregate data
Additional resources
Exercises

 

MULTINOMIAL MODELS

 

Introduction
Multinomial models overview
Multinomial example: Choice of fishing mode
Multinomial logit model
Alternative-specific conditional logit model
Nested logit model
Multinomial probit model
Alternative-specific random-parameters logit
Ordered outcome models
Clustered data
Multivariate outcomes
Additional resources
Exercises

 

TOBIT AND SELECTION MODELS

 

Introduction
Tobit model
Tobit model example
Tobit for lognormal data
Two-part model in logs
Selection models
Nonnormal models of selection
Prediction from models with outcome in logs
Endogenous regressors
Missing data
Panel attrition
Additional resources
Exercises

 

COUNT-DATA MODELS

 

Introduction
Modeling strategies for count data
Poisson and negative binomial models
Hurdle model
Finite-mixture models
Zero-inflated models
Endogenous regressors
Clustered data
Quantile regression for count data
Additional resources
Exercises

 

SURVIVAL ANALYSIS FOR DURATION DATA

 

Introduction
Data and data summary
Survivor and hazard functions
Semiparametric regression model
Fully parametric regression models
Multiple-records data
Discrete-time hazards logit model
Time-varying regressors
Clustered data
Additional resources
Exercises

 

NONLINEAR PANEL MODELS

 

Introduction
Nonlinear panel-data overview
Nonlinear panel-data example
Binary outcome and ordered outcome models
Tobit and interval-data models
Count-data models
Panel quantile regression
Endogenous regressors in nonlinear panel models
Additional resources
Exercises

 

PARAMETRIC MODELS FOR HETEROGENEITY AND ENDOGENEITY

 

Introduction
Finite mixtures and unobserved heterogeneity
Empirical examples of FMMs
Nonlinear mixed-effects models
Structural equation models for linear structural equation models
Generalized structural equation models
ERM commands for endogeneity and selection
Additional resources
Exercises

 

RANDOMIZED CONTROL TRIALS AND EXOGENOUS TREATMENT EFFECTS

 

Introduction
Potential outcomes
Randomized control trials
Regression in an RCT
Treatment evaluation with exogenous treatment
Treatment evaluation methods and estimators
Stata commands for treatment evaluation
Oregon Health Insurance Experiment example
Treatment-effect estimates using the OHIE data
Multilevel treatment effects
Conditional quantile TEs
Additional resources
Exercises

 

ENDOGENOUS TREATMENT EFFECTS

 

Introduction
Parametric methods for endogenous treatment
ERM commands for endogenous treatment
ET commands for binary endogenous treatment
The LATE estimator for heterogeneous effects
Difference-in-differences and synthetic control
Regression discontinuity design
Conditional quantile regression with endogenous regressors
Unconditional quantiles
Additional resources
Exercises

 

SPATIAL REGRESSION

 

Introduction
Overview of spatial regression models
Geospatial data
The spatial weighting matrix
OLS regression and test for spatial correlation
Spatial dependence in the error
Spatial autocorrelation regression models
Spatial instrumental variables
Spatial panel-data models
Additional resources
Exercises

 

SEMIPARAMETRIC REGRESSION

 

Introduction
Kernel regression
Series regression
Nonparametric single regressor example
Nonparametric multiple regressor example
Partial linear model
Single-index model
Generalized additive models
Additional resources
Exercises

 

MACHINE LEARNING FOR PREDICTION AND INFERENCE

 

Introduction
Measuring the predictive ability of a model
Shrinkage estimators
Prediction using lasso, ridge, and elasticnet
Dimension reduction
Machine learning methods for prediction
Prediction application
Machine learning for inference in partial linear model
Machine learning for inference in other models
Additional resources
Exercises

 

BAYESIAN METHODS: BASICS

 

Introduction
Bayesian introductory example
Bayesian methods overview
An i.i.d. example
Linear regression
A linear regression example
Modifying the MH algorithm
RE model
Bayesian model selection
Bayesian prediction
Probit example
Additional resources
Exercises

 

BAYESIAN METHODS: MARKOV CHAIN MONTE CARLO ALGORITHMS

 

Introduction
User-provided log likelihood
MH algorithm in Mata
Data augmentation and the Gibbs sampler in Mata
Multiple imputation
Multiple-imputation example
Additional resources
Exercises

 

Glossary of abbreviations
References
Author: A. Colin Cameron and Pravin K. Trivedi
Edition: Second Edition
ISBN-13: 978-1-59718-359-8
©Copyright: 2022
e-Book version available
