Interpreting and Visualizing Regression Models Using Stata

Michael Mitchell’s Interpreting and Visualizing Regression Models Using Stata, Second Edition is a clear treatment of how to carefully present results from model-fitting in a wide variety of settings. It is a boon to anyone who has to present the tangible meaning of a complex model clearly, regardless of the audience. As an example, many experienced researchers start to squirm when asked to give a simple explanation of the practical meaning of interactions in nonlinear models such as logistic regression. The techniques presented in Mitchell’s book make answering those questions easy. The overarching theme of the book is that graphs make interpreting even the most complicated models containing interaction terms, categorical variables, and other intricacies straightforward.

 

Using a dataset based on the General Social Survey, Mitchell starts with a basic linear regression with a single independent variable and then illustrates how to tabulate and graph predicted values. Mitchell focuses on Stata’s margins and marginsplot commands, which play a central role in the book and which greatly simplify the calculation and presentation of results from regression models. In particular, through use of the marginsplot command, he shows how you can graphically visualize every model presented in the book; insight into results comes much more easily when you view them in a graph rather than in a mundane table of results.
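
The workflow described above can be sketched in a few lines of Stata. This is only an illustrative sketch, not an example taken from the book: it assumes the auto dataset that ships with Stata as a stand-in for the GSS-based data, and the variables price and mpg are chosen purely for illustration.

```stata
* Load an example dataset shipped with Stata (assumption: a stand-in for the book's GSS data)
sysuse auto, clear

* Simple linear regression with a single continuous predictor
regress price mpg

* Tabulate predicted mean price at selected values of mpg
margins, at(mpg=(15(5)40))

* Graph the predicted means from the preceding margins command, with confidence intervals
marginsplot
```

Because marginsplot graphs the results of the immediately preceding margins command, the two are normally run as a pair.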

 

Mitchell then proceeds to more complicated models where the effects of the independent variables are nonlinear. After discussing how to detect nonlinear effects, he presents examples using both standard polynomial terms (such as squares and cubes of variables) and fractional polynomial models, where independent variables can be raised to powers like -1 or 1/2. In all cases, Mitchell again uses the marginsplot command to illustrate the effect that changing an independent variable has on the dependent variable. Piecewise linear models are presented as well; these are linear models in which the slope or intercept is allowed to change depending on the range of an independent variable. He also uses the contrast command when discussing categorical variables; as the name suggests, this command allows you to easily contrast predictions made for various levels of the categorical variable.
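
As a rough sketch of two of the ideas above, the following fits a quadratic term using factor-variable notation and then contrasts the levels of a categorical predictor. The auto dataset and the variables price, mpg, and rep78 are assumptions made for illustration and do not come from the book.

```stata
* Example dataset shipped with Stata (assumption: not the data used in the book)
sysuse auto, clear

* Quadratic model: c.mpg##c.mpg includes both mpg and mpg squared
regress price c.mpg##c.mpg

* Predicted means across the range of mpg, then a graph of the fitted curve
margins, at(mpg=(15(5)40))
marginsplot

* Categorical predictor: repair record (rep78) entered as a factor variable
regress price i.rep78

* Compare each level of rep78 against the reference (first) level
contrast r.rep78
```

Entering the squared term through factor-variable notation, rather than as a separately generated variable, lets margins account for mpg and its square changing together.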

 

Interaction terms can be tricky to interpret, but Mitchell shows how graphs produced by marginsplot greatly clarify results. Individual chapters are devoted to two- and three-way interactions containing all continuous or all categorical variables and include many practical examples. Raw regression output including interactions of continuous and categorical variables can be nearly impossible to interpret, but again Mitchell makes this a snap through judicious use of the margins and marginsplot commands in subsequent chapters.
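
A continuous by categorical interaction might be explored along the following lines. Again, this is a hedged sketch using the auto dataset shipped with Stata, with price, mpg, and foreign chosen as illustrative assumptions rather than taken from the book.

```stata
* Example dataset shipped with Stata (assumption: not the data used in the book)
sysuse auto, clear

* Continuous (mpg) by categorical (foreign) interaction, with main effects
regress price c.mpg##i.foreign

* Predicted means for each group across values of mpg
margins foreign, at(mpg=(15(5)40))

* One fitted line per group, with confidence intervals
marginsplot

* Slope of mpg within each level of foreign
margins foreign, dydx(mpg)
```

The graph shows whether the lines for the two groups diverge, and the final margins command reports the slope within each group, which is often easier to read than the raw interaction coefficient.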

 

The first two-thirds of the book is devoted to cross-sectional data, while the final third considers longitudinal data and complex survey data. A significant difference between this book and most others on regression models is that Mitchell spends considerable time on fitting and visualizing discontinuous models, that is, models where the outcome can change value suddenly at thresholds. Such models are natural in settings such as education and policy evaluation, where graduation or policy changes can cause sudden changes in income or revenue.

 

The second edition has been updated to incorporate many new features added since Stata 12, when the first edition was written. Specifically, the text now demonstrates how labels on the values of categorical variables make interpretation much easier when looking at regression results and results from the margins and contrast commands. For instance, you now see that your coefficients or marginal means are related to the “low-dose” and “high-dose” groups instead of groups 1 and 2. In addition, Mitchell now shows you how to customize output from estimation commands, margins, and contrast for even more clarity. In his discussion of customizing graphs produced by marginsplot, he demonstrates new graph features such as the use of transparency. He also includes new examples of multilevel models for longitudinal data that take advantage of the degree-of-freedom adjustments for small sample sizes that are now provided by mixed and contrast.

 

This book is a worthwhile addition to the library of anyone involved in statistical consulting, teaching, or collaborative applied statistical environments. Graphs greatly aid the interpretation of regression models, and Mitchell’s book shows you how.

List of Tables
List of Figures
Preface to the Second Edition

Preface to the First Edition

Acknowledgements

 

INTRODUCTION

Read me first
The GSS dataset

Income
Age
Education
Gender
The pain datasets
The optimism datasets
The school datasets
The sleep datasets
Overview of the book

 

I CONTINUOUS PREDICTORS

CONTINUOUS PREDICTORS: LINEAR

Chapter overview
Simple linear regression

Computing predicted means using the margins command
Graphing predicted means using the marginsplot command

Multiple regression

Computing adjusted means using the margins command
Some technical details about adjusted means
Graphing adjusted means using the marginsplot command

Checking for nonlinearity graphically

Using scatterplots to check for nonlinearity
Checking for nonlinearity using residuals
Checking for nonlinearity using locally weighted smoother
Graphing outcome mean at each level of predictor
Summary

Checking for nonlinearity analytically

Adding power terms
Using factor variables

Summary

 

CONTINUOUS PREDICTORS: POLYNOMIALS

Chapter overview
Quadratic (squared) terms

Overview
Examples

Cubic (third power) terms

Overview
Examples

Fractional polynomial regression

Overview
Example using fractional polynomial regression

Main effects with polynomial terms
Summary

 

CONTINUOUS PREDICTORS: PIECEWISE MODELS

Chapter overview
Introduction to piecewise regression models
Piecewise with one known knot

Overview
Examples using the GSS

Piecewise with two known knots

Overview
Examples using the GSS

Piecewise with one knot and one jump

Overview
Examples using the GSS

Piecewise with two knots and two jumps

Overview
Examples using the GSS

Piecewise with an unknown knot
Piecewise model with multiple unknown knots
Piecewise models and the marginsplot command
Automating graphs of piecewise models
Summary

 

CONTINUOUS BY CONTINUOUS INTERACTIONS

Chapter overview
Linear by linear interactions

Overview
Example using GSS data
Interpreting the interaction in terms of age
Interpreting the interaction in terms of education
Interpreting the interaction in terms of age slope
Interpreting the interaction in terms of the educ slope

Linear by quadratic interactions

Overview
Example using GSS data

Summary

 

CONTINUOUS BY CONTINUOUS BY CONTINUOUS INTERACTIONS

Chapter overview
Overview
Examples using the GSS data

A model without a three-way interaction
A three-way interaction model

Summary

 

II CATEGORICAL PREDICTORS

CATEGORICAL PREDICTORS

Chapter overview
Comparing two groups using a t test
More groups and more predictors
Overview of contrast operators
Compare each group against a reference group

Selecting a specific contrast
Selecting a different reference group
Selecting a contrast and reference group

Compare each group against the grand mean

Selecting a specific contrast

Compare adjacent means

Reverse adjacent contrasts
Selecting a specific contrast

Comparing the mean of subsequent or previous levels

Comparing the mean of previous levels
Selecting a specific contrast

Polynomial contrasts
Custom contrasts
Weighted contrasts
Pairwise comparisons
Interpreting confidence intervals
Testing categorical variables using regression
Summary

 

CATEGORICAL BY CATEGORICAL INTERACTIONS

Chapter overview
Two by two models: Example 1

Simple effects
Estimating the size of the interaction
More about interaction

Summary

Two by three models

Example 2
Example 3
Summary

Three by three models: Example 4

Simple effects
Simple contrasts
Partial interaction
Interaction contrasts
Summary

Unbalanced designs
Main effects with interactions: anova versus regress
Interpreting confidence intervals
Summary

 

CATEGORICAL BY CATEGORICAL BY CATEGORICAL INTERACTIONS

Chapter overview
Two by two by two models

Simple interactions by season
Simple interactions by depression status
Simple effects

Two by two by three models

Simple interactions by depression status
Simple partial interaction by depression status
Simple contrasts
Partial interactions

Three by three by three models and beyond

Partial interactions and interaction contrasts
Simple interactions
Simple effects and simple comparisons

Summary

 

III CONTINUOUS AND CATEGORICAL PREDICTORS

LINEAR BY CATEGORICAL INTERACTIONS

Chapter overview
Linear and two-level categorical: No interaction

Overview
Examples using the GSS

Linear by two-level categorical interactions

Overview
Examples using the GSS

Linear by three-level categorical interactions

Overview

Examples using the GSS

Summary

 

POLYNOMIAL BY CATEGORICAL INTERACTIONS

Chapter overview
Quadratic by categorical interactions

Overview
Quadratic by two-level categorical
Quadratic by three-level categorical

Cubic by categorical interactions
Summary

 

PIECEWISE BY CATEGORICAL INTERACTIONS

Chapter overview
One knot and one jump

Comparing slopes across gender
Comparing slopes across education
Difference in differences of slopes
Comparing changes in intercepts
Computing and comparing adjusted means
Graphing adjusted means

Two knots and two jumps

Comparing slopes across gender
Comparing slopes across education
Difference in differences of slopes
Comparing changes in intercepts by gender
Comparing changes in intercepts by education
Computing and comparing adjusted means
Graphing adjusted means

Comparing coding schemes

Coding scheme #1
Coding scheme #2
Coding scheme #3
Coding scheme #4
Choosing coding schemes

Summary

 

CONTINUOUS BY CONTINUOUS BY CATEGORICAL INTERACTIONS

Chapter overview
Linear by linear by categorical interactions

Fitting separate models for males and females
Fitting a combined model for males and females
Interpreting the interaction focusing on the age slope
Interpreting the interaction focusing on the educ slope
Estimating and comparing adjusted means by gender

Linear by quadratic by categorical interactions

Fitting separate models for males and females
Fitting a common model for males and females
Interpreting the interaction
Estimating and comparing adjusted means by gender

Summary

 

CONTINUOUS BY CATEGORICAL BY CATEGORICAL INTERACTIONS

Chapter overview
Simple effects of gender on the age slope
Simple effects of education on the age slope
Simple contrasts on education for the age slope
Partial interaction on education for the age slope
Summary

 

IV BEYOND ORDINARY LINEAR REGRESSION

MULTILEVEL MODELS

Chapter overview
Example 1: Continuous by continuous interaction
Example 2: Continuous by categorical interaction
Example 3: Categorical by continuous interaction
Example 4: Categorical by categorical interaction
Summary

 

TIME AS A CONTINUOUS PREDICTOR

Chapter overview
Example 1: Linear effect of time
Example 2: Linear effect of time by a categorical predictor
Example 3: Piecewise modeling of time
Example 4: Piecewise effects of time by a categorical predictor

Baseline slopes
Change in slopes: Treatment versus baseline
Jump at treatment
Comparisons among groups

Summary

 

TIME AS A CATEGORICAL PREDICTOR

Chapter overview
Example 1: Time treated as a categorical variable
Example 2: Time (categorical) by two groups
Example 3: Time (categorical) by three groups
Comparing models with different residual covariance structures
Summary

 

NONLINEAR MODELS

Chapter overview
Binary logistic regression

A logistic model with one categorical predictor
A logistic model with one continuous predictor
A logistic model with covariates

Multinomial logistic regression
Ordinal logistic regression
Poisson regression
More applications of nonlinear models

Categorical by categorical interaction
Categorical by continuous interaction
Piecewise modeling

Summary

 

COMPLEX SURVEY DATA

 

V APPENDICES

CUSTOMIZING OUTPUT FROM ESTIMATION COMMANDS
Omission of output
Specifying the confidence level
Customizing the formatting of columns in the coefficient table
Customizing the display of factor variables
THE MARGINS COMMAND
The predict() and expression() options
The at() option
Margins with factor variables
Margins with factor variables and the at() option
The dydx() and related options
Specifying the confidence level
Customizing column formatting
THE MARGINSPLOT COMMAND
THE CONTRAST COMMAND
Inclusion and omission of output
Customizing the display of factor variables
Adjustments for multiple comparisons
Specifying the confidence level
Customizing column formatting

THE PWCOMPARE COMMAND

 

 

Author: Michael N. Mitchell
Edition: Second Edition
ISBN-13: 978-1-59718-321-5
©Copyright: 2021
