# Using Stata for Principles of Econometrics

This book is a supplement to Principles of Econometrics, 5th Edition by R. Carter Hill, William E. Griffiths and Guay C. Lim (Wiley, 2018), hereinafter POE5. This book is not a substitute for the textbook, nor is it a standalone computer manual. It is a companion to the textbook, showing how to perform the examples in the textbook using Stata Release 15. This book will be useful to students taking econometrics, as well as their instructors, and others who wish to use Stata for econometric analysis.

CHAPTER 1 INTRODUCING STATA

Starting Stata
The opening display
Exiting Stata
Stata data files for POE5

A working directory

Opening Stata data files

Using the toolbar
The use command
Using files on the internet
Locating book files on the internet

The variables window

Using the data utility for a single label

Describing data and obtaining summary statistics

The Stata help system

Using keyword search
Opening a dialog box
Complete documentation in Stata manuals
Statalist
Not elsewhere classified

Stata command syntax

Syntax of summarize
Learning syntax using the review window

Copying and pasting
Using a log file

Using the data browser
Using Stata graphics

Histograms
Scatter diagrams

Using Stata Do-files
Creating and managing variables

Creating (generating) new variables
Using the expression builder
Dropping or keeping variables and observations
Using arithmetic operators
Using Stata math functions

Using Stata density functions

Cumulative distribution functions
Inverse cumulative distribution functions

Using and displaying scalars

Example of standard normal cdf
Example of t-distribution tail-cdf
Example computing percentile of the standard normal
Example computing percentile of the t-distribution

A scalar dialog box
Using temporary scalars
Chapter 1 Do-file

CHAPTER 2 SIMPLE LINEAR REGRESSION

The food expenditure data

Starting a new problem
Starting a log file
Opening a Stata data file
Browsing and listing the data

Computing summary statistics
Creating a scatter diagram

Enhancing the plot

Regression

Fitted values and residuals
Plotting the fitted regression line

Using Stata to obtain predicted values

Using saved coefficients
Using lincom
Using the margins command
Using incomplete observations
Computing an elasticity

OLS estimator variances and covariance

Estimating the variance of the error term
Viewing estimated variances and covariance
Saving the Stata data file

Estimating nonlinear relationships

A log-linear model

Regression with indicator variables

Appendix 2A Average marginal effects

Elasticity in a linear relationship
Slope in a log-linear model

Appendix 2B Simulation experiments

Fixed x’s
Random x’s

Chapter 2 Do-file

CHAPTER 3 INTERVAL ESTIMATION AND HYPOTHESIS TESTING

Interval estimates

Critical values from the t-distribution
Creating an interval estimate
Creating an interval estimate using lincom

Hypothesis tests

Right-tail test of significance
Right-tail test of an economic hypothesis
Left-tail test of an economic hypothesis
Two-tail test of an economic hypothesis
Two-tail test of significance

p-values

p-value of a right-tail test
p-value of a left-tail test
p-value for a two-tail test
p-values in Stata output
Testing and estimating linear combinations of parameters

Appendix Graphical tools

Appendix Monte Carlo simulation

Fixed x’s
Random x’s

Chapter 3 Do-file

CHAPTER 4 PREDICTION, GOODNESS-OF-FIT AND MODELING ISSUES

Least squares prediction

Editing the data
Estimate the regression and obtain postestimation results
Creating the prediction interval
Using margins to create the prediction interval

Measuring goodness-of-fit

Correlations and R²

The effects of scaling and transforming the data

Reporting regression results
The linear-log functional form
Plotting the fitted linear-log model
Editing graphs

Analyzing the residuals

Residual plots
The Jarque-Bera test
Chi-square distribution critical values
Chi-square distribution p-values

Polynomial models

Estimating and checking the linear relationship
Estimating and checking a cubic equation
Estimating a log-linear yield growth model

Estimating a log-linear wage equation

The log-linear model
Calculating wage predictions
Constructing wage plots
Generalized R²
Prediction intervals in the log-linear model
Prediction intervals in the log-linear model using margins

A log-log model

Chapter 4 Do-file

CHAPTER 5 MULTIPLE LINEAR REGRESSION

The Hamburger Chain Model
Least Squares Estimation

Least squares procedure
Least squares prediction
Rescaling the variables
Estimating the error variance
Measuring the goodness-of-fit
Frisch-Waugh-Lovell

Least Squares Precision
Confidence Intervals

Changing the confidence level
Linear combination of parameters

Hypothesis Tests

Two-sided t-test
One-sided t-test
Testing a linear combination of parameters

Interaction Variables

Polynomial regressors
Using factor variables for interactions
Interactions with other variables
Maximizing wages via experience

Appendix Nonlinear functions of a single parameter
Appendix Nonlinear functions of two parameters
Appendix Least squares estimation with chi-square errors
Appendix Monte Carlo simulation of the delta method
Appendix Bootstrapping

Chapter 5 Do-file

CHAPTER 6 FURTHER INFERENCE IN THE MULTIPLE REGRESSION MODEL

Testing joint hypotheses: The F-test

Testing the significance of the model
Relationship between t- and F-tests
More general F-tests
Large sample tests
Nonlinear hypothesis tests

Stata programs
Nonsample information
Model specification

Omitted variables
Irrelevant variables
Choosing the model
RESET test for functional form
RESET program
Control variables
Prediction-forecast error variance
Prediction-model selection and RMSE

Poor data, collinearity, and insignificance

Variance inflation factors
Influential observations

Nonlinear least squares

Chapter 6 Do-file

CHAPTER 7 USING INDICATOR VARIABLES

Indicator variables

Creating indicator variables
Estimating an indicator variable regression
Testing the significance of the indicator variables
Further calculations
Computing average marginal effects

Applying indicator variables

Interactions between qualitative factors
Testing the equivalence of two regressions
Estimating separate regressions
Indicator variables in log-linear models

The linear probability model
Treatment effects
Differences-in-Differences estimation

Chapter 7 Do-file

CHAPTER 8 HETEROSKEDASTICITY

The nature of heteroskedasticity
Heteroskedastic-consistent standard errors
The generalized least squares estimator

Feasible GLS: a more general case
Feasible GLS with a heteroskedastic partition

Detecting heteroskedasticity

The Goldfeld-Quandt test using partitioned data
The Goldfeld-Quandt test in the food expenditure model
Lagrange multiplier tests
Heteroskedasticity in the linear probability model

Appendix Alternative robust sandwich estimators
Appendix Monte Carlo evidence

Chapter 8 Do-file

CHAPTER 9 REGRESSION WITH TIME-SERIES DATA: STATIONARY VARIABLES

Introduction

Defining time-series in Stata
Time-series plots
Stata’s lag and difference operators

Correlogram
The AR(2) model
Autoregressive distributed lag models

Forecasts and forecast intervals
Model selection
Granger causality

Serial correlation in residuals

Detecting autocorrelation in residuals
Okun’s Law
HAC standard errors
Nonlinear least squares
Feasible GLS

The consumption function
Multipliers for an IDL model
Durbin-Watson Test

Chapter 9 Do-file

CHAPTER 10 ENDOGENOUS REGRESSORS AND MOMENT BASED ESTIMATION

Least squares estimation of a wage equation
Two-stage least squares
IV estimation with surplus instruments

Illustrating partial correlations

The Hausman test for endogeneity
Testing the validity of surplus instruments
Testing for weak instruments
Calculating the Cragg-Donald F-statistic
Illustrations using simulated data
A simulation experiment

Chapter 10 Do-file

CHAPTER 11 SIMULTANEOUS EQUATIONS MODELS

Key Terms
Truffle supply and demand
Estimating the reduced form equations
2SLS estimates of truffle demand
2SLS estimates of truffle supply
Supply and demand of fish
Reduced forms for fish price and quantity
2SLS estimates of fish demand
2SLS alternatives
Monte Carlo simulation

Chapter 11 Do-file

CHAPTER 12 REGRESSION WITH TIME-SERIES DATA: NONSTATIONARY VARIABLES

Key Terms
Stationary and nonstationary data

Review: generating dates in Stata
Extracting dates
Graphing the data
Summary statistics using subsamples
Correlogram

Deterministic trends
Spurious regressions
Unit root tests for stationarity

Is GDP trend stationary?
Is wheat yield stationary?

Integration and cointegration

Order of integration
Engle-Granger test
The error correction model
Regression with no cointegration

Chapter 12 Do-file

CHAPTER 13 VECTOR ERROR CORRECTION AND VECTOR AUTOREGRESSIVE MODELS

VEC and VAR models
Estimating a VEC model
Estimating a VAR
Impulse responses and variance decompositions

Chapter 13 Do-file

CHAPTER 14 TIME-VARYING VOLATILITY AND ARCH MODELS

Key Terms
ARCH model and time-varying volatility
Simulating ARCH
Testing, estimating and forecasting
Extensions

GARCH
Threshold GARCH
GARCH-in-mean

Chapter 14 Do-file

CHAPTER 15 PANEL DATA MODELS

Key Terms

A microeconomic panel
The fixed-effects estimator

The difference estimator: T = 2
The within estimator: T = 2
The within estimator: T = 3
The fixed-effects estimator: xtreg
The least squares dummy variable estimator
Testing for fixed effects

Panel data regression error assumptions

OLS estimation with cluster-robust standard errors
Fixed-effects estimation with cluster-robust standard errors
Random-effects estimation of a production function
Random-effects estimation of a wage equation
Testing for random-effects
The Hausman contrast test for the production function
The Hausman contrast test for the wage equation
A regression based Hausman test for the production function
A regression based Hausman test for the wage equation
The Hausman-Taylor estimator

Chapter 15 Do-file

CHAPTER 16 QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Key Terms

Models with binary dependent variables

The linear probability model
Probit: a small example
Probit: the transportation data
Marginal effects
Probit marginal effects: details
Standard error of average marginal effect

The logit model for binary choice

Wald tests
Likelihood ratio tests
Binary choice models with a continuous endogenous variable

Multinomial logit
Conditional logit

Estimation using asclogit

Ordered choice models
Models for count data
Censored data models
Selection bias

Appendix 16D Tobit Monte Carlo experiment

Chapter 16 Do-file

APPENDIX A REVIEW OF MATH ESSENTIALS

Key Terms

Stata math and logical operators
Math functions
Extensions to generate
The calculator
Scientific notation
Logarithms
Numerical derivatives and integrals

Appendix A Do-file

APPENDIX B REVIEW OF PROBABILITY

Key Terms

Stata probability functions
Binomial distribution
Poisson distribution
Normal distribution

Normal density plots
Normal probability calculations

Chi-square distribution

Plotting the chi-square density
Chi-square probability calculations
The non-central chi-square pdf

Student’s t-distribution

Plot of standard normal and t(3)
t-distribution probabilities
Graphing tail probabilities
The non-central t-distribution

F-distribution

Plotting the F-density
F-distribution probability calculations
The non-central F-distribution

The log-normal distribution
Random numbers
Using inversion method
Creating uniform random numbers

Appendix B Do-file

APPENDIX C REVIEW OF STATISTICAL INFERENCE

Key Terms

Examining the hip data

Constructing a histogram
Obtaining summary statistics
Estimating the population mean

Using simulated data values
The central limit theorem
Estimating population moments
Interval estimation

Computing confidence intervals
Using simulated data
Using the hip data

Testing the mean of a normal population

Right-tail test
Two-tail test

Testing the variance of a normal population
Testing the equality of two normal population means

Population variances are equal
Population variances are unequal

Testing the equality of two normal population variances
Testing normality
Maximum likelihood estimation

Testing a population proportion
Likelihood ratio test
Wald test
Lagrange multiplier test

Least squares
Kernel density estimator

Appendix C Do-file

Author: Lee C. Adkins and R. Carter Hill
Edition: Fifth Edition
ISBN: 978-1-119-46324-5