GOALS
This tutorial demonstrates GMM estimation of an instrumental variables model using the gmmFit and gmmFitIV procedures. After completing this tutorial, you should be able to estimate an instrumental variables model using:
- The gmmFitIV procedure
- The gmmFit procedure
INTRODUCTION
In this example, we will expand on the OLS model to estimate an instrumental variables model. We will again demonstrate how to estimate the model using both gmmFit and gmmFitIV. The linear model relates the dependent variable rent to housing values (hsngval) and the percentage of the population living in urban areas (pcturban).
The data for this model is stored in the GAUSS dataset “hsng.dat”.
The new addition to this model is the endogeneity of the variable hsngval. To address this endogeneity, we will instrument for hsngval using pcturban, family income (faminc), and three regional dummies (reg2, reg3, reg4).
ESTIMATION WITH gmmFitIV
The gmmFitIV procedure uses the GAUSS formula string syntax to set up estimation. In the case of the instrumental variables model you must include three pieces of information to set up the model:
- The dataset name.
- A formula string representing the model.
- An instrumental variable string.
//Dataset
dataset = getGAUSShome $+ "examples/hsng.dat";
//Model formula
formula = "rent ~ hsngval + pcturban";
//String of instrumental variables
inst_var = "pcturban + faminc + reg2 + reg3 + reg4";
call gmmFitIV(dataset, formula, inst_var);
Note: We are using the formula string syntax to specify our model. Using this syntax, our instrumental variable string is constructed as a list of variables separated by a +, variable_1 + variable_2 + ... + variable_k.
The output from our gmmFitIV estimation reads
Dependent Variable: rent
Number of Observations: 50
Number of Moments: 0
Number of Parameters: 3
Degrees of freedom: 47
                             Standard                 Prob
Variable        Estimate        Error     t-value     >|t|
-----------------------------------------------------------
CONSTANT      112.122713    10.545763      10.632    0.000
hsngval         0.001464     0.000404       3.627    0.001
pcturban        0.761548     0.264387       2.880    0.006
Instruments: pcturban, faminc, reg2, reg3, reg4, Constant
Hansen Test Statistic of the Moment Restrictions
Chi-Sq(3) = 6.9753314
P-value of J-stat: 0.072688216
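The 3 degrees of freedom reported for the Hansen test can be read as the number of overidentifying restrictions: six instruments (counting the constant) minus three parameters. The short GAUSS sketch below recomputes the p-value from the reported statistic. The variable names are illustrative, and the count of six moments is taken from the instrument list, not from the "Number of Moments" line of the output.

```gauss
// Values copied from the gmmFitIV output above
j_stat = 6.9753314;

// Assumed counts: instruments include the constant
n_moments = 6;    // Constant, pcturban, faminc, reg2, reg3, reg4
n_params = 3;     // CONSTANT, hsngval, pcturban

// Degrees of freedom = number of overidentifying restrictions
df = n_moments - n_params;

// cdfChic returns the upper-tail probability of the chi-square distribution
p_value = cdfChic(j_stat, df);

print "Degrees of freedom: " df;
print "P-value of J-stat:  " p_value;
```

The printed p-value should be close to the one shown in the output above. At the 5% level, the overidentifying restrictions would not be rejected.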
ESTIMATION WITH gmmFit
Load Data
When using the gmmFit procedure, we must first load the data and separate it into three matrices: y, x, and z.
//Load data file
data = loadd(getGAUSShome $+ "examples/hsng.dat", "rent + hsngval + pcturban + faminc + reg2 + reg3 + reg4");
//Extract x and y matrix
y = data[., 1];
x = data[., 2:3];
//Extract instrumental variables matrix
z = data[., 3:7];
//Add constant to z
z = ones(rows(z), 1)~z;
Write moment equation
The next step for our gmmFit estimation is to define our moment procedure. The instrumental variables model uses moments based on $E[z_t u_t(\theta_0)] = 0$, where $u_t(\theta_0) = y_t - \beta' x_t$. Note that the moment procedure now takes four inputs because z has been added.
proc meqn(b, yt, xt, zt);
local ut, dt;
/** OLS resids **/
ut = yt - b[1] - b[2]*xt[., 1] - b[3]*xt[., 2];
/** Moment conditions **/
dt = ut .* zt;
retp(dt);
endp;
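To see what the moment procedure returns, it can help to evaluate it by hand. The sketch below is an illustration only: it uses simulated toy data (not hsng.dat), hypothetical coefficient values, and a copy of the meqn procedure. It compares the column means of the moments at the true parameter vector and at a deliberately wrong one.

```gauss
// Toy data only: these values are simulated, not taken from hsng.dat
rndseed 2468;
n = 200;

// Instruments: a constant plus two simulated columns (hypothetical)
zt = ones(n, 1) ~ rndn(n, 2);

// Two regressors built from the instruments plus noise
xt = (zt[., 2] + rndn(n, 1)) ~ zt[., 3];

// Dependent variable generated with known coefficients 1, 2, 3
b_true = { 1, 2, 3 };
yt = b_true[1] + b_true[2]*xt[., 1] + b_true[3]*xt[., 2] + rndn(n, 1);

// meqn returns an n x 3 matrix: one row per observation,
// one column per instrument. meanc averages each column.
g_true = meanc(meqn(b_true, yt, xt, zt));
g_bad = meanc(meqn(zeros(3, 1), yt, xt, zt));

print "Mean moments at true b: " g_true';
print "Mean moments at b = 0:  " g_bad';

// Copy of the tutorial's moment procedure
proc (1) = meqn(b, yt, xt, zt);
    local ut, dt;
    ut = yt - b[1] - b[2]*xt[., 1] - b[3]*xt[., 2];
    dt = ut .* zt;
    retp(dt);
endp;
```

The sample moments should be close to zero at the true parameters and visibly away from zero at the wrong ones. GMM looks for the parameter vector that makes these averages as small as possible under the chosen weight matrix.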
Set model parameters
For this example, rather than setting specific starting values for the parameters, we will specify the number of parameters to be estimated using gctl.numParams. This specification will allow GAUSS to find starting parameters.
© 2025 Aptech Systems, Inc. All rights reserved.
//Declare gctl to be a gmmControl struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();
//Set number of parameters to estimate
gctl.numParams = 3;
We will also set up the initial weight matrix for the gmmFit estimation so it will replicate the default model of the gmmFitIV procedure.
In this model, the exogenous variables are contained in the data matrix z, and the default initial weight matrix used by gmmFitIV is $\left(\frac{1}{N} Z'Z\right)^{-1}$. We can tell gmmFit to use the same matrix through the gmmControl member gctl.wInitMat.
//Set initial weight matrix gctl.wInitMat = invpd((1/rows(z))*(z'z));
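As a brief aside on why this weight matrix is a natural starting point, the standard linear GMM algebra is sketched below. Here $X$ is assumed to contain a column of ones followed by hsngval and pcturban, and $\bar{g}$ denotes the sample average of the moments; this notation is introduced for illustration and is not taken from the procedures' documentation.

```latex
\begin{aligned}
\bar{g}(\beta) &= \frac{1}{N} Z'(y - X\beta), \\
\hat{\beta} &= \arg\min_{\beta} \; \bar{g}(\beta)' \, W \, \bar{g}(\beta), \\
W = \left(\tfrac{1}{N} Z'Z\right)^{-1}
\;\Rightarrow\;
\hat{\beta} &= \left( X'Z (Z'Z)^{-1} Z'X \right)^{-1} X'Z (Z'Z)^{-1} Z'y .
\end{aligned}
```

The last line is the two-stage least squares estimator, so the first-step estimate under this weight matrix coincides with 2SLS. Any later reweighting steps performed by gmmFit or gmmFitIV may move the final estimates away from this value.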
Finally, we add variable names. This time we add both the model variable names, using gctl.varNames, and the instrument names, using gctl.instNames.
//Variable names
gctl.varNames = { "hsngval", "pcturban", "rent" };
//Instrument names
gctl.instNames = { "pcturban", "faminc", "reg2", "reg3", "reg4" };
Call gmmFit
We are now ready to call gmmFit. Notice that this time z must be included as an input to gmmFit.
call gmmFit(&meqn, y, x, z, gctl);
The output from our gmmFit estimation reads
Dependent Variable: rent
Number of Observations: 50
Number of Moments: 6
Number of Parameters: 3
Degrees of freedom: 47
                             Standard                 Prob
Variable        Estimate        Error     t-value     >|t|
-----------------------------------------------------------
CONSTANT      112.122790    10.545745      10.632    0.000
hsngval         0.001464     0.000404       3.627    0.001
pcturban        0.761552     0.264387       2.880    0.006
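As a rough cross-check, the closed-form two-stage least squares estimate can be computed directly with matrix operations. This is a sketch, not part of the gmmFit workflow: the names xc, zx, zy, w, and b_2sls are illustrative, and the result need not match the table above exactly if gmmFit updates the weight matrix after the first step.

```gauss
// Load the same variables used in the tutorial
data = loadd(getGAUSShome $+ "examples/hsng.dat", "rent + hsngval + pcturban + faminc + reg2 + reg3 + reg4");

// Dependent variable
y = data[., 1];

// Regressors with a constant: constant, hsngval, pcturban
xc = ones(rows(data), 1) ~ data[., 2:3];

// Instruments with a constant: constant, pcturban, faminc, reg2, reg3, reg4
z = ones(rows(data), 1) ~ data[., 3:7];

// Cross-products used by the closed-form estimator
zx = z' * xc;
zy = z' * y;
w = invpd(z' * z);

// 2SLS: (X'Z (Z'Z)^-1 Z'X)^-1 X'Z (Z'Z)^-1 Z'y
b_2sls = invpd(zx' * w * zx) * (zx' * w * zy);

print "2SLS estimates (constant, hsngval, pcturban):";
print b_2sls;
```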
CONCLUSION
Congratulations! You have:
- Estimated an instrumental variables model using gmmFitIV.
- Estimated an instrumental variables model using gmmFit.
For convenience, the full program text is reproduced below.
//Dataset
dataset = getGAUSShome $+ "examples/hsng.dat";
//Model formula
formula = "rent ~ hsngval + pcturban";
//String of instrumental variables
inst_var = "pcturban + faminc + reg2 + reg3 + reg4";
call gmmFitIV(dataset, formula, inst_var);
//Load data file
data = loadd(getGAUSShome $+ "examples/hsng.dat", "rent + hsngval + pcturban + faminc + reg2 + reg3 + reg4");
//Extract x and y matrix
y = data[., 1];
x = data[., 2:3];
//Extract instrumental variables matrix
z = data[., 3:7];
//Add constant to z
z = ones(rows(z),1)~z;
//Declare gctl to be a gmmControl struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();
//Set number of parameters to estimate
gctl.numParams = 3;
//Set initial weight matrix
gctl.wInitMat = invpd((1/rows(z))*(z'z));
//Variable names
gctl.varNames = { "hsngval", "pcturban", "rent" };
//Instrument names
gctl.instNames = { "pcturban", "faminc", "reg2", "reg3", "reg4" };
call gmmFit(&meqn, y, x, z, gctl);
proc meqn(b, yt, xt, zt);
local ut,dt;
/** OLS resids **/
ut = yt - b[1] - b[2]*xt[.,1] - b[3]*xt[.,2];
/** Moment conditions **/
dt = ut .* zt;
retp(dt);
endp;
