[R] gmm -- Generalized method of moments estimation
Syntax
Interactive version
gmm ([reqname1:]rexp_1) ([reqname2:]rexp_2) ... [if] [in] [weight]
[, options]
Moment-evaluator program version
gmm moment_prog [if] [in] [weight],
{equations(namelist)|nequations(#)}
{parameters(namelist)|nparameters(#)} [options]
[program_options]
reqname_j is the jth residual equation name,
rexp_j is the substitutable expression for the jth residual equation, and
moment_prog is a moment-evaluator program.
options                     Description
-------------------------------------------------------------------------
Model
  derivative([reqname|#]/name = dexp_jk)
                            specify derivative of reqname (or #) with
                              respect to parameter name; can be
                              specified more than once (interactive
                              version only)
* twostep                   use two-step GMM estimator; the default
* onestep                   use one-step GMM estimator
* igmm                      use iterative GMM estimator
  variables(varlist)        specify variables in model
  nocommonesample           do not restrict estimation sample to be the
                              same for all equations

Instruments
  instruments([reqlist:]varlist[, noconstant])
                            specify instruments; can be specified more
                              than once
  xtinstruments([reqlist:]varlist, lags(#_1/#_2))
                            specify panel-style instruments; can be
                              specified more than once

Weight matrix
  wmatrix(wmtype[, independent])
                            specify weight matrix; wmtype may be
                              robust, cluster clustvar, hac kernel
                              [lags], or unadjusted
  center                    center moments in weight-matrix computation
  winitial(iwtype[, independent])
                            specify initial weight matrix; iwtype may
                              be unadjusted, identity, xt xtspec, or
                              the name of a Stata matrix

SE/Robust
  vce(vcetype[, independent])
                            vcetype may be robust, cluster clustvar,
                              bootstrap, jackknife, hac kernel lags, or
                              unadjusted
  quickderivatives          use alternative method of computing
                              numerical derivatives for VCE

Reporting
  level(#)                  set confidence level; default is level(95)
  title(string)             display string as title above the table of
                              parameter estimates
  title2(string)            display string as subtitle
  display_options           control columns and column formats, row
                              spacing, line width, display of omitted
                              variables and base and empty cells, and
                              factor-variable labeling

Optimization
  from(initial_values)      specify initial values for parameters
# igmmiterate(#)            specify maximum number of iterations for
                              iterated GMM estimator
# igmmeps(#)                specify # for iterated GMM parameter
                              convergence criterion; default is
                              igmmeps(1e-6)
# igmmweps(#)               specify # for iterated GMM weight-matrix
                              convergence criterion; default is
                              igmmweps(1e-6)
  optimization_options      control the optimization process; seldom
                              used

  coeflegend                display legend instead of statistics
-------------------------------------------------------------------------
* You can specify at most one of these options.
# These options may be specified only when igmm is specified.
program_options             Description
-------------------------------------------------------------------------
Model
  evaluator_options         additional options to be passed to the
                              moment-evaluator program
+ hasderivatives            moment-evaluator program can calculate
                              parameter-level derivatives
+ haslfderivatives          moment-evaluator program can calculate
                              linear-form derivatives
* equations(namelist)       specify residual equation names
* nequations(#)             specify number of residual equations
# parameters(namelist)      specify parameter names
# nparameters(#)            specify number of parameters
-------------------------------------------------------------------------
+ You may not specify both hasderivatives and haslfderivatives.
* You must specify equations(namelist) or nequations(#); you may specify
both.
# You must specify parameters(namelist) or nparameters(#); you may
specify both.
rexp_j and dexp_jk may contain factor variables and time-series
operators; see fvvarlist and tsvarlist.
bootstrap, by, jackknife, rolling, and statsby are allowed; see prefix.
Weights are not allowed with the bootstrap prefix.
aweights are not allowed with the jackknife prefix.
aweights, fweights, iweights, and pweights are allowed; see weight.
coeflegend does not appear in the dialog box.
See [R] gmm postestimation for features available after estimation.
rexp_j and dexp_jk are substitutable expressions, that is, Stata
expressions that also contain parameters to be estimated. The parameters
are enclosed in curly braces and must satisfy the naming requirements for
variables; {beta} is an example of a parameter. The notation
{lcname:varlist} is allowed for linear combinations of multiple
covariates and their parameters. For example, {xb: mpg price turn _cons}
defines a linear combination of the variables mpg, price, turn, and _cons
(the constant term). See Substitutable expressions under Remarks and
examples of [R] gmm.
Menu
Statistics > Endogenous covariates > Generalized method of moments
estimation
Description
gmm performs generalized method of moments (GMM) estimation. With the
interactive version of the command, you enter the residual equation for
each moment condition directly into the dialog box or on the command line
by using substitutable expressions. The moment-evaluator program version
gives you greater flexibility in exchange for increased complexity; with
this version, you write a program in an ado-file that calculates the
moments based on a vector of parameters passed to it.
gmm can fit both single- and multiple-equation models. It allows moment
conditions of the form E{z_i u_i(b)} = 0, where z_i is a vector of
instruments, and u_i(b) is an error term, as well as more general moment
conditions of the form E{h_i(z_i;b)} = 0. gmm works with
cross-sectional, time-series, and longitudinal (panel) data.
Options
    +-------+
----+ Model +------------------------------------------------------------
derivative([reqname|#]/name = dexp_jk) specifies the derivative of
residual equation reqname or # with respect to parameter name. If
reqname or # is not specified, gmm assumes that the derivative
applies to the first residual equation.
For a moment condition of the form E{z_ji u_ji(b)} = 0,
derivative(j/b_k = dexp_jk) is to contain a substitutable expression
for du_ji / db_k. If you gave the residual equation the name m, then
for a moment condition of the form E{z_mi u_mi(b)} = 0, you can
specify derivative(m/b_k = dexp_mk), using the name m in place of the
equation number.
dexp_jk uses the same substitutable expression syntax as is used to
specify residual equations. If you declare a linear combination in a
residual equation, you provide the derivative for the linear
combination; gmm then applies the chain rule for you. See example 4
below.
If you do not specify the derivative() option, gmm calculates
derivatives numerically. You must either specify no derivatives or
specify a derivative for each of the k parameters that appears in
each of the j residual equations unless the derivative is identically
zero. You cannot specify some analytic derivatives and have gmm
compute the rest numerically.
twostep, onestep, and igmm specify which estimator is to be used. You
can specify at most one of these options. twostep is the default.
twostep requests the two-step GMM estimator. gmm obtains parameter
estimates based on the initial weight matrix, computes a new weight
matrix based on those estimates, and then reestimates the parameters
based on that weight matrix.
onestep requests the one-step GMM estimator. The parameters are
estimated based on an initial weight matrix, and no updating of the
weight matrix is performed except when calculating the appropriate
variance-covariance (VCE) matrix.
igmm requests the iterative GMM estimator. gmm obtains parameter
estimates based on the initial weight matrix, computes a new weight
matrix based on those estimates, reestimates the parameters based on
that weight matrix, computes a new weight matrix, and so on, to
convergence. Iteration stops when the relative change in the
parameter vector is less than igmmeps() and the relative change in
the weight matrix is less than igmmweps(), or when igmmiterate()
iterations have been completed. Hall (2005, sec. 2.4 and 3.6) mentions that
there may be gains to finite-sample efficiency from using the
iterative estimator.
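As a rough illustration of the three estimators, the sketch below fits one overidentified linear model three times and tabulates the results side by side. The auto dataset, the model, and the stored-estimate names one, two, and iter are choices made for this illustration only.

```stata
* Sketch: same model, three estimators (illustrative model and names)
sysuse auto, clear

* One-step: uses the initial weight matrix only
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) onestep
estimates store one

* Two-step (the default)
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) twostep
estimates store two

* Iterative: updates the weight matrix until convergence
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) igmm
estimates store iter

* Coefficients and standard errors side by side
estimates table one two iter, se
```

Because the model is overidentified, the three sets of estimates can differ; in an exactly identified model, they would be expected to coincide.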
variables(varlist) specifies the variables in the model. gmm ignores
observations for which any of these variables has a missing value. If
you do not specify variables(), then gmm assumes all the observations
are valid and issues an error message if any residual equations
evaluate to missing for any observations at the initial value of the
parameter vector.
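A minimal sketch of variables(), assuming the auto dataset, in which rep78 is missing for a few cars. Listing the model variables lets gmm drop the incomplete observations before estimation. The model itself is only an illustration.

```stata
* Sketch: restrict the sample to observations with no missing values
* in the listed variables (illustrative model)
sysuse auto, clear
gmm (mpg - {b1}*rep78 - {b0}), instruments(rep78) variables(mpg rep78)

* Number of observations actually used
display e(N)
```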
nocommonesample requests that gmm not restrict the estimation sample to
be the same for all equations. By default, gmm will restrict the
estimation sample to observations that are available for all
equations in the model, mirroring the behavior of other
multiple-equation estimators such as nlsur, sureg, or reg3. For
certain models, however, different equations can have different
numbers of observations. For these models, you should specify
nocommonesample. See the dynamic panel-data examples for one type of
model where this option is needed. You cannot specify weights if you
specify nocommonesample.
    +-------------+
----+ Instruments +------------------------------------------------------
instruments([reqlist:] varlist[, noconstant]) specifies a list of
instrumental variables to be used. If you specify a single residual
equation, then you do not need to specify the equations to which the
instruments apply; you can omit the reqlist and simply specify
instruments(varlist). By default, a constant term is included in
varlist; to omit the constant term, use the noconstant suboption:
instruments(varlist, noconstant).
If your model has multiple moment conditions of the form
      { z_1i u_1i(b) }
     E{ ............ } = 0
      { z_qi u_qi(b) }
then you can specify multiple corresponding residual equations. Then
specify the reqname or an reqlist to indicate the residual equations
for which the list of variables is to be used as instruments if you
do not want that list applied to all the residual equations. For
example, you might type
gmm (main: rexp_1) (rexp_2) (rexp_3), instruments(z1 z2)
instruments(2: z3) instruments(main 3: z4)
Variables z1 and z2 will be used as instruments for all three
equations, z3 will be used as an instrument for the second equation,
and z4 will be used as an instrument for the first and third
equations. Notice that we chose to supply a name for the first
residual equation but not the second two, identifying each by its
equation number.
varlist may contain factor variables and time-series operators; see
fvvarlist and tsvarlist, respectively.
xtinstruments([reqlist:] varlist, lags(#_1/#_2)) is for use with
panel-data models in which the set of available instruments depends
on the time period. As with instruments(), you can prefix the list
of variables with residual equation names or numbers to target
instruments to specific equations. Unlike with instruments(), a
constant term is not included in varlist. You must xtset your data
before using this option; see xtset.
If you specify
gmm ..., xtinstruments(x, lags(1/.)) ...
then for panel i and period t, gmm uses x_(i,t-1), x_(i,t-2), ...,
x_i1 as instruments. More generally, specifying xtinstruments(x,
lags(#_1/#_2)) uses x_(i,t-#_1), ..., x_(i,t-#_2) as instruments;
setting #_2 = . requests all available lags. #_1 and #_2 must be
zero or positive integers.
gmm automatically excludes observations for which no valid
instruments are available. It does, however, include observations for
which only a subset of the lags is available. For example, if you
request that lags one through three be used, then gmm will include
the observations for the second and third time periods even though
fewer than three lags are available as instruments.
    +---------------+
----+ Weight matrix +----------------------------------------------------
wmatrix(wmtype[, independent]) specifies the type of weight matrix to be
used in conjunction with the two-step and iterated GMM estimators.
Specifying wmatrix(robust) requests a weight matrix that is
appropriate when the errors are independent but not necessarily
identically distributed. wmatrix(robust) is the default.
Specifying wmatrix(cluster clustvar) requests a weight matrix that
accounts for arbitrary correlation among observations within clusters
identified by clustvar.
Specifying wmatrix(hac kernel #) requests a heteroskedasticity- and
autocorrelation-consistent (HAC) weight matrix using the specified
kernel (see below) with # lags. The bandwidth of a kernel is equal
to the number of lags plus one.
Specifying wmatrix(hac kernel opt [#]) requests an HAC weight matrix
using the specified kernel, and the lag order is selected using Newey
and West's (1994) optimal lag-selection algorithm. # is an optional
tuning parameter that affects the lag order selected; see the
discussion in [R] gmm.
Specifying wmatrix(hac kernel) requests an HAC weight matrix using
the specified kernel and N-2 lags, where N is the sample size.
There are three kernels available for HAC weight matrices, and you
can request each one by using the name used by statisticians or the
name perhaps more familiar to economists:
bartlett or nwest requests the Bartlett (Newey-West) kernel;
parzen or gallant requests the Parzen (Gallant) kernel; and
quadraticspectral or andrews requests the quadratic spectral
(Andrews) kernel.
Specifying wmatrix(unadjusted) requests a weight matrix that is
suitable when the errors are homoskedastic. In some applications,
the GMM estimator so constructed is known as the (nonlinear)
two-stage least-squares (2SLS) estimator.
Including the independent suboption creates a weight matrix that
assumes moment conditions are independent. This suboption is often
used to replicate other models that can be motivated outside the GMM
framework, such as the estimation of a system of equations by
system-wide 2SLS. This suboption has no effect if only one residual
equation is specified.
wmatrix() has no effect if onestep is also specified.
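The following sketch shows two of the wmtype choices described above. The first command assumes homoskedastic errors in a cross-sectional model on the auto dataset; the second reuses the consumption CAPM model from the examples in this entry with a Bartlett kernel. The choice of 4 lags is arbitrary and made only for illustration.

```stata
* Sketch 1: unadjusted weight matrix (homoskedastic errors)
sysuse auto, clear
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) wmatrix(unadjusted)

* Sketch 2: HAC weight matrix, Bartlett (Newey-West) kernel, 4 lags
* (the lag length is an arbitrary illustrative choice)
webuse cr, clear
generate clc = c / L.c
generate lcllc = L.c / L2.c
gmm (1 - {b=1}*(1+F.r)*(F.c/c)^(-1*{g})), ///
    inst(clc lcllc r L.r L2.r) wmatrix(hac nwest 4)
```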
center requests that the sample moments be centered (demeaned) when
computing GMM weight matrices. By default, centering is not done.
winitial(iwtype[, independent]) specifies the weight matrix to use to
obtain the first-step parameter estimates.
Specifying winitial(unadjusted) requests a weight matrix that assumes
the moment conditions are independent and identically distributed.
This matrix is of the form (Z'Z)^-1, where Z represents all the
instruments specified in the instruments() option. To avoid a
singular weight matrix, you should specify at least q-1 moment
conditions of the form E{z_hi u_hi(b)} = 0, where q is the number of
moment conditions, or you should specify the independent suboption.
Including the independent suboption creates a weight matrix that
assumes moment conditions are independent. Elements of the weight
matrix corresponding to covariances between two moment conditions are
set equal to zero. This suboption has no effect if only one residual
equation is specified.
winitial(unadjusted) is the default.
winitial(identity) requests that the identity matrix be used.
winitial(xt xtspec) is for use with dynamic panel-data models in
which one of the residual equations is specified in first-differences
form. xtspec is a string consisting of the letters "L" and "D", the
length of which is equal to the number of residual equations in the
model. You specify "L" for a residual equation if that residual
equation is written in levels, and you specify "D" for a residual
equation if it is written in first differences; xtspec is not case
sensitive. When you specify this option, you can specify at most one
residual equation in levels and one residual equation in first
differences. See the dynamic panel-data examples below.
winitial(matname) requests that Stata matrix matname be used. You
cannot specify the independent suboption if you specify
winitial(matname).
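A short sketch of two alternatives to the default initial weight matrix. The matrix name W0 is hypothetical; it is 5 x 5 here because the illustrative model has four instruments plus the constant, giving five moment conditions.

```stata
* Sketch: alternative initial weight matrices (illustrative model)
sysuse auto, clear

* Identity matrix as the initial weight matrix
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) winitial(identity)

* User-supplied matrix: W0 is a hypothetical name; its dimension must
* equal the number of moment conditions (4 instruments + constant = 5)
matrix W0 = I(5)
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) winitial(W0)
```

With the two-step estimator, the initial weight matrix affects only the first-step estimates, so both commands would be expected to give similar final results.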
    +-----------+
----+ SE/Robust +--------------------------------------------------------
vce(vcetype [, independent]) specifies the type of standard error
reported, which includes types that are robust to some kinds of
misspecification (robust), that allow for intragroup correlation
(cluster clustvar), and that use bootstrap or jackknife methods
(bootstrap, jackknife); see [R] vce_option.
vce(unadjusted) specifies that an unadjusted (nonrobust) VCE matrix
be used; this, along with the twostep option, results in the "optimal
two-step GMM" estimates often discussed in textbooks.
The default vcetype is based on the wmtype specified in the wmatrix()
option. If wmatrix() is specified but vce() is not, then vcetype is
set equal to wmtype. To override this behavior and obtain an
unadjusted (nonrobust) VCE matrix, specify vce(unadjusted).
Specifying vce(bootstrap) or vce(jackknife) results in standard
errors based on the bootstrap or jackknife, respectively. See [R]
vce_option, [R] bootstrap, and [R] jackknife for more information on
these VCEs.
The syntax for vcetypes other than bootstrap and jackknife is
identical to that for wmatrix().
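To make the default rule concrete, the sketch below fits the same illustrative model twice: once letting the VCE type follow wmatrix(), and once overriding it. The model and dataset are assumptions made for illustration.

```stata
* Sketch: how vce() interacts with wmatrix() (illustrative model)
sysuse auto, clear

* vce() omitted: the VCE type follows wmatrix(), here robust
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) wmatrix(robust)

* Same weight matrix, but an unadjusted (nonrobust) VCE
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) ///
    wmatrix(robust) vce(unadjusted)
```

The point estimates should agree across the two fits; only the reported standard errors are expected to change.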
quickderivatives requests that an alternative method be used to compute
the numerical derivatives for the VCE. This option has no effect if
you specify the derivative(), hasderivatives, or haslfderivatives
option.
The VCE depends on a matrix of partial derivatives that gmm must
compute numerically unless you supply analytic derivatives. This
Jacobian matrix will be especially large if your model has many
instruments, residual equations, or parameters.
By default, gmm computes each element of the Jacobian matrix
individually, searching for an optimal step size each time. Although
this procedure results in accurate derivatives, it is computationally
taxing: gmm may have to evaluate the moments of your model five or
more times for each element of the Jacobian matrix.
When you specify the quickderivatives option, gmm computes all
derivatives corresponding to a parameter at once, using a fixed step
size proportional to the parameter's value. This method requires
just two evaluations of the model's moments to compute an entire
column of the Jacobian matrix and therefore has the most impact when
you specify many instruments or residual equations.
Most of the time, the two methods produce virtually identical
results, but the quickderivatives method may fail if a residual
equation is highly nonlinear or if instruments differ by orders of
magnitude. In the rare case where you specify quickderivatives and
obtain suspiciously large or small standard errors, try refitting
your model without this option.
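One way to follow the advice above is to fit the model both ways and compare the standard errors. The sketch below does this for the exponential regression model used in the examples of this entry; the stored-estimate names slow and quick are hypothetical.

```stata
* Sketch: compare default and quick numerical derivatives
webuse docvisits, clear

* Default: element-by-element derivatives with step-size search
gmm (docvis - exp({xb:private chronic female income} + {b0})), ///
    instruments(private chronic female age black hispanic) onestep
estimates store slow

* Fixed-step derivatives, one Jacobian column at a time
gmm (docvis - exp({xb:private chronic female income} + {b0})), ///
    instruments(private chronic female age black hispanic) ///
    onestep quickderivatives
estimates store quick

* Large discrepancies in the standard errors would be a warning sign
estimates table slow quick, se
```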
    +-----------+
----+ Reporting +--------------------------------------------------------
level(#); see [R] estimation options.
title(string) specifies an optional title that will be displayed just
above the table of parameter estimates.
title2(string) specifies an optional subtitle that will be displayed
between the title specified in title() and the table of parameter
estimates. If title2() is specified but title() is not, title2() has
the same effect as title().
display_options: noci, nopvalues, noomitted, vsquish, noemptycells,
baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style),
cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R]
estimation options.
    +--------------+
----+ Optimization +-----------------------------------------------------
from(initial_values) specifies the initial values to begin the
estimation. You can specify a parameter name, its initial value,
another parameter name, its initial value, and so on, or you can
specify a 1 x k matrix, where k is the number of parameters in the
model. For example, to initialize alpha to 1.23 and delta to 4.57,
you would type
gmm ..., from(alpha 1.23 delta 4.57) ...
or equivalently
matrix define initval = (1.23, 4.57)
gmm ..., from(initval) ...
Initial values declared in the from() option override any that are
declared within substitutable expressions. If you specify a
parameter that does not appear in your model, gmm exits with an error
message. If you specify a matrix, the values must be in the same
order in which the parameters are declared in your model.
igmmiterate(#), igmmeps(#), and igmmweps(#) control the iterative process
for the iterative GMM estimator. These options can be specified only
if you also specify igmm.
igmmiterate(#) specifies the maximum number of iterations to perform
with the iterative GMM estimator. The default is the number set
using set maxiter, which is 16,000 by default.
igmmeps(#) specifies the convergence criterion used for successive
parameter estimates when the iterative GMM estimator is used.
The default is igmmeps(1e-6). Convergence is declared when the
relative difference between successive parameter estimates is
less than igmmeps() and the relative difference between
successive estimates of the weight matrix is less than
igmmweps().
igmmweps(#) specifies the convergence criterion used for successive
estimates of the weight matrix when the iterative GMM estimator
is used. The default is igmmweps(1e-6). Convergence is declared
when the relative difference between successive parameter
estimates is less than igmmeps() and the relative difference
between successive estimates of the weight matrix is less than
igmmweps().
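A minimal sketch of the three iteration controls used together. The iteration cap of 50 and the tolerances of 1e-8 are arbitrary values chosen for illustration, as is the model.

```stata
* Sketch: iterative GMM with a tighter tolerance and an iteration cap
* (the values 50 and 1e-8 are illustrative, not recommendations)
sysuse auto, clear
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) ///
    igmm igmmiterate(50) igmmeps(1e-8) igmmweps(1e-8)
```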
optimization_options: technique(), conv_maxiter(), conv_ptol(),
conv_vtol(), conv_nrtol(), tracelevel(). technique() specifies the
optimization technique to use; gn (the default), nr, dfp, and bfgs
are allowed. conv_maxiter() specifies the maximum number of
iterations; conv_ptol(), conv_vtol(), and conv_nrtol() specify the
convergence criteria for the parameters, gradient, and scaled
Hessian, respectively. tracelevel() allows you to obtain additional
details during the iterative process. See [M-5] optimize().
The following options pertain only to the moment-evaluator program
version of gmm:
    +-------+
----+ Model +------------------------------------------------------------
evaluator_options refer to any options allowed by your moment_prog.
hasderivatives and haslfderivatives indicate that you have written your
moment-evaluator program to compute derivatives. You may specify one
or the other but not both. If you do not specify either of these
options, gmm computes the derivatives numerically.
hasderivatives indicates that your moment-evaluator program computes
parameter-level derivatives.
haslfderivatives indicates that your moment-evaluator program
computes equation-level derivatives and is useful only when you
specify the parameters of your model using the {lcname:varlist}
syntax of the parameters() option.
See Details of moment-evaluator programs in [R] gmm for more
information.
equations(namelist) specifies the names of the residual equations in the
model. If you specify both equations() and nequations(), the number
of names in the former must match the number specified in the latter.
nequations(#) specifies the number of residual equations in the model.
If you do not specify names with the equations() option, gmm numbers
the residual equations 1, 2, 3, .... If you specify both equations()
and nequations(), the number of names in the former must match the
number specified in the latter.
parameters(namelist) specifies the names of the parameters in the model.
The names of the parameters must comply with the naming conventions
of Stata's variables; see [U] 11.3 Naming conventions.
Alternatively, you can use parameter equation notation to specify
linear combinations of parameters. Each linear combination is of the
form {lcname:varlist}, where varlist is one or more variable names.
Specify the system variable _cons in varlist to include a constant
term. Distinguish between {lcname:varlist}, in which lcname
identifies the linear combination, and (reqname:rexp), in which
reqname identifies the residual equation. When you use
linear-combination syntax, gmm prepends each element of the parameter
vector passed to your evaluator program with lcname: to generate
unique names.
If you specify both parameters() and nparameters(), the number of
names in the former must match the number specified in the latter.
nparameters(#) specifies the number of parameters in the model. If you
do not specify names with the parameters() option, gmm names them b1,
b2, ..., b#. If you specify both parameters() and nparameters(), the
number of names in the former must match the number specified in the
latter.
The following option is available with gmm but is not shown in the dialog
box:
coeflegend; see [R] estimation options.
Remarks
Remarks are presented under the following headings:
Interactive version
Moment-evaluator program version
Substitutable expressions
Interactive version
In many applications, the moment conditions can be written in the form
E{z_i u_i(b)} = 0
where i indexes observations, b is a p x 1 vector of parameters, u(b) is
a residual term, and z represents a vector of one or more instrumental
variables, z1, z2, ..., zq. Here you would type
. gmm (<expression for u_i(b)>), instruments(z1 z2 ... zq)
In other applications, we cannot write the moment conditions as the
product of a residual and a list of instruments but instead have the more
general moment conditions
E{h_i(b)} = 0
where h(b) is a q x 1 vector-valued function. Here you would type
. gmm (<expression for h_1i(b)>) (<expression for h_2i(b)>) ...
(<expression for h_qi(b)>)
where h_1i(b) is the first element of h(b), and so on.
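As a small worked case of the general form, the sketch below estimates a mean and a variance from two moment conditions with no instruments, h_1i(b) = y_i - mu and h_2i(b) = (y_i - mu)^2 - s2. The dataset, the parameter names mu and s2, and the starting values are assumptions made for illustration.

```stata
* Sketch: general moment conditions E{h_i(b)} = 0 (illustrative)
*   h_1i(b) = mpg - mu
*   h_2i(b) = (mpg - mu)^2 - s2
sysuse auto, clear
gmm (mpg - {mu=20}) ((mpg - {mu})^2 - {s2=30}), ///
    winitial(identity) onestep

* For comparison: sample mean and variance of mpg
summarize mpg
```

The model is exactly identified, so the estimate of mu should be close to the sample mean. The estimate of s2 corresponds to a divisor of N, whereas summarize reports the variance with divisor N-1.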
In yet other applications, your moment conditions might be of the form
      { z_1i u_1i(b) }
     E{ ............ } = 0
      { z_qi u_qi(b) }
where z_1i is a vector of instrumental variables z11, z12, ..., z1q1,
associated with the first residual term, u_1i(b), and so on. Here you
would type
. gmm (<expression for u_1i(b)>)
(<expression for u_2i(b)>) ...
(<expression for u_qi(b)>),
instruments(1: z11 z12 ... z1q1)
instruments(2: z21 z22 ... z2q2) ...
instruments(q: zq1 zq2 ... zqqq)
Of course, you can also combine moment conditions of the forms E{h_i(b)}
= 0 and E{z_ki u_ki(b)} = 0.
Moment-evaluator program version
Instead of defining the moment equations in the dialog box or on the
command line, you can write a program that evaluates them similarly to
how ml and the function-evaluator program version of nl work. We
illustrate the mechanics of a moment-evaluator program through a simple
example. Suppose we wish to fit the model
y_i = x_i'b + u_i
where we suspect that some elements of x are endogenous. We have as
instruments the vector z, consisting of the elements of x that are
exogenous and additional variables not correlated with u_i. In a GMM
framework, we can write our moment conditions as
E{z_i u_i(b)} = E{z_i(y_i - x_i'b)} = 0
Our first attempt at a moment-evaluator program is
program gmm_ivreg
    version 15.1
    syntax varlist [if] , at(name) rhs(varlist) depvar(varlist)
    tempvar m
    // accumulate the linear prediction x_i'b in `m'
    quietly gen double `m' = 0 `if'
    local i 1
    foreach var of varlist `rhs' {
        quietly replace `m' = `m' + `var'*`at'[1,`i'] `if'
        local ++i
    }
    quietly replace `m' = `m' + `at'[1,`i'] `if'    // constant
    // residual u_i(b) = y_i - x_i'b
    quietly replace `varlist' = `depvar' - `m' `if'
end
Say that our dependent variable, y_i, is mpg; x consists of gear_ratio,
turn, and a constant; and z consists of gear_ratio, length, headroom, and
a constant. Then, to fit our model, we would type
. gmm gmm_ivreg, nequations(1) nparameters(3)
instruments(gear_ratio length headroom) depvar(mpg)
rhs(gear_ratio turn)
First, notice that depvar() and rhs() are not options that the gmm
command recognizes. Therefore, gmm will pass those options to our
moment-evaluator program.
Our moment-evaluator program accepts a varlist. gmm will pass to our
program q variables in this varlist, where q is the number of moment
equations specified in the nequations() or equations() option. Because,
in our command, we specified nequations(1), the varlist will contain one
variable, which we are to fill in with our single moment equation u_i(b)
= y_i - x_i'b.
The parameter vector at which we are to evaluate our moments is passed in
the required at() option; all moment-evaluator programs must accept this
option. In our calling command, we specified nparameters(3), so the `at'
vector passed to our program will be 1 x 3.
We wrote our moment-evaluator program to also accept the depvar() and
rhs() options. That way, we can fit other regression models with
endogenous regressors simply by changing the variables we specify in
those options and the instruments() option. Unlike commands such as
ivregress designed specifically for linear regression with endogenous
regressors, with gmm we must specify the complete instrument list,
including exogenous regressors, in the instruments() option.
Our program also accepts an if condition because that is how gmm
communicates the estimation sample. For all the commands that operate on
variables, we include the expression `if' to restrict their operations to
the estimation sample.
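One way to sanity-check an evaluator like this is to compare it with ivregress on the same model. The sketch below redefines gmm_ivreg so that it is self-contained and then fits the model with the one-step estimator, which the examples in this entry pair with ivregress 2sls. The comparison itself is an illustration, not part of the original example.

```stata
* Sketch: check the moment-evaluator program against ivregress 2sls
capture program drop gmm_ivreg
program gmm_ivreg
    version 15.1
    syntax varlist [if] , at(name) rhs(varlist) depvar(varlist)
    tempvar m
    quietly gen double `m' = 0 `if'
    local i 1
    foreach var of varlist `rhs' {
        quietly replace `m' = `m' + `var'*`at'[1,`i'] `if'
        local ++i
    }
    quietly replace `m' = `m' + `at'[1,`i'] `if'    // constant
    quietly replace `varlist' = `depvar' - `m' `if'
end

sysuse auto, clear
ivregress 2sls mpg gear_ratio (turn = length headroom)

* Parameter order follows rhs(): gear_ratio, turn, then the constant
gmm gmm_ivreg, nequations(1) nparameters(3)          ///
    instruments(gear_ratio length headroom)          ///
    depvar(mpg) rhs(gear_ratio turn) onestep
```

The coefficients should agree up to optimizer tolerance, though they are listed in a different order. The standard errors may differ because the two commands use different default VCE types.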
The method we just explained can be used to fit an arbitrary GMM model.
When some of the moments are linear in the parameters, we can instead
specify full equation names in the parameters() option and use matrix
score to compute linear combinations of variables rather than having to
loop through each variable as in our previous program. Thus we can write
the moment-evaluator program for our example as follows:
program gmm_ivreg_2
    version 15.1
    syntax varlist [if] , at(name) depvar(varlist)
    tempvar xb
    // linear combination x_i'b, using the column labels of `at'
    matrix score double `xb' = `at' `if', eq(#1)
    quietly replace `varlist' = `depvar' - `xb' `if'
end
Now, to fit our model, we type
. gmm gmm_ivreg_2, nequations(1) depvar(mpg)
parameters({mpg:gear_ratio turn _cons})
instruments(gear_ratio length headroom)
Because we specify full equation and variable names for each parameter,
the columns of the `at' vector passed to our program will be labeled so
that matrix score can compute the linear combination, and we no longer
need to include an option to pass the variable names into our program.
For simplicity, we included an option to specify the dependent variable,
but we could have used Stata's extended macro functions to obtain it from
the `at' vector as well.
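A possible sketch of that last idea follows. It assumes that the linear combination is named after the dependent variable, as in parameters({mpg:...}), so that the equation names on the columns of `at' identify it. The program name gmm_ivreg_3 is hypothetical.

```stata
* Sketch: recover the dependent variable from the column equation
* names of `at' (assumes the linear combination is named after it)
capture program drop gmm_ivreg_3
program gmm_ivreg_3
    version 15.1
    syntax varlist [if] , at(name)
    // equation names of the columns, e.g. "mpg mpg mpg"
    local eqs : coleq `at'
    local depvar : word 1 of `eqs'
    tempvar xb
    matrix score double `xb' = `at' `if', eq(#1)
    quietly replace `varlist' = `depvar' - `xb' `if'
end

sysuse auto, clear
gmm gmm_ivreg_3, nequations(1)                     ///
    parameters({mpg:gear_ratio turn _cons})        ///
    instruments(gear_ratio length headroom)
```

The trade-off is that the program now depends on a naming convention, so an explicit depvar() option may be easier to maintain.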
Substitutable expressions
You use substitutable expressions with the interactive version of gmm to
define your residual equations. Substitutable expressions are just like
any other
mathematical expression in Stata, except that the parameters of your
model are bound in braces.
You specify a substitutable expression for each equation in your system,
and you must follow three rules:
1. Parameters of the model are bound in curly braces: {b0},
{param}, etc. Parameter names must follow the same conventions
as variable names; see [U] 11.3 Naming conventions.
2. Initial values for parameters are given by including an equal
sign and the initial value inside the curly braces: {b0=1},
{param=3.571}, etc.
You can also specify initial values by using the from() option.
Initial values specified in from() override whatever initial
values are given within the substitutable expression. If you do
not specify an initial value for a parameter, it is initialized
to 0.
3. Linear combinations of variables can be included using the
notation {lc:varlist}: {xb: mpg price weight _cons}, {score: w x
z}, etc. Parameters of linear combinations are initialized to 0.
Substitutable expressions can include any mathematical expression
involving scalars and variables. See operator and exp for more
information on expressions.
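The rules above can be sketched with a small linear model. The starting values are arbitrary and chosen only to show where they go; the second command shows from() taking precedence over values given inside the braces.

```stata
* Sketch: initial values in substitutable expressions (illustrative)
sysuse auto, clear

* Starting values given inside the braces (rule 2)
gmm (mpg - {b1=5}*gear_ratio - {b0=10}), instruments(gear_ratio)

* from() overrides the values given inside the braces
gmm (mpg - {b1=5}*gear_ratio - {b0=10}), instruments(gear_ratio) ///
    from(b1 0 b0 20)

* Linear combination (rule 3); its parameters start at 0
gmm (mpg - {xb: gear_ratio turn _cons}), instruments(gear_ratio turn)
```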
Examples
Simple linear regression
. sysuse auto
. regress mpg gear_ratio turn
. gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}),
instruments(gear_ratio turn)
Same as above, with analytic derivatives
. gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}),
instruments(gear_ratio turn) derivative(/b1 = -1*gear_ratio)
derivative(/b2 = -1*turn) derivative(/b0 = -1)
Simple linear regression, using a linear combination
. gmm (mpg - {xb:gear_ratio turn} - {b0}), instruments(gear_ratio
turn)
Same as above, with analytic derivatives
. gmm (mpg - {xb:gear_ratio turn} - {b0}), instruments(gear_ratio
turn) derivative(/xb = -1) derivative(/b0 = -1)
Two-stage least squares (same as ivregress 2sls)
. ivregress 2sls mpg gear_ratio (turn = weight length headroom)
. gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}),
instruments(gear_ratio weight length headroom) onestep
Two-step GMM estimation (same as ivregress gmm)
. ivregress gmm mpg gear_ratio (turn = weight length headroom)
. gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}),
instruments(gear_ratio weight length headroom) wmatrix(robust)
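This model has five moment conditions (four instruments plus the
constant) and three parameters, so it is overidentified. As a minimal
sketch, assuming the gmm command above was the most recent estimation
command, Hansen's J test of the overidentifying restrictions can be
requested with
. estat overid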
Estimation of the parameters of the gamma distribution (Greene 2018, 493)
. webuse greenegamma
. gmm (y - {P}/{lambda})
(y^2 - {P}*({P}+1)/{lambda}^2)
(ln(y) - digamma({P}) + ln({lambda}))
(1/y - {lambda}/({P}-1)),
from(P 2.41 lambda 0.08) winitial(identity)
Same as above, with analytic derivatives
. gmm (y - {P}/{lambda})
(y^2 - {P}*({P}+1)/{lambda}^2)
(ln(y) - digamma({P}) + ln({lambda}))
(1/y - {lambda}/({P}-1)),
from(P 2.41 lambda 0.08)
winitial(identity)
deriv(1/P = -1/{lambda})
deriv(2/P = -(2*{P}+1)/{lambda}^2)
deriv(3/P = -1*trigamma({P}))
deriv(4/P = {lambda}/({P}-1)^2)
deriv(1/lambda = {P}/{lambda}^2)
deriv(2/lambda = 2*{P}*({P}+1)/{lambda}^3)
deriv(3/lambda = 1/{lambda})
deriv(4/lambda = -1/({P}-1))
Estimation of a consumption CAPM model with one financial asset, using
first and second lags of consumption growth and two lags of returns as
instruments (Hamilton 1994, sec. 14.2)
. webuse cr
. generate clc = c / L.c
. generate lcllc = L.c / L2.c
. gmm (1 - {b=1}*(1+F.r)*(F.c/c)^(-1*{g})),
inst(clc lcllc r L.r L2.r)
Exponential (Poisson) regression with endogenous regressor income
. webuse docvisits, clear
. gmm (docvis - exp({xb:private chronic female income} + {b0})),
instruments(private chronic female age black hispanic) onestep
Same as above, specifying analytic derivatives and using the two-step
estimator
. gmm (docvis - exp({xb:private chronic female income} + {b0})),
instruments(private chronic female age black hispanic)
deriv(/xb = -1*exp({xb:} + {b0}))
deriv(/b0 = -1*exp({xb:} + {b0})) twostep
Using gmm to fit a maximum likelihood model (probit)
. webuse probitgmm
. global Phi "normal({b0}+{b1}*x)"
. global phi "normalden({b0}+{b1}*x)"
. gmm (y*$phi/$Phi - (1-y)*$phi/(1-$Phi))
((y*$phi/$Phi - (1-y)*$phi/(1-$Phi))*x),
winitial(identity) onestep
Using gmm to fit a nonlinear least-squares model (probit)
. global Phi "normal({b0}+{b1}*x)"
. global phi "normalden({b0}+{b1}*x)"
. gmm ((y - $Phi)*(-x*$phi)) ((y - $Phi)*(-1*$phi)),
winitial(identity) onestep
. nl (y = $Phi)
Using gmm to fit a dynamic panel-data model
. webuse abdata
. xtdpdsys n L(0/1).w, lags(1) twostep
. gmm (n - {rho}*L.n - {w}*w - {lagw}*L.w - {c})
(D.n - {rho}*LD.n - {w}*D.w - {lagw}*LD.w),
xtinstruments(1:D.n, lags(1/1))
xtinstruments(2:n, lags(2/.))
instruments(2:D.w LD.w, noconstant)
deriv(1/rho = -1*L.n)
deriv(1/w = -1*w)
deriv(1/lagw = -1*L.w)
deriv(1/c = -1)
deriv(2/rho = -1*LD.n)
deriv(2/w = -1*D.w)
deriv(2/lagw = -1*LD.w)
winitial(xt LD) wmatrix(robust) vce(unadjusted)
variables(L.n w L.w)
twostep nocommonesample
Using gmm to fit a dynamic panel-data model with predetermined
coterminous regressor k
. xtdpdsys n L(0/1).w, pre(k) lags(1) twostep
. gmm (n - {rho}*L.n - {k}*k - {w}*w - {lagw}*L.w - {c})
(D.n - {rho}*LD.n - {k}*D.k - {w}*D.w - {lagw}*LD.w),
xtinstruments(1:D.n, lags(1/1))
xtinstruments(1:D.k, lags(0/0))
xtinstruments(2:n, lags(2/.))
xtinstruments(2:k, lags(1/.))
instruments(2:D.w LD.w, noconstant)
deriv(1/rho = -1*L.n)
deriv(1/k = -1*k)
deriv(1/w = -1*w)
deriv(1/lagw = -1*L.w)
deriv(1/c = -1)
deriv(2/rho = -1*LD.n)
deriv(2/k = -1*D.k)
deriv(2/w = -1*D.w)
deriv(2/lagw = -1*LD.w)
winitial(xt LD) wmatrix(robust) vce(unadjusted)
variables(L.n k w L.w)
twostep nocommonesample
Stored results
gmm stores the following in e():
Scalars
e(N)                number of observations
e(k)                number of parameters
e(k_eq)             number of equations in e(b)
e(k_eq_model)       number of equations in overall model test
e(k_aux)            number of auxiliary parameters
e(n_moments)        number of moments
e(n_eq)             number of equations in moment-evaluator program
e(Q)                criterion function
e(J)                Hansen J chi-squared statistic
e(J_df)             J statistic degrees of freedom
e(k_i)              number of parameters in equation i
e(has_xtinst)       1 if panel-style instruments specified, 0 otherwise
e(N_clust)          number of clusters
e(type)             1 if interactive version, 2 if moment-evaluator
                    program version
e(rank)             rank of e(V)
e(ic)               number of iterations used by iterative GMM
                    estimator
e(converged)        1 if converged, 0 otherwise
Macros
e(cmd)              gmm
e(cmdline)          command as typed
e(title)            title specified in title()
e(title_2)          title specified in title2()
e(clustvar)         name of cluster variable
e(inst_i)           equation i instruments
e(eqnames)          equation names
e(winit)            initial weight matrix used
e(winitname)        name of user-supplied initial weight matrix
e(estimator)        onestep, twostep, or igmm
e(rhs)              variables specified in variables()
e(params_i)         equation i parameters
e(wmatrix)          wmtype specified in wmatrix()
e(vce)              vcetype specified in vce()
e(vcetype)          title used to label Std. Err.
e(params)           parameter names
e(sexp_i)           substitutable expression for equation i
e(evalprog)         moment-evaluator program
e(evalopts)         options passed to moment-evaluator program
e(nocommonesample)  nocommonesample, if specified
e(technique)        optimization technique
e(properties)       b V
e(estat_cmd)        program used to implement estat
e(predict)          program used to implement predict
e(marginsok)        predictions allowed by margins
e(marginsnotok)     predictions disallowed by margins
e(marginsprop)      signals to the margins command
e(asbalanced)       factor variables fvset as asbalanced
e(asobserved)       factor variables fvset as asobserved
Matrices
e(b)                coefficient vector
e(init)             initial values of the estimators
e(Wuser)            user-supplied initial weight matrix
e(W)                weight matrix used for final round of estimation
e(S)                moment covariance matrix used in robust VCE
                    computations
e(G)                averages of derivatives of moment conditions
e(N_byequation)     number of observations per equation, if
                    nocommonesample specified
e(V)                variance-covariance matrix
e(V_modelbased)     model-based variance
Functions
e(sample)           marks estimation sample
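As a sketch of how these results might be used, assume that an
overidentified two-step gmm model, such as the ivregress gmm equivalent
in the examples, has just been fit. The p-value below is computed by
hand from e(J) and e(J_df) with chi2tail() and is shown only for
illustration:
. display "J = " e(J) ", df = " e(J_df)
. display "p-value = " chi2tail(e(J_df), e(J))
. matrix list e(W)
. ereturn list
The last command lists everything that gmm left in e().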
References
Greene, W. H. 2018. Econometric Analysis. 8th ed. New York: Pearson.
Hall, A. R. 2005. Generalized Method of Moments. Oxford: Oxford
University Press.
Hamilton, J. D. 1994. Time Series Analysis. Princeton: Princeton
University Press.
Newey, W. K., and K. D. West. 1994. Automatic lag selection in covariance
matrix estimation. Review of Economic Studies 61: 631-653.