1. Basics
clear
set memory 80m
cd c:
cd \work\stata
insheet using water.txt
save water.dta
* use water.dta
log using water, replace
summarize _all
describe _all
Memory
C:\stata\wstata /k5000 set matsize 100
C:\stata\wstata /k5000 run c:\data\profile.do
Data files
infile x1 x2 x3 using test.txt
* only for a plain text file
insheet x1 x2 x3 using text.txt
* if the file was saved by a spreadsheet
save test, replace
append using test
* adds the observations in test.dta to the data in memory
use test
list
describe
Log File
log using test.log
log using test.log, replace
log using test.log, append
log close
log using test.log, noproc
Break
Ctrl-K
Ctrl-break
Regression
regress y x1 x2
predict yhat
regress y x1 x2, robust
vce
* variance-covariance
vce, corr
matrix v = get(vce)
coeff & pred
gen asif = _b[_cons] + _b[ed]*ed + _b[tenure]*tenure
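* A minimal sketch, not from the original note: a prediction built by hand from
* _b[] should match predict after the same regression.
* wage, ed and tenure are hypothetical variable names; yhat2 and asif2 are
* names made up here to avoid clashing with the variables above.
regress wage ed tenure
predict yhat2
gen asif2 = _b[_cons] + _b[ed]*ed + _b[tenure]*tenure
summarize yhat2 asif2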
test
regress y x1 x2
test x1 = x2
* b1 = b2
joint restrictions
test 2*(x1+x2) = 3*x3
test x4+x5 = 0, accum
* two joint restrictions
lr test
regress y x1 x2
lrtest, saving(0)
regress y x1 x2 x3
lrtest
non-linear restrictions
regress y x1 x2 x3
eq one: 3*_b[x2]^2 = _b[x3]
eq two: _b[x3] / _b[x2] = 2
testnl one two
by region: regress y x1 x2
by foreign: regress y x1 x2
graph y x1 x2 if foreign == 0, connect(.l) symbol(oi)
graph y x1 x2 if foreign == 1, connect(.l) symbol(oi)
t-test
ttest mpg, by(foreign)
* Ho: diff = 0, where foreign is a dummy variable
cii 97 24 6
* n=97, mean=24, std=6; reports the 95% c.i.
ttesti 97 24 6 22
* test Ho: mu = 22
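* A sketch of the interval behind the c.i. above, assuming the usual t-based
* formula mean +/- t(n-1, .025) * std / sqrt(n). invttail() is assumed to be
* available (it is in Stata 8).
display 24 - invttail(96, 0.025)*6/sqrt(97)
display 24 + invttail(96, 0.025)*6/sqrt(97)
* The two limits should come out at roughly 22.8 and 25.2.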
List
list x1 if x2 > 20
list x1-x5
list x1 x2 if x4 > 10 | (x5 > 3 & x6 > 10)
* ~= not equal, & and, | or, ~ not, >= greater than or equal
Sort
sort mpg
Creating new variables
gen lx1 = ln(x1)
* if the same variable is used, use "replace".
replace x1 = x1 / 1000
gen x3 = 1.05 * x1 if foreign == 0
replace x3 = 1.20 * x1 if foreign == 1
Clear
clear
drop _all
More
set more off
set more on
Descriptive statistics
summarize
sum if mpg > 20
sum if foreign == 0
sum x1, detail
by region: summarize x1 x2
Count
count if x == 1
count if y == float(1.1)
* precision issue
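* A small sketch of the precision issue, using made-up data (note that clear
* drops whatever is in memory). y is stored as a float by default, so it does
* not equal the double-precision literal 1.1 exactly; wrapping the literal in
* float() makes the comparison work.
clear
set obs 1
gen y = 1.1
count if y == 1.1
count if y == float(1.1)
* With the default float storage type, the first count is 0 and the second is 1.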
Tabulate
tab foreign
tab x2 foreign
tab x2 foreign, chi2
* Pearson chi-square test (df = (r-1)(c-1))
Correlate
corr x1 x2
corr x1 x2 if foreign == 0
Graph
graph x1 x2
sort foreign
graph x1 x2, by(foreign) total
* three graphs; 0, 1, total
Tutorial
tutorial intro
tutorial graphics
tutorial survival
tutorial logit
Long Line
* A semicolon delimiter should be used; each command then ends at the semicolon.
#delimit ;
summarize x1 x2
    if foreign == 1;
gen x3 = x1 + x2;
#delimit cr
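* A minimal sketch of an alternative, not part of the original note: a long
* command can also be continued by commenting out the line break with /* */,
* without changing the delimiter. x1, x2 and foreign are the same
* illustrative variables as above.
summarize x1 x2 /*
*/ if foreign == 1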
Do file
do myjob
do myjob.do
do myjob, nostop
* don't stop even with errors
Batch Jobs
* at DOS
c:\stata\wstata /b do bigjob.do
ADO files
which fit
type c:\stata\ado\f\fit.ado
type c:\stata\ado\f\fit.hlp
Three places to put ado files
Official: C:\stata\ado
Personal: C:\ado
Current: .
global S_ADO "C:\stata\ado;d:\ado;."
* to redefine paths
macro list S_ADO
CD
cd d:
cd \work\data
cd "\work\detailed data"
Lags and Leads
gen xlag1 = x[_n-1]
gen xlead1 = x[_n+1]
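* A sketch, not from the original note: x[_n-1] refers to the previous
* observation in the current sort order, so sort first. With panel data,
* restrict the lag to each unit, or use time-series operators after tsset.
* id and year are hypothetical variable names.
sort id year
by id: gen xlag1_p = x[_n-1]
tsset id year
gen xlag1_ts = L.x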
Procedures (Program)
program define hello
display "hi there"
end
do hello
Score
probit y x1 x2, score(u)
* the scores will be stored in u
Poisson Regression (Example provided by Todd)
#delimit ;
* Poisson regression (Ex. 5.3, Greene, p. 208);
* For Junsoo Lee;
input id y x ;
1 6 1.5;
2 7 1.8;
3 4 1.8;
4 10 2.0;
5 10 1.3;
6 6 1.6;
7 4 1.2;
8 7 1.9;
9 2 1.8;
10 3 1.0;
11 6 1.4;
12 5 0.5;
13 3 0.8;
14 3 1.1;
15 4 0.7;
end;
list;
* Poisson regression;
poisson y x ;
Poisson MLE (Example provided by David/Todd)
clear
insheet using c:\temp\poisson_data.txt
log using c:\temp\poisson_output.log, replace
/* this is the "canned" routine that estimates the poisson regression */
poisson y x
/* this maximizes lnL directly, using logged factorial of y */
program define poisreg1
args lnf theta
quietly replace `lnf' = -exp(`theta') + $ML_y1*(`theta') - lnfact($ML_y1)
end
ml model lf poisreg1 (y=x)
ml maximize
/* this maximizes lnL directly, using the logged gamma function */
program define poisreg2
version 6
args lnf theta
quietly replace `lnf' = -exp(`theta') + $ML_y1*(`theta') - lngamma($ML_y1 + 1)
end
ml model lf poisreg2 (y=x)
ml maximize
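* A note on the two programs above (a sketch, not part of the original example):
* both code the Poisson log likelihood
*   lnL_i = -exp(theta_i) + y_i*theta_i - ln(y_i!),  theta_i = b0 + b1*x_i,
* and they agree because ln(y!) = lngamma(y+1).
* A quick check for y = 5, where 5! = 120; both lines display the same number:
display lngamma(6)
display ln(120)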
Quick Panel Estimation
clear
set memory 40m
set more off
set matsize 350
log using panel.log, replace
use panel.dta, clear
tsset state year
regress y x1 x2 state2-state51 yr82-yr95
xtivreg y l1.y x1 x2 yr82-yr95 (l.y = l2.y), i(state) fe
xtivreg y l1.y x1 x2 yr82-yr95 (l.y = l2.y), i(state) fd
xtivreg y l1.y x1 x2 yr82-yr95 (l.y = l2.y), i(state) re ec2sls
xtabond y x1 x2 yr82-yr95, lags(1)
xtabond y x1 x2 yr82-yr95, lags(1) twostep
log close
On-line Help
help weibull
Help for ^brier^
2. Panel Data Models (I)
*
**********************************************
* Summary Note by Jing Li and Junsoo Lee
* Do file: panel_1.do   Output file: panel_1.log
*
* Commands: xt, xtdata, xtdes, xtsum, xttab,
* xtgls, xtreg, xtregar, xtivreg, xtabond
*
* September 2003
* ***********************************************
clear
cd "c:\upcd1\work\stata"
log using panel_1.log, replace
set mem 200m
set more off
set matsize 800
*****************************************
* xt *
*****************************************
use abdata.dta, clear
* use http://www.stata-press.com/data/r8/abdata, clear
* use http://www.stata-press.com/data/r8/nlswork, clear
* use http://www.stata-press.com/data/r8/union, clear
* tsset id year
** Some commands such as "xtabond" require tsset.
* iis id, clear
* tis year, clear
** iis and tis are alternatives to i() and t() option.
** These override previous setting specified by iis or tis.
** describe pattern of the panel-data
list in 1/6, separator(0) divider
xtdes, patterns(15) i(id) t(year)
*****************************************
* xtdata *
*****************************************
use nlswork.dta, clear
* use http://www.stata-press.com/data/r8/nlswork, clear
generate age2 = age^2
generate ttl_exp2 = ttl_exp^2
generate byte black = race==2
xtdata ln_w grade age* ttl_exp* tenure* black not_smsa south, be clear i(id)
** xtdata converts the data into a form suitable for between estimation.
regress ln_w grade age* ttl_exp* tenure* black not_smsa south
** Thus, this gives the be estimator.
* xtdata ln_w grade age* ttl_exp* tenure* black not_smsa south, fe clear i(id)
* regress ln_w grade age* ttl_exp* tenure* black not_smsa south
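** A rough sketch of what "xtdata, be" does, not part of the original note:
** collapse the data to one row of group means per panel, then run OLS.
** A shorter regressor list is used here for illustration. This assumes no
** missing values in the variables used; with missing values the group means,
** and hence the estimates, can differ from xtdata.
use nlswork.dta, clear
collapse (mean) ln_wage grade age ttl_exp tenure not_smsa south, by(idcode)
regress ln_wage grade age ttl_exp tenure not_smsa south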
*****************************************
* xtdes *
*****************************************
use nlswork.dta, clear
* use http://www.stata-press.com/data/r8/nlswork, clear
xtdes, patterns(15) i(id) t(year)
*****************************************
* xtsum *
*****************************************
xtsum wks_work
xtsum birth_yr
** As this is time invariant, its within std dev is zero.
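** A sketch of the within variation computed by hand (by_mean and by_dev are
** names made up here): subtract each panel's mean from the variable.
egen by_mean = mean(birth_yr), by(idcode)
gen by_dev = birth_yr - by_mean
summarize by_dev
** For a time-invariant variable, by_dev should be zero in every observation.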
*****************************************
* xttab *
*****************************************
xttab wks_work
xttab birth_yr
** As this is time invariant, its within percentage is 100.
*****************************************
* xtgls *
*****************************************
** xtgls fits "cross-sectional time-series" linear models using feasible GLS (not panel estimation).
use abdata.dta, clear
* use http://www.stata-press.com/data/r8/abdata, clear
** estimate the model using GLS
* Dep var = n (log of employment in firm i and time t)
* Regressors = w (log of wage) k (log of capital stock) ys (log of industry output)
xtgls n w k ys, i(id) t(year) nmk
** Estimating the model using default options (homoskedasticity, no autocorrelation)
** xtgls n w k ys, i(id) t(year) igls panels(correlated)
** ML estimation by specifying the igls option, which iterates the GLS estimates.
** The above does not work, since panels(correlated) requires a balanced panel.
** We now use a different data set, which is a balanced panel.
use invest2.dta, clear
* use http://www.stata-press.com/data/r8/invest2, clear
xtgls invest market stock, i(company) panels(iid) corr(independent) nmk
** same as regress (iid, homoskedasticity, no autocorrelation)
** nmk specifies std error to be normalized by n-k.
xtgls invest market stock, i(company) panels(hetero)
** iid, heteroskedasticity, no autocorrelation
xtgls invest market stock, i(company) t(time) panels(correlated)
** correlated, heteroskedasticity, no autocorrelation
xtgls invest market stock, i(company) t(time) panels(correlated) igls nolog
** correlated, heteroskedasticity, no autocorrelation
** MLE estimation by iterative GLS (1046 iterations for this case.)
xtgls invest market stock, i(company) panels(hetero) corr(ar1)
** iid, heteroskedasticity, common ar1 autocorrelation
xtgls invest market stock, i(company) panels(hetero) corr(psar1)
** iid, heteroskedasticity, hetero ar1 autocorrelation
xtgls invest market stock, i(company) t(time) panels(correlated) corr(psar1)
** correlated, heteroskedasticity, hetero ar1 autocorrelation
matrix list e(Sigma)
** Estimated cross-sectional covariances
predict new_inv1, xb
list new_inv1
*****************************************
* xtreg *
*****************************************
use abdata.dta, clear
* use http://www.stata-press.com/data/r8/abdata, clear
** estimate GLS random-effects model
xtreg n w k ys, re i(id) theta
xttest0
** Breusch and Pagan LM test for random effects, modified by Baltagi and Li (1990; see manual, p. 210)
xthausman
** Performs the Hausman specification test for RE versus FE.
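** A comparable sketch with "hausman" on stored results, not part of the
** original note. The names fixed and random are arbitrary, and this assumes
** Stata 8, where "estimates store" is available.
xtreg n w k ys, fe i(id)
estimates store fixed
xtreg n w k ys, re i(id)
estimates store random
hausman fixed random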
xtreg n w k ys, re i(id)
** RE GLS
xtreg n w k ys, mle i(id) nolog
** estimate ML RE model (suppressing iterations with nolog)
xtreg n w k ys, re i(id) sa
** RE: using the small-sample Swamy-Arora estimator by Baltagi and Chang (1994; see manual, p. 209)
xtreg n w k ys, pa i(id) nolog
** GEE population-averaged model; equivalent to the RE
** also equivalent to the following xtgee
xtgee n w k ys, family(gaussian) link(id) corr(exchangeable)
xtreg n w k ys, be i(id)
** Between estimator
xtreg n w k ys, be i(id) wls
** Between estimator
** (wls is used for an unbalanced panel, and a stabilized variance is used.)
xtreg n w k ys, fe i(id)
** Estimating the Fixed-effects model
*****************************************
* xtregar *
*****************************************
** FE and RE with AR(1) error
use grunfeld.dta, clear
* use http://www.stata-press.com/data/r8/grunfeld, clear
tsset
* tsset company year
xtregar invest mvalue kstock, fe
** Estimating the Fixed-effects model with ar(1) error
xtregar invest mvalue kstock, re
** Estimating the Random-effects model with ar(1) error
*****************************************
* xtivreg *
*****************************************
** Estimating instrumental variable panel data models
use abdata.dta, clear
* use http://www.stata-press.com/data/r8/abdata, clear
tsset id year
xtivreg n l2.n l(0/1).w l(0/2).(k ys) yr1977-yr1984 (l.n = l3.n), i(id) fd
** FD model
** dep = n
** ind =
** l.n = n(t-1) ... endogenous and instrumented
** l2.n = n(t-2) .. L2D
** l(0/1).w = w(t), w(t-1) .. D1 (level), LD (lagged)
** l(0/2).(k ys) = k(t), k(t-1), k(t-2); ys(t), ys(t-1), ys(t-2) .. D1, LD, L2D
** iv = l3.n = n(t-3) & all other exogenous variables
xtivreg n l2.n l(0/1).w l(0/2).(k ys) yr1977-yr1984 (l.n = l3.n), i(id) fd first small
xtivreg n w yr1977-yr1984 (k = ys), fe i(id)
** Fixed-effects model
xtivreg n w yr1977-yr1984 (k = ys), fe i(id) first
** Fixed-effects model, reporting the first stage result.
xtivreg n w (k = ys), be i(id) first
** Between-effects model
xtivreg n w (k = ys), re nosa i(id) first theta
** GLS Random-effects model
xtivreg n w (k = ys), re ec2sls i(id) first theta
** EC2SLS Random-effects model
*****************************************
* xtabond *
*****************************************
** Arellano-Bond estimator
use abdata.dta, clear
* use http://www.stata-press.com/data/r8/abdata, clear
xtabond n l(0/1).w l(0/2).(k ys) yr1977-yr1984, lag(2)
** One step estimator
** Sargan's test of over-identifying restrictions gives a p-value < 0.001.
** Sargan's test assumes homoskedasticity.
xtabond n l(0/1).w l(0/2).(k ys) yr1977-yr1984, lag(1) robust
** Still, one step estimator but reporting robust std error.
** The null of no AR(1) error is rejected, but the null of no AR(2) error is not rejected.
** The AR(1) error does not mean the one-step estimator is inconsistent.
** But if the null of no AR(2) error were rejected, the one-step estimator would be inconsistent, which is not the case here.
xtabond n l(0/1).w l(0/2).(k ys) yr1977-yr1984, lag(2) small
** requests that the t-stat and F-stat be reported instead of the Z-stat and chi-square stat.
xtabond n l(0/1).w l(0/2).(k ys) yr1977-yr1984, lag(2) twostep
** The std errors of the two-step estimator tend to be biased in small samples.
** Thus, the one-step estimator is recommended for inference, and the Sargan test from the two-step estimator is used for model specification.
xtabond n l(0/1).w l(0/2).(k ys) yr1977-yr1984, lag(2) twostep pre(w, lag(1,.)) pre(k, lag(2,.))
** predetermined regressors
xtabond n l(0/1).w l(0/2).(k ys) yr1977-yr1984, lag(2) twostep pre(w, lag(1,.) endog) pre(k, lag(2,.) endog)
** predetermined plus contemporaneously correlated with error
*****************************************
* More examples by Jing *
*****************************************
**** Note: try xt series of commands on "invest2.dta"
use invest2.dta,clear
* use http://www.stata-press.com/data/r8/invest2, clear
iis company
tis time
** describe pattern of the panel-data
xtdes, patterns(20)
** estimate the model using GLS
* Dep variable = invest
* Regressors = market stock
xtgls invest market stock, nmk panels(iid) corr(independent)
xtgls invest market stock, panels(hetero)
xtgls invest market stock, panels(correlated) corr(ar1)
xtgls invest market stock, panels(correlated) corr(psar1)
xtgls invest market stock, igls
gen lninvest = log(invest) /*try GLS with the log-level data*/
xtgls lninvest market stock
**** Note: try xt series of commands on "nlswork.dta"
use nlswork.dta,clear
* use http://www.stata-press.com/data/r8/nlswork, clear
iis idcode
tis year
** describe the patterns of the data
xtdes, patterns(30)
** estimate the model using 'xtreg'
* Dep variable = ln_wage
* Regressors = grade race age ttl_exp tenure not_smsa south
* And the square terms of age ttl_exp tenure are also included
gen age2 = age^2
gen ttl_exp2 = ttl_exp^2
gen tenure2 = tenure^2
* between-effects model
xtreg ln_wage grade race age age2 ttl_exp ttl_exp2 tenure tenure2 not_smsa south, be wls
* fixed-effects model
xtreg ln_wage grade race age age2 ttl_exp ttl_exp2 tenure tenure2 not_smsa south, fe
* GLS Random-effects model
xtreg ln_wage grade race age age2 ttl_exp ttl_exp2 tenure tenure2 not_smsa south, re sa theta
estimates store est1
xtreg ln_wage grade race age ttl_exp tenure not_smsa south, re sa theta
estimates store est2
hausman est1 est2
** instrumental variable and 2SLS estimation of the data
* GLS Random-effects model
xtivreg ln_wage age* not_smsa race (tenure = union south race), re theta first
xtivreg ln_wage age* not_smsa race (tenure = union south race), ec2sls theta small
3. Panel Data Models (II)
* **********************************************
* Summary Note by Jing Li and Junsoo Lee
* panel_2.do
*
* Commands: xtdata (II), xtcloglog, xtgee,
* xtlogit, xtprobit, xtsum & xttab,
* xttobit, xtpcse, xtregar, xtintreg,
* xtrchh, xtfrontier, xthtaylor
* September 2003
* ***********************************************
clear
set mem 200m
cd "C:\UpCD1\WORK\Stata\"
log using panel_2.log, replace
set more off
set matsize 800
*********************************
* xtdata *
*********************************
use xtdatasmpl.dta,clear
* use http://www.stata-press.com/data/r8/xtdatasmpl, clear
** 1. use "xtdata" to convert the data into a form suitable for between estimation
xtdata ln_w grade age* ttl_exp* tenure* black not_smsa south, be clear
regress ln_w grade age* ttl_exp* tenure* black not_smsa south
* compare the above results to those from using "xtreg, be"
xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, be
** use "xtdata" to convert the data into a form suitable for fixed-effects (within) estimation
use xtdatasmpl.dta,clear
xtdata ln_w grade age* ttl_exp* tenure* black not_smsa south, fe i(idcode) clear
regress ln_w grade age* ttl_exp* tenure* black not_smsa south
* compare the above results to those from using "xtreg, fe"
xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, fe i(idcode)
** use "xtdata" to convert the data into a form suitable for random-effects estimation
use xtdatasmpl.dta,clear
** ratio is specified to be 1; this is for specification-search purposes only
xtdata ln_w grade age* ttl_exp* tenure* black not_smsa south, re ratio(1) clear
regress ln_w grade age* ttl_exp* tenure* black not_smsa south constant, nocons
* compare the above results to those from using "xtreg, re"
xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, re
** note: every time before using "xtdata", you have to reload the original data.
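** A sketch of an alternative to reloading, not used in the original note:
** preserve keeps a copy of the data and restore brings it back after xtdata
** has transformed the data in memory.
use xtdatasmpl.dta, clear
preserve
xtdata ln_w grade age* ttl_exp* tenure* black not_smsa south, fe i(idcode) clear
regress ln_w grade age* ttl_exp* tenure* black not_smsa south
restore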
*********************************
* xtcloglog *
*********************************
** 2. try the command 'xtcloglog'
*webuse union.dta,clear
*save union.dta
use union.dta, clear
* use http://www.stata-press.com/data/r8/union, clear
iis idcode
tis year
* There is no FE version of this model; a conditional likelihood function cannot be defined.
** random-effects model
xtcloglog union age grade not_smsa south southXt, re
xtcloglog union age grade not_smsa south southXt, re quad(20)
** population-averaged model (xtgee)
xtcloglog union age grade not_smsa south southXt, pa
** population-averaged model with robust variance, clustering on 'i'
xtcloglog union age grade not_smsa south southXt, pa i(idcode) robust
** population-averaged model with 'xtgee' options
xtcloglog union age grade not_smsa south southXt, pa corr(exchangeable)
*********************************
* xtgee *
*********************************
* Population-averaged model (generalized linear model fitted by Generalized Estimating Equations (GEEs))
* g[E(y(it))] = X(it)*b with y ~ specific dist.
*
* e.g., if logit[E(y(it))] = X(it)*b with y ~ Bernoulli, it's a logit model.
* Then, use link(logit), family(binomial)
*
* There is no convenient likelihood function. (Need to read more references.)
*
* This procedure allows one to specify the within-group correlation structure for the panels.
* default: equal-correlation, corr(exchangeable)
* corr(ar1) can be estimated. No option for psar1.
* "xtcorr" gives the within-group correlations.
* Note: xtgls can allow for cross-sectional correlation across panels, but this option is not
* available in xtgee. Conversely, xtgls does not allow for within-group correlation (except
* for autocorrelation with ar1 or psar1), but xtgee can allow for it.
* Special cases (with balanced panels): Try these. I have not compared them yet.
*
* xtgee, corr(independent) link(cloglog) => cloglog or xtcloglog
* xtgee, corr(independent) link(probit) => probit (but std errors are different)
* If the binomial denominator is not 1, it's bprobit.
* Further note: blogit and bprobit produce maximum-likelihood logit and probit estimates on grouped ("blocked") data;
* glogit and gprobit produce weighted least-squares estimates.
* xtgee with negative binomial (nbinomial) produces estimates conditional on alpha (correlation).
* nbreg gives unconditional estimates.
* xtgee with corr(independent) fits exponential regression (as in survival models) but
* not with censored data.
*
* xtgee, fam(gauss) link(iden) corr(exch) => xtreg, re or xtreg, mle
use union.dta,clear
* use http://www.stata-press.com/data/r8/union, clear
iis idcode
tis year
xtgee union age grade not_smsa south southXt, family(gamma) link(log) corr(exchangeable) robust
xtgee union age grade not_smsa south southXt, family(poisson) link(log) corr(unstructured)
xtgee union age grade not_smsa south southXt, family(poisson) link(identity) corr(unstructured)
use nlswork2.dta, clear
* use http://www.stata-press.com/data/r8/nlswork2, clear
*webuse nlswork2.dta,clear
*save nlswork2.dta
iis idcode
tis year
gen age2 = age*age
gen ttl_exp2 = ttl_exp*ttl_exp
gen tenure2 = tenure^2
** compare the results from 'regress' and 'xtgee' (using OLS)
regress ln_w grade age* ttl_exp* tenure*
xtgee ln_w grade age* ttl_exp* tenure*, corr(indep) nmp
xtgee ln_w grade age* ttl_exp* tenure*, corr(ar1) nmp
xtgee ln_w grade age* ttl_exp* tenure*, fam(gamm) corr(indep) nmp
xtgee ln_w grade age* ttl_exp* tenure*, fam(gamm) corr(ar2)
xtgee ln_w grade age* ttl_exp* tenure*, fam(poisson) link(log) corr(unstructured)
xtgee ln_w grade age* ttl_exp* tenure*, fam(poisson) link(log) corr(stationary 2)
use airacc.dta, clear
* use http://www.stata-press.com/data/r8/airacc, clear
*webuse airacc.dta,clear
*save airacc.dta
iis airline
tis time
gen lnpm = ln(pmiles)
xtgee i_cnt inprog, family(poisson) eform offset(lnpm)
xtgee i_cnt inprog, family(gauss) corr(exchangeable) eform offset(lnpm)
xtgee i_cnt inprog, family(binomial) link(identity) corr(independent) eform offset(lnpm)
xtgee i_cnt inprog, family(igaussian) link(log) corr(unstructured)
** xtgee i_cnt inprog, family(binomial) link(logit) corr(exchangeable)
** (the line above does not work; error message: estimates diverging (missing predictions))
xtgee i_cnt inprog, family(gamma) link(reciprocal) corr(independent)
xtgee i_cnt inprog, family(gauss) link(identity) corr(independent) rgf trace robust score(newscore1)
xtgee i_cnt inprog, family(gauss) link(power) robust
xtgee i_cnt inprog, family(gauss) link(power) t(time) corr(stationary 2) robust
*********************************
* xtlogit *
*********************************
use union.dta, clear
* use http://www.stata-press.com/data/r8/union, clear
iis idcode
tis year
** random-effects model
xtlogit union age grade not_smsa south southXt, re
quadchk
* # of points to use in the quadrature approximation of the integral (this checkup is important.)
xtlogit union age grade not_smsa south southXt, re offset(age)
* the coeff of age = 1 (restricted)
** conditional fixed-effects model
xtlogit union age grade not_smsa south southXt, fe nolog
xtlogit union age grade not_smsa south southXt, fe noskip
xtlogit union age grade not_smsa south southXt, fe offset(grade) nolog
** population-averaged model
xtlogit union age grade not_smsa south southXt, pa eform
xtlogit union age grade not_smsa south southXt, pa robust
* Huber & White sandwich estimator of variance
xtlogit union age grade not_smsa south southXt, pa offset(grade) eform
xtlogit union age grade not_smsa south southXt, pa offset(grade) robust
xtlogit union age grade not_smsa south southXt, pa nolog or robust
/* "or" the estimated coefficients are transformed to odds ratios:
i.e., exp(b) is reported. */
xtlogit union age grade not_smsa south southXt, pa nolog robust
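** A short sketch, not part of the original note: the odds ratio that the
** "or" option reports for age can be recovered from the coefficient as exp(b).
xtlogit union age grade not_smsa south southXt, pa nolog
display exp(_b[age])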
** compare the results to 'xtgee'
xtgee union age grade not_smsa south southXt, nolog robust family(binomial) link(logit) corr(exchangeable)
*********************************
* xtprobit *
*********************************
* There is no FE model for this. One may be tempted to use probit with dummy variables,
* but the resulting estimator is biased.
use union.dta,clear
* use http://www.stata-press.com/data/r8/union, clear
iis idcode
tis year
** random-effects model
xtprobit union age grade not_smsa south southXt, re nolog
quadchk
* # of points to use in the quadrature approximation of the integral (this checkup is important.)
xtprobit union age grade not_smsa south southXt in 1/25000, re offset(age)
xtprobit union age grade not_smsa south southXt, re offset(grade) nolog
** population-averaged model
xtprobit union age grade not_smsa south southXt, pa
xtprobit union age grade not_smsa south southXt, pa eform
xtprobit union age grade not_smsa south southXt, pa robust
xtprobit union age grade not_smsa south southXt, pa robust nolog /*
first use 'xtprobit' */
** compare the results to 'xtgee'
xtgee union age grade not_smsa south southXt, family(binomial) link(probit) corr(exchangeable) robust nolog
*webuse chicken.dta,clear
*save chicken.dta
use chicken.dta,clear
* use http://www.stata-press.com/data/r8/chicken, clear
iis person
** random-effects model
xtprobit complain age grade south tenure gender race income genderm burger chicken, re nolog
xtprobit complain age grade south tenure gender race income genderm burger chicken, re
** population-averaged model
xtprobit complain age grade south tenure gender race income genderm burger chicken, pa
xtprobit complain age grade south tenure gender race income genderm burger chicken, pa eform
xtprobit complain age grade south tenure gender race income genderm burger chicken, pa robust
****************************
* xtsum & xttab *
****************************
use nlswork.dta, clear
iis idcode
tis year
xtsum age grade ttl_exp hours ln_wage
xttab union
xttrans union
****************************
* xttobit *
****************************
* Again, no FE version in Stata, as there is no conditional likelihood function.
* Honore (1992)'s semi-parametric FE tobit version can be considered, but
* unconditional tobit FE with dummies is biased.
* ll (lower limit) and ul (upper limit)
* option "tobit" reports the LR stat. versus pooling tobit.
*webuse nlswork.dta,clear
*save nlswork.dta
use nlswork.dta, clear
* use http://www.stata-press.com/data/r8/nlswork, clear
iis idcode
tis year
** random-effects model (ln_wage is censored from above at 1.9)
xttobit ln_wage union age grade not_smsa south occ_code, ul(1.9) tobit
quadchk, nooutput
** random-effects model (ln_wage is censored from below at 0.9 and from above at 1.9)
xttobit ln_wage union age grade not_smsa south occ_code, ll(0.9) ul(1.9) tobit
quadchk, nooutput
** random-effects model (the number of quadrature points for the integral is at its max, i.e. 30)
xttobit ln_wage union age grade not_smsa south occ_code, ll(0.9) ul(1.9) quad(30) tobit
** random-effects model (the coefficient of tenure constrained to be 1)
xttobit ln_wage union age grade tenure ttl_exp race not_smsa south occ_code, ll(0.4) offset(tenure) tobit
** random-effects model (the coefficient of ttl_exp constrained to be 1)
xttobit ln_wage union age grade tenure ttl_exp race not_smsa south occ_code, ul(1.6) offset(ttl_exp) tobit
****************************
* xtpcse *
****************************
* Alternative to xtgls
* Panel-corrected std errors (PCSE) when OLS or Prais-Winsten regression is used.
* The disturbances are assumed to be heteroskedastic and contemporaneously correlated across panels.
* Also, options include corr(indep), corr(ar1), and corr(psar1), which has panel-specific ar(1) errors.
* Consistent as T goes to infinity.
* Again, this does not include within-group correlations; for this, use xtgee (consistent as N goes to infinity).
use grunfeld.dta,clear
* use http://www.stata-press.com/data/r8/grunfeld, clear
tsset company year, yearly
xtpcse invest mvalue kstock
xtpcse invest mvalue kstock, correlation(ar1)
xtpcse invest mvalue kstock, correlation(psar1) rhotype(tscorr) detail
****************************
* xtregar *
****************************
* FE and RE models with AR(1) error (common rho only). tsset is needed due to T asymptotics.
* Can accommodate unbalanced panels.
* "lbi" option reports the LBI statistic for rho = 0.
use grunfeld.dta,clear
* use http://www.stata-press.com/data/r8/grunfeld, clear
tsset company year, yearly
** fixed-effects with an AR(1) disturbance
xtregar invest mvalue kstock, fe rhotype(tscorr)
xtregar invest mvalue kstock, fe rhotype(tscorr) twostep
xtregar invest mvalue kstock if year != 1943 & year != 1944, fe lbi
** random-effects with an AR(1) disturbance
xtregar invest mvalue kstock, re rhotype(tscorr)
xtregar invest mvalue kstock if year != 1943 & year != 1944, re lbi
****************************
* xtintreg *
****************************
* RE models for interval data panels (no FE version)
* needs Dep(lower) and Dep(upper), and the RE version requires a quadchk checkup.
* Prediction can be given over intervals.
* pr0(a,b) computes P(a < y < b); e.g., pr0(20,30) computes P(20 < y < 30).
* option "intreg" reports the LR test versus the pooled intreg model.
*webuse nlswork3.dta,clear
*save nlswork3.dta
use nlswork3.dta, clear
* use http://www.stata-press.com/data/r8/nlswork3, clear
iis idcode
tis year
** random-effects interval data regression model
xtintreg ln_wage1 ln_wage2 union age grade not_smsa south southXt occ_code, noskip intreg
predict new_var, pr0(1,3)
predict new_var1, pr0(0,3)
** the coefficient of age constrained to be 1
*xtintreg ln_wage1 ln_wage2 union age grade not_smsa south southXt occ_code, quad(25) offset(age) intreg
*xtintreg ln_wage1 ln_wage2 union age grade tenure ttl_exp not_smsa south, offset(tenure) intreg
*xtintreg ln_wage1 ln_wage2 union age grade not_smsa south southXt occ_code, offset(grade) quad(20) intreg
****************************
* xtpoisson *
****************************
* Three models: FE, RE and GEE
* Note that there is no prediction for the FE model: conditional likelihood function.
* "irr" reports exp(b), which implies incidence-rate ratios
*webuse ships.dta,clear
*save ships.dta
use ships.dta,clear
* use http://www.stata-press.com/data/r8/ships, clear
** random-effects model
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, re i(ship)
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, re i(ship) irr
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, re i(ship) exposure(service) irr
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, re i(ship) ex(service) irr normal nolog
* RE has a normal distribution, rather than a gamma dist.
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, re i(ship) ex(service) irr normal quad(25)
** conditional fixed-effects model
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, fe i(ship)
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, fe i(ship) ex(service)
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, fe i(ship) ex(service) irr
** population-averaged model ('eform' is an xtgee option)
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, pa i(ship) ex(service) robust
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, pa i(ship) ex(service) eform
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79, pa i(ship) ex(service) robust eform
****************************
* xtnbreg *
****************************
* Negative binomial models (RE, GEE and FE versions)
* Again, no prediction for the FE version
use airacc.dta,clear
* use http://www.stata-press.com/data/r8/airacc, clear
iis airline
tis time
** random-effects model
xtnbreg i_cnt inprog, re exposure(pmiles) irr
predict new_var2
** conditional fixed-effects model
xtnbreg i_cnt inprog, fe exposure(pmiles) irr
predict new_var3
** population-averaged model ('eform' is an xtgee option)
xtnbreg i_cnt inprog, pa exposure(pmiles) robust eform
****************************
* xtrchh *
****************************
* Hildreth-Houck random coefficient model
use invest2.dta,clear
* use http://www.stata-press.com/data/r8/invest2, clear
* Check the data for possible random coefficients
reshape wide invest market stock, i(time) j(company)
sureg (invest1 market1 stock1) (invest2 market2 stock2) (invest3 market3 stock3) (invest4 market4 stock4) (invest5 market5 stock5)
use invest2.dta,clear
xtrchh invest market stock, i(company) t(time)
predict new4, xb
****************************
* xtfrontier *
****************************
* Frontier models
* Battese-Coelli (1992) parameterization of time effects multiplied by the inefficiency term.
*webuse xtfrontier1.dta,clear
*save xtfrontier1.dta
use xtfrontier1.dta,clear
* use http://www.stata-press.com/data/r8/xtfrontier1, clear
** time-invariant model
xtfrontier lnwidgets lnmachines lnworkers, ti i(id)
xtfrontier lnwidgets machines workers, ti i(id) nodifficult
** time-invariant model in terms of a cost function
xtfrontier lnwidgets lnmachines lnworkers, ti i(id) cost
xtfrontier lnwidgets machines workers, ti i(id) nodifficult cost
** time-invariant model with constraint
constraint define 1 _b[lnmachines] + _b[lnworkers] = 1
xtfrontier lnwidgets lnmachines lnworkers, ti i(id) constraint(1)
xtfrontier lnwidgets lnmachines lnworkers, ti i(id) constraint(1) cost
constraint define 2 _b[lnmachines] = _b[lnworkers]
xtfrontier lnwidgets lnmachines lnworkers, ti i(id) constraint(2)
xtfrontier lnwidgets lnmachines lnworkers, ti i(id) constraint(2) cost
** time-varying decay model
xtfrontier lnwidgets lnmachines lnworkers, tvd i(id) t(t)
** time-varying decay model in terms of a cost function
xtfrontier lnwidgets lnmachines lnworkers, tvd i(id) t(t) cost
xtfrontier lnwidgets machines workers, tvd i(id) t(t) cost
** time-varying decay model with constraint
constraint define 3 _b[lnmachines] = 2* _b[lnworkers]
xtfrontier lnwidgets lnmachines lnworkers, tvd i(id) t(t) constraint(3)
xtfrontier lnwidgets lnmachines lnworkers, tvd i(id) t(t) constraint(3) cost
****************************
* xthtaylor *
****************************
*webuse xthtaylor1.dta,clear
*save xthtaylor1.dta
* use http://www.stata-press.com/data/r8/xthtaylor1, clear
use xthtaylor1.dta, clear
** Hausman-Taylor estimator with only endogenous variables
correlate ui z1 z2 x1a x1b x2 eit
xthtaylor yit x1a x1b x2 z1 z2, endog(x2 z2) i(id)
xthtaylor yit x1a x1b x2 z1 z2, endog(x2 z2) i(id) t(t) amacurdy
xthtaylor yit x1a x1b x2 z1 z2, endog(x2 z2) i(id) t(t) small
xthtaylor yit x1a x1b x2 z1 z2, endog(x2 z2) i(id) t(t) amacurdy small
** Hausman-Taylor estimator with constant variables
xthtaylor yit x1a x1b x2 z1 z2 ui, endog(x2 z2) constant(z1 z2 ui) i(id)
xthtaylor yit x1a x1b x2 z1 z2 ui, endog(x2 z2) constant(z1 z2 ui) i(id) t(t) amacurdy
xthtaylor yit x1a x1b x2 z1 z2 ui, endog(x2 z2) constant(z1 z2 ui) i(id) t(t) small
** Hausman-Taylor estimator with varying variables
xthtaylor yit x1a x1b x2 z1 z2 ui, endog(x2 z2) varying(x2 x1a x1b) i(id)
xthtaylor yit x1a x1b x2 z1 z2 ui, endog(x2 z2) varying(x2 x1a x1b) i(id) t(t) amacurdy
xthtaylor yit x1a x1b x2 z1 z2 ui, endog(x2 z2) varying(x2 x1a x1b) i(id) t(t) small
*webuse psidextract.dta,clear
*save psidextract.dta
* use http://www.stata-press.com/data/r8/psidextract, clear
use psidextract.dta,clear
iis id
tis t
xtsum exp exp2 wks ms union, i(id)
** Hausman-Taylor estimator with only endogenous variables
correlate fem blk occ south smsa ind ed
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed)
xthtaylor lwage occ south smsa ind exp* wks ms union fem blk ed, endog(exp exp2 wks ms union ed) amacurdy
xthtaylor lwage occ south smsa ind exp* wks ms union fem blk ed, endog(exp exp2 wks ms union ed) small
xthtaylor lwage occ south smsa ind exp* wks ms union fem blk ed, endog(exp exp2 wks ms union ed) amacurdy small
** Hausman-Taylor estimator with constant variables
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed) /*
*/ constant(fem blk ed)
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed) /*
*/ constant(fem blk ed) amacurdy
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed) /*
*/ constant(fem blk ed) small
** Hausman-Taylor estimator with varying variables
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed) /*
*/ varying(ms exp* occ south smsa ind wks union)
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed) /*