
Stata

E-mail: skolenikov@cefir.ru

2000-2003
c . .

In theory, theory and practice are the same. In practice, they are not.
,  ,
.

(
/ ...)


 , 2000 .
()


( http://www.nes.ru/english/outreach/outreach.htm ).
6- Stata  .

(, ), 2003 . Stata 7 Stata 8. ,
Special Edition, .
, .

. , , ,
, , . ,
.
:
,
1 .
Stata (StataCorp.
1 , ,
(1997); , (1998).

1999, 2001, Kolenikov 2001).


. , ,
.
,
(RLMS). , , .
,
(- : http://www.cpc.unc.edu/rlms/ ).
Stata

, Stata,
, . , ,
, ,
.
Stata .
Help/Search Help/Command whelp
, , whelp regress . , ,
 .
Stata , (. 3.1).

 ( ),
, , , ,
. ,
, , ,
, , (
, , , ,
), .
. 2
, ,
. 3 Stata , . 4
RLMS  . 5
. , , 6 ,
. .
. , , . 3, , , .

 2.32.4, . 47, ,
, . 2.62.8, ,
. , , , .
, Stata.
, , ,  , 3.
, , ,
,
.
3.1.
 Stata ( 3.33.6). Stata (
, , , ,
). ( ), , 3.9 (. 96). 3.20 Stata.
-, ,
, , RLMS 
.
,
, , . :
,
, ,
, , , ,
, .
, , : , ,
,
,
; ,

- ,
; , - , ; ,
,
, ;
LATEX; ; Stata Corporation ;
, Paragon .
   
, N HBC 807, 808.
, , , , , ( ), 1999-2003.
E-mail: skolenik@cefir.ru, skolenik@unc.edu

2

2.1

,
,
.

 , , . .
, ,
1 :
1. , .
(datadriven research). ( .
) ,
()
, , . . 
.
2. ,
1
, (1998, . 10.)

1990- . data mining (


 ,
).
, , , - .
(patterns).
. Data mining
,
.
3. . , .
,  , , ,
,
. .
( , () , ), . . .
(. . ) , , ; , ,
, , (proxy)
(,
,  , , , .. 
, ). ,
, , , .
,  , , , 


, , ..
4. . ,
- .
, ,
. ,  , ,
: , ,
, , , ,
..
(ARMA ).
()
.
, ,
, .
 publication bias (
), , ,
, ,
. - 
, ,
.
5. . () ,
.
; ,
. (, ,
) ,
(goodness of fit), (), ,
(.. ,

).
 ,
goodness of fit
(cross-validation).
 . : . Data mining
. ,
.
(out of sample prediction).
, , .. ,  .
 , .
10%, 5% 1%. (,
1973): " [ ]
".
( p-value)  () ( ,
) ,
. , H0
, .
, , ()
.


2.2

2.2.1

, 
,
(. 1984):
$$E[y \mid x] = f(x). \tag{2.1}$$

, (, )
() .
,
. :

$$y_i = x_i^T \beta + \varepsilon_i, \qquad i = 1, \dots, n \tag{2.2}$$

yi  , xi  , xi IRp , 
, i  , i  n  .
, (2.2) :
$$y = X\beta + \varepsilon, \tag{2.3}$$

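where $y = (y_1, \dots, y_n)^T$, $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n)^T$, and $X$ is the matrix with rows $x_i^T$, $i = 1, \dots, n$, and columns $X_j$, $j = 1, \dots, p$:

$$X = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1p} \\ x_{21} & x_{22} & \dots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \dots & x_{np} \end{pmatrix} = (X_1, X_2, \dots, X_p), \qquad \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix} \tag{2.4}$$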

, xi1 = 1, 1  , .

, (2.2) ( (2.3)), ( ) :
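$$E\,\varepsilon_i = 0 \tag{2.5}$$

$$E\,\varepsilon_i^2 = \sigma^2 \tag{2.6}$$

$$E\,\varepsilon_i \varepsilon_j = 0, \quad i \ne j \tag{2.7}$$

$$\operatorname{rk} X = p < n \tag{2.8}$$

$$X_j \ \text{are non-random} \tag{2.9}$$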

:
$$\varepsilon_i \sim N(0, \sigma^2) \tag{2.10}$$
2.2.2

(, , , )
:

$$\hat\beta = \arg\min_{\beta} \sum_{i=1}^{N} \bigl(y_i - x_i^T \beta\bigr)^2 \tag{2.11}$$


(. OLS, ordinary least squares),

$$\hat\beta = (X^T X)^{-1} X^T y \tag{2.12}$$

Fitted values: $\hat y_i = x_i^T \hat\beta$; residuals: $e_i = y_i - \hat y_i$, $i = 1, \dots, n$.


Stata

Stata, ,
regress . regress
(. 2.4.3),
, , . ., predict  , ,  , : predict . . . , residuals , predict, . . .
xb  y . . regress
- tutorial regress .
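The commands named in this note can be strung together as follows. This is a minimal sketch rather than the author's own example: it assumes the `auto` dataset shipped with Stata can be loaded with `sysuse` (available from Stata 8 on), and the variable names `yhat` and `ehat` are chosen here purely for illustration.

```stata
* Sketch: OLS fit, fitted values and residuals (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight
predict double yhat, xb          // fitted values x_i'b
predict double ehat, residuals   // residuals e_i = y_i - yhat_i
summarize yhat ehat
```

Because the model contains a constant, the residuals average to zero up to rounding, which is a quick sanity check on the `predict` calls.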


-:

-
(2.2)(2.9), (2.10).
2.1 (, )

, - 2 ,

$$\operatorname{Var} \hat\beta = \sigma^2 (X^T X)^{-1} \tag{2.13}$$

:
$$s^2 = \frac{1}{n-p} \sum_{i=1}^{n} e_i^2, \tag{2.14}$$

$$\widehat{\operatorname{Var}}\, \hat\beta = s^2 (X^T X)^{-1} \tag{2.15}$$

(
3  ,
. , -
, (2.10).
, .
2.2.3

.
,

.
, ..

$$H_0: C\beta = r \quad \text{vs.} \quad H_a: C\beta \ne r, \tag{2.16}$$

2 :

A > B,

(A B)

3 , -,


...

$C$ is a $q \times p$ matrix ($\operatorname{rk} C = q < p$), $r$ is a $q \times 1$ vector.


, H0 q .
H0 : 2 = . . . = p = 0, ,
(.. ,  , y = y).
C = Ip1 , r = 0, q = p 1.
F -:

$$F = \frac{(SSE_R - SSE_U)/q}{SSE_U/(n-p)} = \frac{(C\hat\beta - r)^T \bigl(C (X^T X)^{-1} C^T\bigr)^{-1} (C\hat\beta - r)/q}{SSE_U/(n-p)}, \tag{2.17}$$

SSER =sum of squared errors of the restricted model 


(.. , H0 ), SSEU =sum of squared errors
of the unrestricted model  .
F - () F (q, n p).
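To test $H_0: \beta_k = \beta_k^{(0)}$ vs. $H_a: \beta_k \ne \beta_k^{(0)}$, the t-statistic (see footnote 4) is used:

$$t_k = \frac{\hat\beta_k - \beta_k^{(0)}}{\bigl(\widehat{\operatorname{Var}}(\hat\beta_k)\bigr)^{1/2}} \sim t(n-p) \mid H_0, \tag{2.18}$$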

H0 n p , d k )  (2.15).
Var(

, H0 ,
F - t- .
H0 . ,
( , ):

$$H_0: \beta_k = 0 \quad \text{vs.} \quad H_a: \beta_k \ne 0, \tag{2.19}$$

() (  observed significance, p-value)


i
h

P |k | > |k | H0 .
(2.20)
(, 10%) ,
,
4 t-

F -


$t^2(n-p) = F(1, n-p)$

, , H0
. , 1% , , ,
.
Stata

2.3

Stata test ,
( regress ; . 3.9).
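A short illustration of `test` after `regress`, covering a single-coefficient hypothesis, a joint hypothesis and a hypothesis with a non-zero right-hand side as in (2.16). This is a sketch under the assumption that the shipped `auto` dataset is available through `sysuse`; the particular hypotheses are chosen only for illustration.

```stata
* Sketch: linear hypothesis tests after OLS (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight
test mpg              // H0: the coefficient on mpg is zero
test mpg weight       // joint H0: both coefficients are zero (q = 2)
test weight = 3       // H0: the coefficient on weight equals 3
```

For a single restriction the reported F statistic is the square of the corresponding t statistic, in line with footnote 4.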

( ) . ,
, , -,
-.
, 2.1.
2.3.1

(2.5), , ,
, , ( ) ( ).
. , (. 2.4.1)
.
2.3.2

(2.9) ,
, , ( ).
, , .. , (2.5)(2.7) x.
,


(2.7) 5 . (2.8)
XT X:
$$\operatorname*{plim}_{n \to \infty} \frac{1}{n} X^T X = M > 0_{p \times p} \tag{2.21}$$
 ,

$$E[\varepsilon \mid x] \ne 0 \tag{2.22}$$

,  (measurement error models), , (simultaneous equations,


. 2.8.1).
, (2.22) - (. . ).
, (. IV, instrumental variables): (), ,
, X ( ). (. 2SLS,
two-stage least squares). IV- ,
. (generalized method of
moments  GMM, (Greene 1997, Matyas 1999), 2
(Neyman, Pearson 1928)) , IV-,
.

, , , ,  . (Hausman 1978).
-, IV-
, ,  , . ( ) -, IV-, ,
5 , .
, (, , ), , .
. , (1998, . 16). ,
, (. 2.3.5)


.
() 2 , /
.
. ,

(. 2.7.3).
Stata

Stata, ,
ivreg . hausman , , ,
(hausman, save ), , ,
(hausman ). Stata 8 , hausman
.

.
, ,
:

$$y_i = x_i^{*T} \beta + \varepsilon_i \tag{2.23}$$

$$x_i = x_i^{*} + \delta_i \tag{2.24}$$

xi , ( yi ) xi . ,
. ,
, i .
2.3.3

(2.6) ( ,  ) (2.7) ( ) ,
- . , ,
 -
, - . , , .. .

, , .

$$\Omega = \operatorname{Var} \varepsilon \tag{2.25}$$

(. GLS, generalized least squares)


:

$$\hat\beta_{GLS} = (X^T \Omega^{-1} X)^{-1} X^T \Omega^{-1} y \tag{2.26}$$

-
.

(2.6)(2.7),
.
2.2 ( (Aitken))

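$$\operatorname{Var} \hat\beta_{GLS} = (X^T \Omega^{-1} X)^{-1}, \tag{2.27}$$

whereas for the OLS estimator

$$\operatorname{Var}(\hat\beta_{OLS}) = (X^T X)^{-1} (X^T \Omega X) (X^T X)^{-1} \ge (X^T \Omega^{-1} X)^{-1} \tag{2.28}$$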

. , , , . , , , 2.3.4, 2.3.5
2.7.
, (2.7) ( (2.6)), ,
. , - (Goldfeld-Quandt) , - (BreuschPagan) 
, , (1997).
Stata

Stata (, Cook-Weisberg) hettest ,


regress :
$$\ln e_i^2 = z^T \gamma + \nu_i, \qquad H_0: \gamma = 0$$

z
. 8- Stata
imtest .
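A sketch of the heteroskedasticity test described in this note. It assumes the shipped `auto` dataset and uses the command name given in the text; in Stata 9 and later the same test is documented as `estat hettest`, while the older name is, to the best of my knowledge, still accepted.

```stata
* Sketch: Cook-Weisberg (Breusch-Pagan) test after OLS (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight
hettest            // default: fitted values play the role of z
hettest weight     // the same test with weight as the variable z
```

A small p-value speaks against $H_0$ of constant error variance.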

,
: N (N21) , N .
:
= (), ()
. , (feasible
generalized least squares) ( ) : (, ,
), ,
d ), .
(, , ()

 ; ()
.

, . (White):
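$$\hat V(\hat\beta) = \Bigl(\frac{1}{n} \sum_{i=1}^{n} x_i x_i^T\Bigr)^{-1} \Bigl(\frac{1}{n} \sum_{i=1}^{n} e_i^2\, x_i x_i^T\Bigr) \Bigl(\frac{1}{n} \sum_{i=1}^{n} x_i x_i^T\Bigr)^{-1} \tag{2.29}$$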


- (sandwich estimator), .
(Huber), 1960-.
;
.
Stata

Stata , ,
robust regress . , Stata ( , )  regress


[weight=exp] , . Stata (. help weights );


aweight  . ,
,  vwls .
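The effect of the `robust` option can be seen by running the same regression twice. A minimal sketch, assuming the shipped `auto` dataset: the point estimates are identical in both runs, and only the standard errors (and hence t statistics and confidence intervals) change.

```stata
* Sketch: conventional versus sandwich standard errors (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight            // conventional OLS standard errors
regress price mpg foreign weight, robust    // White/Huber sandwich standard errors, cf. (2.29)
```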

2.3.4

,
( ).
.
Stata

Stata 6 ( ts), ..
( ) L., D.,
S.. time .

( ) - (Durbin-Watson),

$$D = \frac{\sum_{i=2}^{N} (e_i - e_{i-1})^2}{\sum_{i=1}^{N} e_i^2} \tag{2.30}$$

, - , 2. , 0 4, . , ,

. - , (1998).
.
Stata

Stata - dwstat ,
regress .

, , .


(Newey, West 1987):

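$$\widehat{\operatorname{Var}}(\hat\beta) = \Bigl(\frac{1}{n} \sum_{i} x_i x_i^T\Bigr)^{-1} \Biggl[\sum_{l=-k}^{k} \Bigl(1 - \frac{|l|}{k+1}\Bigr) \frac{1}{n} \sum_{i} e_i e_{i-l}\, x_{i-l} x_i^T\Biggr] \Bigl(\frac{1}{n} \sum_{i} x_i x_i^T\Bigr)^{-1}, \tag{2.31}$$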

, xi , i- .
,
k . ,
. k = 0
- (2.29).
Stata

2.3.5

Stata newey . , , tsset ,


newey, t() , .
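A sketch of the time-series workflow mentioned in the last two notes: declare the time variable with `tsset`, check the Durbin-Watson statistic, then re-estimate with Newey-West standard errors. The data are simulated here, so the numbers carry no meaning; the variable names, the seed and the lag length of 2 are arbitrary assumptions. In Stata 9 and later the Durbin-Watson statistic is documented as `estat dwatson`.

```stata
* Sketch: Durbin-Watson statistic and Newey-West standard errors on simulated data
clear
set obs 100
set seed 1
generate t = _n
tsset t                                   // declare the time variable
generate double x = invnorm(uniform())
generate double y = 1 + 2*x + invnorm(uniform())
regress y x
dwstat                                    // statistic (2.30); close to 2 without autocorrelation
newey y x, lag(2)                         // estimator (2.31) with k = 2
```

With `lag(0)` the Newey-West estimator reduces to the White estimator (2.29), as stated in the text.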

,
(. 2.7),
, ( .. RLMS, . 4).
. ( , , , ,
; , ,
)  (, RLMS ;
). , , (  ,
, probability proportional to size, PPS sampling), (primary sampling units  PSU).
PSU
( RLMS  , ,
), ,
 , ..
, ,  
(, ).

, , , . ,
PSU , (
) ,
PSU , , , PSU . ,
,
,
:

yit = xTit + P SU + . . . + ui + it

(2.32)

, . , ()
,
.  
(2.29). (linearization estimator), ,

-.
Stata

2.3.6

Stata ,
 svy . ,
( svyset svydes ).

(. help weights ), (.. , )  pweight (. probability weights) 
. svy - , cluster() ,
Stata, , .. regress . .

(2.8) , ..
. , .

, , .
, 6 (
0/1-, , ,
, ).
Stata

( ), Stata
, , , .  ,
 , ,
( -
). Stata  xi.
,  
areg , a absorb, .. .
F-. , 
anova (. help anova , tutorial anova ), anova . . . , continuous .

. XT X
. 
, (, ) .
 , , , ,
X.
Stata

8- Stata pca .
, ,
, Stata
factor . . . , pc , pc ,
(principal components). matrix svd , . singular value decomposition.

6 .


 .
max /min 
XT X,  (condition number).
: 10 100 , 1000 ( ) 
.
, , (. variance inflation factor,
VIF; . Fox (1997), Smith and Young (2001)),

$$\operatorname{VIF}(\hat\beta_j) = \frac{1}{1 - R_j^2}, \tag{2.33}$$

Rj2  Xj X (
Xj j - , .. j - X).
:

$$\operatorname{Var} \hat\beta_j = \frac{1}{1 - R_j^2} \cdot \frac{\sigma^2}{(n-1) \operatorname{Var} X_j} \tag{2.34}$$

, , 7 . VIF
4 , Rj2 ' 0.75.
Stata

vif , regress .
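The definition (2.33) can be checked against the output of `vif` by running the auxiliary regression by hand. A minimal sketch, assuming the shipped `auto` dataset; the choice of `weight` as the regressor under scrutiny is arbitrary.

```stata
* Sketch: variance inflation factors, built-in and by hand (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight
vif
* VIF of weight from its auxiliary regression on the other regressors
regress weight mpg foreign
display "VIF(weight) = " 1/(1 - e(r2))
```

The number displayed on the last line should match the `weight` row of the `vif` table.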

, 0/1, 7 , ,

VIF

, : VIF , 
!


8 : 9 ,
, . ,
.
, - ,
, (,
70,  5,
60 80), . , ,
: Var Xj2 (2.34) , .
,
j .
.. Xj Xj = Xj X
, , , ( -). 
(
, . . ;
2.4.1)
( ,
).
. , , , :

2 + Var()

= arg min E( )2 = ( )
B

(2.35)

B  , y .
8 , , :  , ,
. -,  . , , , 
() . -, -

t-

9 , ,


1/2.

XT X , 
  , Ip ,
Ip  p. :
$$\hat\beta_{ridge} = \bigl(X^T X + \lambda I_p\bigr)^{-1} X^T y \tag{2.36}$$
- ( . ridge  ;  . ,
, , , ; . (1981)).
shrinkage estimator, ,
-  .
, . ,
, (2.35) , , .. ,
.
Stata

2.3.7

- rxridge , Stata, STB-28.


Stata, 6- Stata. , webseek
rxridge .

, , 
(, , (2.10)). , , (  )
, ?
, ,
. , : ,
, ,
.
,  ,
,  , , , ..

(, ,
) .
, 
, --
(signrank ranksum) t- .
(1984),
() (, , ), .
, (. influence function influence curve) 
.
,
(, )
.
, . , ,
,
, . ,
- y : - i- yi ,
.
, , -,

$$\sum_{i=1}^{N} \rho(z_i; \beta) \to \min_{\beta}, \tag{2.37}$$

() , z 2
10 . , , (z, ) = |z|.
,
.
 (Huber)

$$\rho^{\text{Huber}}_c(z) = \begin{cases} z^2/2, & |z| < c \\ c|z| - c^2/2, & |z| \ge c \end{cases} \tag{2.38}$$
10


$z = y - x^T \beta$.

c > 0 , :
c , ; , , c 0,
.
(),  (Tukey):




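$$\rho^{\text{biweight}}_c(z) = \begin{cases} \dfrac{c^2}{6} \Bigl[1 - \bigl(1 - (z/c)^2\bigr)^3\Bigr], & |z| < c \\[1ex] \dfrac{c^2}{6}, & |z| \ge c \end{cases} \tag{2.39}$$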
c  . c
.
Stata

rreg  
Stata. ,
 .
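To see what robust regression changes in practice, the OLS and `rreg` fits can be put side by side. A minimal sketch, assuming the shipped `auto` dataset; the model is the one used elsewhere in the text.

```stata
* Sketch: OLS versus robust regression (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight     // ordinary least squares
rreg price mpg foreign weight        // iteratively reweighted robust fit
```

Large differences between the two sets of coefficients suggest that a few observations drive the OLS result.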

, , , - .
, : , H0 : i N (0, 2 ). , ,
.
, ,
2.4.3.
2.3.8


/ . - (Box-Cox):
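$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda\, \dot y^{\lambda - 1}}, & \lambda \ne 0 \\[1ex] \dot y \ln y, & \lambda = 0 \end{cases} \tag{2.40}$$

$\dot y = \bigl(\prod_{i=1}^{n} y_i\bigr)^{1/n}$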
 yi . 11 . ,
- , ,
11 H

, -


, ,
(. 2.4.2)
, , ( ), (
).
$CV = (\operatorname{Var} X)^{1/2} / E X$.
, ( ,
, , ,
, . .). ,
- .
, , , -
( )
, .

.
Stata

- boxcox . boxcox . . . ,
graph . predict . . . , tyhat
boxcox . . . , generate . , , ,

$$y^{(\lambda)} = X\beta + \varepsilon, \tag{2.41}$$

regress .
- boxcox2 ,
STB-54.

2.4

(2.5)(2.9), (2.2) , .


2.4.1

, , ,
, , . , , : , . ,
,
. ,
,
(. . ).
, ,
, . , , - , , , ,
.
, , ,
, , ,
.
Stata

Stata sw
(. stepwise).
sw regress depvar varlist, ,
varlist . , , .

, (goodness of
fit), R2 : , .. 1,
R2 , . ,
- R2 , 0,
, 1, . ,
, :

R2 , , ,

(, , , :
).

R2 : R2 1.
-R2
1, .
R2 (goodness of fit).  (
, Granger causality test (Handbook 1983, 1984, 1986, 1994)).
R2 , , 2
Radj
, :
$$R^2_{adj} = 1 - \frac{e^T e/(n-p)}{\tilde y^T \tilde y/(n-1)}, \tag{2.42}$$

e  , y  () .
, , ,
, , , .
,   ,
(overparametrization), 12 .
(AIC, Akaike information criteria):

$$AIC = -2 \ln L(\hat\theta) + 2p, \tag{2.43}$$

 ( L()
), p  . 
AIC.
12 , , ,
, . , . ., .,
Konishi and Kitagawa (1996).


, (Schwarz Bayesian information criterion, SBIC,


BIC), p ln n, n  :

$$SBIC = -2 \ln L(\hat\theta) + p \ln n, \tag{2.44}$$

,
.
Stata

2.4.2

, Stata , . , , fittest , SSC-IDEAS


2 , (http://ideas.uqam.ca ), R2 , Radj
, ,
. , ,
, ,
web- icomp 13 .

,
E[y|x] . , . , , ,
, , t- F-.
( , ),
1960- . .

$$e_i = \sum_{k=1}^{K} \gamma_k\, \hat y_i^{\,k} + \nu_i, \tag{2.45}$$

yi  -, ei  ,
H0 : = 0.
13 , ; ,
, , ,
.
AIC SBIC  , ,
.
, .
!


Stata

Stata ovtest . Stata ( K = 4)


.

, (, y = a + bx2 + , y =
a sin x + , y = axb e ,   (, ,
) . ,
.
, ..
.

$$y_i = f(x_i, \theta) + \varepsilon_i, \tag{2.46}$$

f ()  ( y = a sin(bx + c) + , y = axb + ).
, (. NLS, non-linear
least squares) , , .
Stata

2.4.3

Stata
nl. , , f () nl.

, - , : ,
, ,
? ,  ,
? ,
: (influential observations), (outliers)  , , .
, (, ), 
,
( 1997 .),

[Scatter plot; legend: + regular points, * outlier]
. 2.1: . : y =
1x+
. . , ( )
(, ) ( ).
,
(, , (. leverage) ), i . ,
, (. 2.4.3),
.
14 .
14 , , ,
.

Draper, Smith (1998), Fox (1997), Smith and Young (2001).


$$\hat y = X \hat\beta = X (X^T X)^{-1} X^T y \equiv H y \tag{2.47}$$

H X yi
y. , $h_{ii} = \sum_{j=1}^{n} h_{ij}^2$,
i- hi hii (. hat value,
). ,
1/n hi 1, p/n, hi  ,
3p/n.
 ,
, , . , , , , , ,
.
i- , 15 :
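$$\tilde e_i = \frac{e_i}{s_e^{(-i)} \sqrt{1 - h_i}}, \tag{2.48}$$

where $s_e^{(-i)}$ is the residual standard error estimated without the $i$-th observation,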

1 hi , Var ei |H0 = (1 hi ) 2 . ei
N p 1 .
t- y = XT + Di + i , Di 
, i- .
C   D-
(. Cook's distance):

$$D_i = \frac{\tilde e_i^{\,2}}{p} \cdot \frac{h_i}{1 - h_i} \tag{2.49}$$

D- ,
- . $D_i > \dfrac{4}{N - p}$.
15 , jack-knife,

. ,
, ( 1988).


k DF BET ASk,i :

$$DFBETAS_{k,i} = \frac{\hat\beta_k - \hat\beta_k^{(-i)}}{\bigl(\widehat{\operatorname{Var}}\, \hat\beta_k^{(-i)}\bigr)^{1/2}}, \tag{2.50}$$

(i) , i- .
,
- t-,
.
, |DF BET Ak,i | > 2/ n p.
, :
$$DFFITS_i = \tilde e_i \sqrt{\frac{h_{ii}}{1 - h_{ii}}} \tag{2.51}$$
hii ,
, 1 hii . ,
, p
. DF F IT Si i- 2 p/n,
, , .
Stata

2.4.4

hat-values predict . . . , hat , regress . predict . . . ,


rstudent regress . D- predict
. . . , cooksd , DF BET A  predict . . . , dfbeta( )
dfbeta , DF F IT S  predict . . . , dfits .
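The `predict` options listed in this note can be combined into a single screening pass. This is a sketch under the assumption that the shipped `auto` dataset is available; the new variable names are illustrative, and the cut-offs are the rules of thumb quoted in the text ($h_i > 3p/n$, $D_i > 4/(N-p)$, $|DFFITS_i| > 2\sqrt{p/n}$).

```stata
* Sketch: influence diagnostics after OLS (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight
predict double h, hat                  // leverage h_i, diagonal of the hat matrix
predict double rstu, rstudent          // studentized residuals (2.48)
predict double d, cooksd               // Cook's distance (2.49)
predict double dfit, dfits             // DFFITS (2.51)
predict double dfb_w, dfbeta(weight)   // DFBETA for the coefficient on weight
local p = e(df_m) + 1                  // number of estimated coefficients
local n = e(N)
list make h d dfit if h > 3*`p'/`n' | d > 4/(`n' - `p') | abs(dfit) > 2*sqrt(`p'/`n')
```

The leverages sum to $p$ (the trace of $H$), which gives a simple check that `h` was computed for the intended model.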


. ,  .
Stata

8- Stata graph , .
. . 3.14.


, , .
,  ( , ), . .
Stata

summarize . , . . , graph  
hist   . ( kdensity ),
( qnorm ), (
diagplots ) ( histplot , SSE-IDEAS, : http://ideas.uqam.ca ).
,
(, ,
, )  sktest , . skewness-kurthosis test.


. . . 16
Stata

. . . predict . . . , residuals regress .

. . , , , ,
, ,
.

. 2.2.
16 , , ,

( )

(, ).












. 2.2: ;
; ?
, , ()
, ,
. .

y = X(k)T (k) + (k)

(2.52)

Xk = X(k)T (k) + (k) ,

(2.53)

(k) k - . (. added
variable plot) (. partial regression plot).
(
- ), ,
.
Stata

avplot . ,
, , .


/
(. . y,  e). ,
,
.
Stata

rvfplot  . residual versus fitted.


hettest ovtest .

( ) :
$$e^{(k)} = e + \hat\beta_k X_k \tag{2.54}$$

Stata

Stata  cprplot acprplot (. component plus residual).
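The three graphical diagnostics from the preceding notes (added-variable plot, residual-versus-fitted plot, component-plus-residual plot) are all run after `regress`. A minimal sketch, assuming the shipped `auto` dataset; the choice of `weight` as the regressor to inspect is arbitrary.

```stata
* Sketch: graphical regression diagnostics (assumes the shipped auto dataset)
sysuse auto, clear
regress price mpg foreign weight
avplot weight          // added-variable (partial regression) plot
rvfplot, yline(0)      // residuals against fitted values
cprplot weight         // component plus residual, cf. (2.54)
```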

, - 
.
2.4.5


F - (2.16).
()
. , , , .
, Ak , k -
(, , Ak  ), , ,
 X

$$P\Bigl(\bigcup_k A_k\Bigr) \le \sum_k P(A_k) \tag{2.55}$$

$$P\Bigl(\bigcap_k \bar A_k\Bigr) \ge 1 - \sum_k P(A_k) \tag{2.56}$$

(2.56) . , ,

, , (2.55) 1 . ,
 , P(Ak )
/K , K  . (Bonferroni adjustment)
. ,
 (Sheffe), (Tukey)
- (Working-Hotelling) ( 1980, Smith and Young 2001).
 , Stata . , , set level . . . . 95 ().
query  . 3.15.
2.4.6

 ,
: , , , . , , ,
- (sample selection  ).
Little and Rubin (1987).

,
, . Rubin (1976). , (data are missing completely at random 
MCAR), P(Xj | X) Xj , X (
, Xj Xj , ).
(missing at random  MAR), P(Xj | X)
Xj ( X ). , (ignorable),

. , P(Xj | X)
Xj , (non-ignorable, not missing
at random  NMAR), . ,
. , MAR MCAR, ,
MAR NMAR.
,
. (, 15%), MCAR.
(,
), MAR. , , , , ,
.
, .


, .. ,
(complete case analysis). , ,
MCAR.  , , ,
. . .
Stata

. . . correlate . . .

,
Stata

. . . pwcorr .

(available case analysis). 


. , MCAR .

. -
.

, ,
.



,  (imputation):
- , , .
- ,

.
Stata

, Stata  . : impute
(, ) ( , , ,
)
.

,
MAR,
.
,   (hot deck imputation). , , ,
, : (, ),
(,
, ,  ). ,
.
.
Stata

hotdeck ,
(Mander and Clayton 1999).

,  
(multiple imputation),

70- Rubin (1978). ,


, , ,
.
(within variance) (between variance).
;
(, ). , MAR.
Stata

Stata, ,
, .


, . , , Y = (Ymiss , Yobs ),
Yobs  , Ymiss  , , .
,
.. , L(|Y ) = f (Y |).
,  L(|Yobs ). , Rij =
I(yij ) g(R|Y, ) 17 ,
$$L(\theta, \psi \mid Y_{obs}, R) = \int f(Y_{obs}, Y_{miss} \mid \theta)\, g(R \mid Y_{obs}, Y_{miss}, \psi)\, dY_{miss} \tag{2.57}$$

, , .
EM-, ()
17 ,


( ) . , EM-
( , , ), Dempster
et. al. (1977), Little and Rubin (1987) ,
EM- 1920- . ,
EM-  ,
, .. , .
EM- ,
. E (expectation)  
. (
, ,
, , , , ,
) ,
, E
. M (maximization)
( ),
( ), E. EM- ,
. , (, 106 ).
2.5

, "- "?  ,
. ., Stata.
, ,
,
.
Stata

Stata ,
, .
regdiag diagplots .


2.5.1

.
Stata

Stata
(
), , regress ,
, , .
tutorial regress tutorial aboutreg .

2.1:



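Problem                    Diagnostics                                              Stata
-------------------------  -------------------------------------------------------  ------------------------------------
Autocorrelation            DW statistic, H0: $E\varepsilon_t\varepsilon_{t-1} = 0$;  regress; dwstat
                           DW lies between 0 and 4, close to 2 under H0
Heteroskedasticity         H0: $\ln \sigma_i^2 = \gamma^T z_i$; F and $\chi^2$       regress; hettest;
                           tests; residual plots                                     avplot; rvfplot
Multicollinearity          condition number $\lambda_{max}/\lambda_{min} \gg 1$;     factor, pc;
                           VIF > 4 (VIF > 2)                                         regress; vif
Functional form            RESET test: F, $\chi^2$; residual plots                   regress; ovtest;
                                                                                     avplot; rvfplot; cprplot
Non-normality              summary statistics, tests and plots                       summarize; sktest; graph ..., norm;
                                                                                     kdensity; qnorm
Influential observations   Cook's D, DFFITS, DFBETA; residual plots                  regress; predict, cooksd;
                                                                                     predict, dfits; predict, dfbeta;
                                                                                     avplot; rvfplot
Endogeneity                Hausman test: one estimator consistent under H0 and Ha,   hausman
                           the other efficient under H0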

2.5.2

 
18 .
1 tutorial
aboutreg. , , , , Stata
:

. use auto, clear


. regress price mpg foreign weight
Stata :
2.2: Stata
  Source |       SS       df       MS              Number of obs =      74
---------+------------------------------           F(  3,    70) =   23.29
   Model |   317252881     3   105750960           Prob > F      =  0.0000
Residual |   317812515    70  4540178.78           R-squared     =  0.4996
---------+------------------------------           Adj R-squared =  0.4781
   Total |   635065396    73  8699525.97           Root MSE      =  2130.8

------------------------------------------------------------------------------
   price |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     mpg |    21.8536   74.22114      0.294   0.769      -126.1758     169.883
  weight |   3.464706    .630749      5.493   0.000       2.206717    4.722695
 foreign |    3673.06   683.9783      5.370   0.000       2308.909    5037.212
   _cons |  -5853.696   3376.987     -1.733   0.087      -12588.88    881.4931
------------------------------------------------------------------------------

 (
y , ,
, y ),  , ( , F -
2
H0 : , ; R2 Radj

18 7- Stata.
.

version 7 .


). , , t- H0 : k = 0
.
(, ovtest, hettest ) ,
.
, (fitted values). . 2.3 , ,
( ) ( kernreg, . 2.8.3). ,
, (2.45).
. 2.3: : , ,
. .
[Plot: price against Fitted values / argument Xb; curves: linear OLS regression, kernel regression]

, , (. . 2.53).
(. 2.4), , .
, .
, ,
.

. 2.4: weight ( avplot weight).


[Added-variable plot: e( price | X ) against e( weight | X ); the plot header reports coef, se and t]



(. 2.5) , ,
, , . , , ,
 , .. - .
, (. . 2.6
rvfplot), ,
.
,
predict , rstudent , dfbeta , dffits , cooksd hat 19 .
. 2.7 , (leverage) .
D. . 2.4.3. , , . ,
, D 19 ; . 3.1


. 2.5: ( rvfplot, yline(0) ).




[Plot: Residuals against Fitted values]



. 2.6: ( rvfplot, c(s) bands(10) d(50) ).

[Plot: Residuals against Fitted values]




(: predict . . . , cooksd
regress . . . , if . . . < . . . , . . . - ).
. 2.7: , .

[Plot: Studentized residuals against Leverage; labelled points include Cad. Sev, Cad. Eld, Plym. Ch and VW Diese]





, ,
. . 2.8
. ,
:
, .
, , Stata . , tutorial regress , tutorial aboutreg
tutorial graphics .


. 2.8: (1)
(qnorm . . . ).

[Normal quantile plot: Residuals against Inverse Normal]

2.6



, , , ,
; , ; (   ). ,
, 0/1 -. , ,
 ,  ,
 ,    (  ).  (ordered) :
,
 . ,
, 
, , , ..


(multinomial, polytomous 20 ) .
,
, .  (, ,
).
; , , ,
, .  ,
(0, , 1,
), .
.
2.6.1

, 0 1.
,
21 ? , ,
. , 0 1;
, ,
pi (1 pi ), pi = P[yi = 1|xi ]. , -
xi [0, 1], , ( ), . .
() .

(. . 1 ):

$$P(y = 1 \mid x) = F(x), \qquad P(y = 0 \mid x) = 1 - F(x) \tag{2.58}$$

F ()  , [0, 1].
, 
, (2.2),
20 polychotomous
. dichotomy (. ), , , polytomy:
 

21


linear probabilty model.

F , , ,
, ( ).
(index function).
F ,

$$P(y = 1 \mid x) = F(x^T \beta), \qquad P(y = 0 \mid x) = 1 - F(x^T \beta) \tag{2.59}$$

F ,
.. 22 .
, 1,
, 0 . ,

$$y_i^* = x_i^T \beta + \varepsilon_i \tag{2.60}$$

( i 
; , ,
, . . ), ,

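$$y_i = \begin{cases} 1, & y_i^* \ge 0 \\ 0, & y_i^* < 0 \end{cases} \tag{2.61}$$

$$P(y_i^* > 0) = P(x_i^T \beta + \varepsilon_i > 0) = P(\varepsilon_i > -x_i^T \beta) = P(\varepsilon_i < x_i^T \beta). \tag{2.62}$$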


F ()  () :

$$\Lambda(z) = \frac{1}{1 + \exp(-z)} \tag{2.63}$$

22 , .


- - ;
. , , , ,
2.7. , : supx(,+) |Flogit (x) FN (0,1) (x)| < 0.02,
. - ,  ,

, .
(Heckman sample selection
model), 2.6.3. , -

(goodness of fit).
- (. . F () ) : -

/ 3 1.6 , -,
.
, (Gomperz, , /gompit):

$$F(z) = 1 - \exp[-\exp(z)] \tag{2.64}$$

Stata

Stata probit , logit


cloglog .
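A sketch of the binary-choice commands named in this note, fitted to the same data so that the results can be compared. It assumes the shipped `auto` dataset; the 0/1 variable `foreign` serves as the outcome, and the regressors and the names `p_logit` and `p_probit` are chosen only for illustration.

```stata
* Sketch: logit and probit on the same binary outcome (assumes the shipped auto dataset)
sysuse auto, clear
logit foreign mpg weight
predict double p_logit           // default prediction: P(foreign = 1 | x)
probit foreign mpg weight
predict double p_probit          // default prediction: P(foreign = 1 | x)
correlate p_logit p_probit       // the two sets of fitted probabilities are typically very close
```

The coefficient vectors differ mainly by a scale factor, while the fitted probabilities usually nearly coincide, in line with the remark in the text that the two link functions are hard to tell apart.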

. :

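$$L(y_i, x_i, \beta, F) = \begin{cases} F(x_i^T \beta), & y_i = 1 \\ 1 - F(x_i^T \beta), & y_i = 0 \end{cases} \tag{2.65}$$

$$L(y_i, x_i, \beta, F) = F(x_i^T \beta)^{y_i} \bigl(1 - F(x_i^T \beta)\bigr)^{1 - y_i} \tag{2.66}$$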

( , )
:
$$\ln L(y, X, \beta, F) = \sum_{i=1}^{n} \Bigl[ y_i \ln F(x_i^T \beta) + (1 - y_i) \ln\bigl(1 - F(x_i^T \beta)\bigr) \Bigr] \tag{2.67}$$


.
Stata

Stata ,
23 .
ml. , Stata, ()
, , , ,
; ,
, Stata.
(. . 21), , robust , .
( ) ml [R] ml
(Gould, Sribney 1999).

- -
(, 1973).
, ,
. . . ( ..
)
,
,  (Wald test), ,
, (LM test, Lagrange multiplier test, score test), , ,
(
, ) ,
(
).
2 , (, 1998, Greene 1997).
,
23
Maddala (1983), ,
Stata.


R2 , - -
. ,

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat p_i)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2} \tag{2.68}$$

pi = F (xTi )  , 1,
. ,
, . - R2 , L ,
, L . R2 ,
$$R^2 = 1 - \Bigl(\frac{L_0}{\hat L}\Bigr)^{2/n} \tag{2.69}$$

, ,  : 1 L ,
1. , - R2 , 

$$R^2 = \frac{1 - \bigl(L_0/\hat L\bigr)^{2/n}}{1 - L_0^{2/n}} \tag{2.70}$$


, . ,
20 , :
> clear
> set obs 200
> set seed 98081
> g x = uniform()
> g y = x<0.5

 (0.5 x).
, ,
, :
. logit y x
outcome = x <= .4998181 predicts data perfectly
r(2000);


Stata , ,
, .  :
> replace y = 1 - y in 1

:
. logit y x
Iteration 0:   log likelihood = -138.46939
Iteration 1:   log likelihood = -55.986442
Iteration 2:   log likelihood = -36.905004
Iteration 3:   log likelihood = -27.303153
Iteration 4:   log likelihood = -22.072819
Iteration 5:   log likelihood = -19.721858
Iteration 6:   log likelihood = -19.134351
Iteration 7:   log likelihood = -19.088901
Iteration 8:   log likelihood = -19.088548

Logit estimates                                   Number of obs   =        200
                                                  LR chi2(1)      =     238.76
                                                  Prob > chi2     =     0.0000
Log likelihood = -19.088548                       Pseudo R2       =     0.8621

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -41.79944   9.568822    -4.37   0.000    -60.55399   -23.04489
       _cons |   20.59037   4.741358     4.34   0.000     11.29748    29.88326
------------------------------------------------------------------------------
note: 11 failures and 10 successes completely determined.

, , Stata 8 ,
24 . 21 0 1 ( double 1016 ).
24 , , , , , ,
. , ,
. Stata , , Stata
,
.


 , , , , , 1/2 (, , 0.493),
F () 40!
, 0 1 .
, , (
, 2 ).
(2.12),
; ( , ),
. ,
(. . , ).
.
,
 , .
(
, , . . (score),
. .) .
Stata

probit logit predict , regress . Stata  (Nelder,


McCullagh 1989, Hardin, Hilbe 2001) glm . , - glm . . . ,
f(b) l(p) ,  glm . . . , f(b) l(l) .
[R]
glm.


. (deviance residuals):
. glm y x , f(b) l(l)
Iteration 0:   log likelihood = -48.623085
Iteration 1:   log likelihood = -19.440911
Iteration 2:   log likelihood = -19.128412
Iteration 3:   log likelihood = -19.088587
Iteration 4:   log likelihood = -19.088548
Iteration 5:   log likelihood = -19.088548

Generalized linear models                          No. of obs      =       200
Optimization     : ML: Newton-Raphson              Residual df     =       198
                                                   Scale parameter =         1
Deviance         =  38.17709647                    (1/df) Deviance =  .1928136
Pearson          =  10118.00926                    (1/df) Pearson  =  51.10106

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
Standard errors  : OIM

Log likelihood   = -19.08854823                    AIC             =  .2108855
BIC              = -1010.889742

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -41.79945   9.569472    -4.37   0.000    -60.55527   -23.04363
       _cons |   20.59037   4.741676     4.34   0.000     11.29686    29.88389
------------------------------------------------------------------------------
. predict devres, dev

, , logit, glm .
qnorm devres . 2.6.1. , . , . ( ) , 4.29  , ,
, !
-, . ,
10% , Stata,
. ,
0 1, .
 
(overdispersion). ,
( Var y = p(1p) = (1) , Var y = =
; = Ey ). ,

 .
. 2.6.1 , 0.01.
.
[Normal quantile plots, two panels: deviance residual against Inverse Normal]

. 2.9: -: , .
 , . . , . - - (,
, ) , ,
, (.. ,
, ),
, . , -
,
,
.
Stata

- dprobit , - , probit ,


[Normal quantile plots, two panels: deviance residual against Inverse Normal]

. 2.10: -: .
. Stata mfx ,
.

2.6.2

, () j = 1, . . . , m. (2.59),
:

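$$p_{ij} = P(y_i = j \mid x_i) \propto G(x_i^T \beta_j), \qquad j = 1, \dots, m-1, \tag{2.71}$$

$$p_{ij} = P(y_i = j \mid x_i) = \frac{G(x_i^T \beta_j)}{1 + \sum_{k=1}^{m-1} G(x_i^T \beta_k)}, \tag{2.72}$$

$$p_{im} = P(y_i = m \mid x_i) = \frac{1}{1 + \sum_{k=1}^{m-1} G(x_i^T \beta_k)} \tag{2.73}$$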

yij = I( i- j - ),

$$L(y \mid x, \beta) = \prod_{i=1}^{n} \prod_{j=1}^{m} p_{ij}^{\,y_{ij}} \tag{2.74}$$

- , G(z) = exp(z), ..,

$$p_{ij} = P(y_i = j \mid x_i) = \frac{\exp(x_i^T \beta_j)}{1 + \sum_{k=1}^{m-1} \exp(x_i^T \beta_k)}, \tag{2.75}$$

$$p_{im} = P(y_i = m \mid x_i) = \frac{1}{1 + \sum_{k=1}^{m-1} \exp(x_i^T \beta_k)} \tag{2.76}$$

-, .
- , ( ),
I / .
Stata

Stata,  mlogit .
,
(, ).

xi ,
j , / .
. (McFadden 1974):
, , ,
. ,
xijk  i j k . (conditional logit
model)25 :

$$P(y_i = j \mid x, \beta) = \frac{\exp(x_{ij}^T \beta)}{\sum_{l=1}^{m} \exp(x_{il}^T \beta)} \tag{2.77}$$

Stata

Stata  clogit .
 , . reshape , . 93.

25 .
2000 .


 . .

(nested logit model), ,
, ,
1.  ;
2. : 
.
Stata

nlogit .
, [R] nlogit .

, (      ),
j F (xT ), 1 < . . . <
m1 ( 0 = m = ).

$$y_i^* = x_i^T \beta + \varepsilon_i \tag{2.78}$$

j1 < yi < j , j - yi . , P(yi = j|xi ),


, , , P(yi j|xi ).
Stata

2.6.3

,  ,
- oprobit ,
 -, ologit .


,
(. 2.4.6).


, NMAR.
(truncation)  , ,
(, ). N (, ) c,

.  
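$$f_c(y) = f(y \mid y < c) = \begin{cases} \dfrac{\frac{1}{\sigma}\, \varphi\bigl(\frac{y - \mu}{\sigma}\bigr)}{\Phi\bigl(\frac{c - \mu}{\sigma}\bigr)}, & y < c \\[1ex] 0, & y \ge c \end{cases} \tag{2.79}$$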
()  , ()  .
(censoring) ,
, ( ) .
 ( , , , $1000 )
(survival time data 
, , , ).

$$y^* \sim N(\mu, \sigma^2) \tag{2.80}$$

$$y = \begin{cases} y^*, & y^* < c \\ c, & y^* \ge c \end{cases} \tag{2.81}$$

 c 
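$$L(\mu, \sigma^2 \mid y) = \prod_{y_i < c} \frac{1}{\sigma}\, \varphi\Bigl(\frac{y_i - \mu}{\sigma}\Bigr) \prod_{y_i = c} \Bigl[1 - \Phi\Bigl(\frac{c - \mu}{\sigma}\Bigr)\Bigr] \tag{2.82}$$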

.
 ,
, . .
(2.80)

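$$y^* = x^T \beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2) \tag{2.83}$$

$$y = \begin{cases} y^*, & y^* < c \\ c, & y^* \ge c \end{cases} \tag{2.84}$$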
. 1950- .
- (Tobit).
c = 0  , ,
(
).
Stata

Stata tobit . ll ul ,
.
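A sketch of `tobit` with a lower censoring limit. It assumes the shipped `auto` dataset and censors `mpg` artificially at 17 so that there is something to estimate; the rescaled regressor `wgt` and the limit itself are illustrative choices, not taken from the text.

```stata
* Sketch: tobit with left-censoring (assumes the shipped auto dataset)
sysuse auto, clear
generate double wgt = weight/1000
replace mpg = 17 if mpg <= 17     // artificially censor mpg from below at 17
tobit mpg wgt, ll(17)             // ll() gives the lower censoring limit
```

The `ul()` option works the same way for censoring from above, which is the case written out in (2.83)-(2.84).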

yi ,  (, , ) yi (NMAR). ,

$$E(y \mid y > 0) = x^T \beta + \sigma\, \frac{\varphi(x^T \beta/\sigma)}{\Phi(x^T \beta/\sigma)} \tag{2.85}$$
, . . . (2.85), (
(inverse Mills ratio))  , , I(yi > 0) 
, .
1960-1970- . (
).
,  , . . i2 , .
, .
,
, (self-selection), ,
,  , ;
;
(reservation wage). ,

,
(
, , ).
1970- . . , . . , 2000 . ( )
.
(sample selection, self-selection),
, , ,
, w1 ,
x1 (, . .) w2 , x2  ,
, , , . ,
, y 
:

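$$w_1 = x_1^T \beta_1 + \varepsilon_1, \tag{2.86}$$

$$w_2 = x_2^T \beta_2 + \varepsilon_2, \tag{2.87}$$

$$y = \begin{cases} w_1, & w_1 \ge w_2, \\ 0, & w_1 < w_2 \end{cases} \tag{2.88}$$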

, 

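$$w_1 = x_1^T \beta_1 + \varepsilon_1, \tag{2.89}$$

$$y_2 = I(x^T \beta_2 + \varepsilon_2 > 0), \tag{2.90}$$

$$y = \begin{cases} w_1, & y_2 = 1, \\ \text{unobserved}, & y_2 = 0 \end{cases} \tag{2.91}$$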
.
(2.86)(2.88) , 1 2 , x1 , x2 .
,
(2.90);

$$\lambda = \varphi(x^T \beta_2) / \Phi(x^T \beta_2) \tag{2.92}$$

(2.89).
, . ,
, ,
.
, , ,
 (
) .
Stata

heckman . Stata , , ,
y2 = 0. heckman . . . , select(
. . . ). 
; .

/ (treatment regression; , ,
):

$$y = x_1^T \beta_1 + \delta z + \varepsilon_1, \tag{2.93}$$

$$z = I(x_2^T \beta_2 + \varepsilon_2 > 0), \tag{2.94}$$

(, ,
) y , x1 .
/
x2 .
Stata

2.7

Stata, (2.93)(2.94), treatreg . ,  treatreg . . . , treat( . . . ).

(panel, longitudinal, . . , , )  ,
. , (,

. .) (
, ).  T (, )
n (, , ). ,
(individual heterogeneity), , ,
.
:  ( )  . (Maddala
1993, Baltagi 1995). , , . . :

$$y_{it} = x_{it}^T \beta + u_i + \varepsilon_{it} \tag{2.95}$$

ui  , i, it 
, ui .
$\sum_i u_i = 0$.
Stata

Stata xt, x, 
t.
xtreg :
 xtreg . . . , fe ,  xtreg . . . , re ,
 xtreg . . . , mle .
  . reshape , . 93.
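A sketch of the fixed-effects and random-effects calls from this note on a simulated panel, so that it runs without any external data. Everything about the data-generating process (100 units, 5 periods, a true slope of 2, a regressor correlated with $u_i$) is an assumption made for illustration; the `i()` option names the panel identifier.

```stata
* Sketch: fixed- and random-effects estimation on a simulated panel
clear
set obs 100
set seed 12345
generate id = _n
generate double u = invnorm(uniform())            // individual effect u_i
expand 5                                          // T = 5 observations per unit
bysort id: generate t = _n
generate double x = invnorm(uniform()) + 0.5*u    // regressor correlated with u_i
generate double y = 1 + 2*x + u + invnorm(uniform())
xtreg y x, fe i(id)                               // within estimator, cf. (2.96)
xtreg y x, re i(id)                               // random-effects GLS
```

Because `x` is correlated with `u` by construction, the fixed-effects slope should be close to 2 while the random-effects slope is biased, which is the situation the Hausman test of section 2.7.3 is meant to detect.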

.
2.7.1

ui ,
, ui . 
:

$$y_{it} - \bar y_i = (x_{it} - \bar x_i)^T \beta + \varepsilon_{it} - \bar\varepsilon_i \tag{2.96}$$

P
zi : zi = 1/ni t zit . (within estimator). ui ,
. , , n .


( ) 
:

$$y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})^T \beta + \varepsilon_{it} - \varepsilon_{i,t-1} \tag{2.97}$$

, , , . ,
, . . , ,
, .
, , ui 
i- . , , n .
( . . Stata)
, (2.96), n ( n
).
, , ,
 ,
. ,
, .
.
2.7.2

, ui  .
, ,
( , , ), ,
ui it ( ),
.
, , Var ui = u2 , Var it =
2 , , Var

yit = xTit + it

(2.98)

- , 2 Ini +
u2 Jni , . . u2 + 2 u2 .

. ,  (
nT , , ,
103 105 )
.
, , , . , ,
, (between estimator)

$$\bar y_i = \bar x_i^T \beta + u_i + \bar\varepsilon_i \tag{2.99}$$

, , Baltagi (1995). 26 . ui it ,
, (
2 ) (
12 , 12 = 2 /T +u2 ).
2
,
u2
:

$$\hat\sigma_u^2 = \hat\sigma_1^2 - \hat\sigma_\varepsilon^2 / T \tag{2.100}$$

.
, u2 ,
( 2 /T ),
. Stata
u2 ,
, ,
(
u2  ).
  , .
2.7.3

,  , ,
. 
, ( ) . ,
26 Stata, , , .


.  (. . 58), ,
.
u2 u2 = 0.
Stata

- ( ) xttest0 .
( , p-value) ,
.

27 . ,
,

$$E[u \mid X] = 0 \tag{2.101}$$

, / .
, - , ,
.
( ).
, .
Stata

xthausman . 8-
, . hausman .
xtivreg .

, , ,
. ,  , 89 ,
. , ,
27 .
, . 18.


- ( , , , . . ,
 , ),
,
. , , - 28 ,

.
: . , ,
. , ,
,
, . , , , ,
- ,
(, , ,
, , ), ,
, . . .
2.7.4

, ( )
. , ,
. , , -
. ,
, ui it .
28  . . ,
, ,
.


T
ui , ,
ui ( )
. , . . ,
, , ui .

P
ui . -
t yit , - (2.77), ,

. , yit , .
- , -
.
ui , :

$$\ln L = \sum_{i=1}^{n} \ln \int \prod_{t=1}^{T} F(x_{it}^T \beta + u)^{y_{it}} \bigl(1 - F(x_{it}^T \beta + u)\bigr)^{1 - y_{it}}\, dG(u) \tag{2.102}$$

G()  ui , F ()  ,
. . ( u)
, it ,
, -
.
Stata

Stata  xtlogit ( , ) xtprobit ( ). Stata , txttobit. 


: xtprobit . . . , i(id) 
probit . . . , cluster(id) .


2.7.5

 ( ui , it ) it . , ,
, , , . . ,
t- .
Stata

,
it xtgls , 
xtregar . xtgls
. . . , p(h) ( panels(heteroskedastic) ; Var ui = 0,
. . , Var it = i2 ) xtgls . . . , p(c) ( panels(correlated) , ). corr(ar1) ,
, corr(psar1) , . , xtgee (generalized estimating equations 
,
). ,
.

. , , , , , ,
( , ), , ,
. , :
, 10 ,
50%. RLMS ( 4) ,
- ,
-,
.


2.7.6

, , . -,
(  , , ,
); -, . (mixed models), y , :

$$y = x^T \beta + u^T z \tag{2.103}$$

x  , u  ( ), z  , , , , 1. ,
u 1,
 2. z ,
.
, , , ,
.
Stata

(, ) Stata,  gllamm (Rabe-Hesketh


et. al. 2002). ( ) .
, . , (, ) Stata.
, , - , Stata. gllamm . . . , i( . . .
) . . . . gllamm . . . , family( . . . ) . . . ,
, family(gaussian) , family(binomial) , , -,
-. gllamm . . . , link( . . .
) . . . , , link(id) , link(logit) , link(probit) ,
link(mlogit) , link(ologit) , link(oprobit) .


2.8

, , , , , . , .
2.8.1

,
, .
.
. , (. . , ;  , , ) ,
. , ,
.
-
(3SLS).
Stata

2.8.2

reg3 .


, :

$$P[y < m \mid x] = p \tag{2.104}$$

, , 5%
10% ( p = 0.05 0.1). , ()
, .

p = 0.5.
Stata

Stata qreg . qreg . . . ,


quantile() , p
.
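A sketch of median and lower-quantile regression with `qreg`. It assumes the shipped `auto` dataset; the 10% quantile is picked because the text mentions $p = 0.05$ and $0.1$ as typical values of interest.

```stata
* Sketch: quantile regression (assumes the shipped auto dataset)
sysuse auto, clear
qreg price mpg foreign weight                   // median regression, p = 0.5
qreg price mpg foreign weight, quantile(0.1)    // 10% conditional quantile
```

Coefficients that differ markedly between quantiles indicate that the regressors shift the shape of the conditional distribution and not only its location.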


,
(. (2.11)):
$$\sum_{i=1}^{N} |y_i - x_i^T \beta| \to \min_{\beta} \tag{2.105}$$

- .
2.8.3


 . , E[y|x], ,
y x, (., , . 2.3).
:

$$\hat m(x) = n^{-1} \sum_{i=1}^{n} W_{ni}(x)\, y_i, \tag{2.106}$$

Wni  , x
.
:

$$\sum_{i=1}^{n} W_{ni}(x) \bigl(y_i - m(x_i)\bigr)^2 \to \min_{m(x)} \tag{2.107}$$

Stata

,  lowess (locally weighted smoothing) (Fox 1997, 1993).


Stata 8 lowess (  ksm ksm . . . , lowess ).


(local regression)  (rolling
regression). .
   . , , ; , .

( 1993):

$$W_{ni}(x) = K_{h_n}(x - x_i) / \hat f_{h_n}(x) \tag{2.108}$$

$$\hat f_{h_n}(x) = n^{-1} \sum_{i=1}^{n} K_{h_n}(x - x_i) \tag{2.109}$$

$$K_{h_n}(u) = h_n^{-1} K(u/h_n) \tag{2.110}$$

$$\int K(u)\, du = 1 \tag{2.111}$$

(2.109)  () ( - ), (2.110)  hn (
). (2.109) ,
.
- .
:
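Epanechnikov: $K(u) = 0.75\,(1 - u^2)\, I(|u| \le 1)$ (2.112)
Biweight (quartic): $K(u) = \frac{15}{16} (1 - u^2)^2\, I(|u| \le 1)$ (2.113)
Uniform (rectangular): $K(u) = \frac{1}{2}\, I(|u| \le 1)$ (2.114)
Triangular: $K(u) = (1 - |u|)\, I(|u| \le 1)$ (2.115)
Gaussian (normal): $K(u) = \frac{1}{\sqrt{2\pi}} \exp[-u^2/2]$ (2.116)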

I( )  , 1,
, 0, .
:
 ? ?,
 
? ?. , , ,

. hn ,
, ,
( h ,
f (x) y). , , .


n4/9 (. . , ),
n1/9 .
Stata

kernreg ,
STB-30. (
, , , , , , ), , ,
.
kdensity , STB,
Stata.

. ,
,  , 
(,  , dimensionality curse), .
, , , ,
.
Stata

, .

. ,
: , . .,
. 2.3.


3
Stata
0

Stata (StataCorp. 1999, 2001)


: , , , . 80-
. 1999 . , 2000 .  1 ,
2002 . Stata Special Edition, . 2003
. Stata, ,
.
Stata :

( ,
, , , , );
0 .
Stata 6. ,
.

version ,

. , , 
( 32 ), ;
("" ) SMCL (Stata Markup and Control Language), (HTML,
SGML); (-) , .. ( ..); ; ; ,
.
http://www.stata.com/stata7 .


(. .
,
).
;
, ,
, ;
, Stata ( ); ;
, ;

(Windows, Macintosh, UNIX).
: ,

( Stata LATEX),
,
Harvard Graphics PowerPoint.
.
(, , )
Stata (, ). ,
, ;
Stata . (, ,
Stata). Stata
.
3.1

, Stata. , command  , ,
(, regress reg, regress). [ ]

 , . . ,  , . .
: [ 1 | 2 ]. ,
describe [ | using ] :

d
describe
describe x1 x2 x3
d using source
desc using source.dta
Stata .
Stata: [R] ,
(Reference); [U] 3
A brief description of Stata  ,
3 User's Guide ( Stata 6)  Stata
(, Stata ); [G] twoway 
.
3.2

: Stata

Stata c:/stata, . wstata.exe (Intercooled Stata for Windows)


wstatase.exe (Special Edition  ,
).
verinst .
(
200) .
.ado,
c:/stata/ado . ado-
( 900),
Stata, ( , Stata ado-); ,
Stata  Stata Technical Bulletin, STB,
Internet; ,
, .


Stata , , , ( [R] limits


help limits). :
set memory [k|m]
, Stata. 10 , : set memory 10m . : wstata /k 10240 .
2047 (32767 Stata SE), 2 109
. ,
( ), Stata ( ),
.
set matsize #
, Stata .
10.  800. , Stata .

Stata , 2 ,
 (, , ). ( Windows) wstata /b do
.
Stata exit .
, Stata .
. : [U] 5 Starting and stopping Stata , [U] 6 Troubleshooting starting
and stopping Stata

3.3

, , : Stata

Stata (. 3.3) ,
. UNIX, Windows . Stata 8 
: ,
. , , ,  do-
2 . 3.13.

. 3.1: Stata.

.
Stata : (Stata Command),
(Stata Results), , (Review), (Variables), (Help),
(Graph), -, log- (Log; 7- Viewer). (Stata Browser)
(Stata Editor), (Stata Do-file Editor). ,
, Windows.
Stata Command Windows (, , ).
, PgUp PgDn,
( , , - ).
- Stata Prefs, ,
(, , ).
. : [GSW] , .. Getting Started for Windows.
3.4

: Stata

Stata, , :
command [varlist] [if exp] [in range] [using filename] [[weights]], [options]
(, ), (
) (, ). if in ,
(. 3.6). (,
..), , ,
using. , [weight= ] (. help
weights; ). , ,
Stata , ,
, .
, .. ,  . 3.11.

. : [U] 14 Language syntax


3.5

Windows- Stata
Help, Search ( , ,
Durbin Watson statistic ) Stata Command ( Stata).
, search,
help whelp.
Stata: http://www.stata.com/info/capabilities/ . Stata 8
findit, Stata (
).
Stata :
, , ,
3 ,
Stata. Internet, Stata
(MS Internet Explorer, Netscape Navigator). Stata 7
, Results.
, Stata, Help/Contents ( help contents).
: ,
, , , , ,
, Windows.

.hlp 4 .
Stata  - (,
, ), tutorial. ,
Stata, , Stata, Stata,
. , Stata 8 (
) .
3 , ,  ,
.

4 Windows , Stata,

Stata (Explorer) Windows .


. : [U] 8 Stata's on-line help and search facilities , [U] 9 Stata's on-line
tutorials and sample datasets .
3.6

Stata . [if ] [in ].


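An if condition can use the comparison operators > (greater than), < (less than), >= (greater than or equal to), <= (less than or equal to), == (equal to; note the double equals sign) and != or ~= (not equal to); the logical operators & (and), | (or) and ! (not); and the system variables _n and _N, which hold the number of the current observation and the total number of observations. The in qualifier takes a range of observation numbers of the form first/last; the last observation may be written as l (the letter l, not the digit 1).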
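Stata 8 distinguishes several kinds of missing values, ordered as (any number) < . < .a < ... < .z, so every missing value compares as larger than any number. The commands count if x<. and count if !mi(x) therefore both return the number of nonmissing observations of x.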
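Because missing values compare as larger than any number, a condition such as x > 3 silently includes them. The sketch below illustrates this with the rep78 variable of the auto.dta example dataset shipped with Stata (the dataset choice is an assumption made for illustration).

```stata
sysuse auto, clear

* Counts observations with rep78 above 3, but also those with rep78 missing
count if rep78 > 3

* Excludes missing values explicitly
count if rep78 > 3 & rep78 < .

* Number of missing observations
count if missing(rep78)
```

The difference between the first two counts equals the third one.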
3.7

, , :

, , ,
. Stata
( infile; infix; insheet; . help dictionary [U] 24 Commands to input data ), (Excel, SAS, SPSS, Statistica
..) (
, , ), . Professional Stata Windows- StatTransfer ( http://www.stattransfer.com ),
.

 DBMS/COPY. , , , ,
.
Stata
File, .
use filename, [clear]
. use . . . , clear ,
, . (, , Windows ) ,
use [varlist] [if exp] [in range] using filename,
/ ,
. 
, .. , do- (. 3.13),
.
save filename, [replace old]
. replace , , .  ,
. old - Stata 6 Stata 4-5
(.. Stata ). Stata 7
old Stata 6. Stata 8
Stata 7 saveold .
merge [varlist] using filename, [nokeep]
, . , . . . ,
( Stata master data using data) , . . ,
, , . [R]
sort sort . mmerge (Wessie 1999), (.
3.17). nokeep , ,
using data.
append using filename
, . . .

. : [U] 25 Commands for combining data


3.8

, , :

Stata . 
. (byte, int long,
) (float double),
; . [U] data types , help
datatypes. , , - ,
-

. g xx = sqrt(2)
. di xx*xx
1.9999999
float,
. , double ( set type double ).
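The rounding effect shown above can be reproduced side by side for float and double storage. This is a small sketch; the variable names xf and xd are made up for the illustration.

```stata
clear
set obs 1

* Same value stored with two different precisions
generate float  xf = sqrt(2)
generate double xd = sqrt(2)

* Print the squares with many digits to expose the rounding error
display %20.16f xf*xf
display %20.16f xd*xd

* A float variable does not compare equal to the double-precision value
assert xf != sqrt(2)
assert xd == sqrt(2)
```

This is also why comparisons such as x == 0.1 on float variables tend to fail; writing float(0.1) on the right-hand side or storing the variable as double avoids the problem.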
generate [type] newvar = exp [if exp] [in range]
, , , . Stata ,  32, ( ), , .
, , , (, , .), (
1  0  ) .
, 3.6. g byte nonmissx=x<. nonmiss byte (.. ),
1, x , 0, x .
g r = invnorm(uniform()) ,
N (0, 1). . [U] 14 Language syntax , [U] 15
Data, [16] Functions and expressions .
egen [type] newvar = egen-function(arguments) [if exp] [in range], [by(varlist)]
,
, , , , . . , -.
[R] egen help egen .
xi
xi: Stata


(0/1) , ,
. , , ..
.
i. , i. *i. i. * .

recode
. .
replace varname = exp [if exp] [in range]
.
rename oldname newname
.
drop if exp | in range
, .
drop varlist
.
list [varlist] [if exp] [in range]
( , ) , ( ,
).
edit [varlist] [if exp] [in range]
. Stata
- .
.
browse [varlist] [if exp] [in range]
. ,
edit , .
aorder
.
sort varlist
gsort [+|-]varname [[+|-]varname ...]
.
compress [varlist]
( , , )
, , .
reshape
, -


 , .  (long) , , ( ,   ,  ), 
(  ), , , . , income96, income97, income98 
 , income, year , year 96, 97, 98   . Stata,
xt, clogit  .
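The income96, income97, income98 example can be sketched as follows. The two households and their income values are invented purely for illustration; id is a hypothetical identifier variable.

```stata
clear
* Toy wide-format data: one row per household, one income column per year
input id income96 income97 income98
1 100 110 120
2 200 210 220
end

* Wide -> long: one row per household-year, with variables income and year
reshape long income, i(id) j(year)
list, sepby(id)

* Long -> wide: back to the original layout
reshape wide income, i(id) j(year)
```

After reshape long, year takes the values 96, 97 and 98, which is the layout expected by the xt commands.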
describe [varlist | using filename], [short]
: , . . ,
, .
, .
label
. label variable ""
, describe . label data (
_dta ). use describe .
label define label values . :
generate egen label variable .
notes [_dta | varname] : "text"
.
label , notes
_dta .
:  ;  1994 .
;  households.do ..
lookfor string
.
clear
, , , , .


3.9

summarize [varlist] [if exp] [in range], [detail]


, , , , , , . detail
, . ,  lv; codebook inspect . ,
, tabulate table 
. .
correlate [varlist] [if exp] [in range], [covariance]
.
covariance , . , . , ,
mat accum .
pwcorr [varlist] [if exp] [in range], [sig obs]
, . . , , .
sig (
), obs  .
tabulate table
, . .
tutorial tables . . [U] 28 Commands for dealing

with categorical variables

regress depvar [varlist] [if exp] [in range], [robust noconst cluster(varname)]
.
: ,
2 , , , F , R2 , Radj
, t- k = 0
(. . 49 ). robust (2.29), .
cluster , ( ). noconst
, , Stata, (
). regress -


, predict
, . tutorial regress .

Stata . , predict, , ;
(- e(b)) (e(V)); ( test) ( testnl,
- )
, .. ,
, estimates list ( Stata 8  ereturn list ). _b[ ],
 _se[ ]. ,
, help est help postest Stata.
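A short sketch of what is available after an estimation command, using the objects named above (e(b), e(V), _b[], _se[], test, predict). The auto.dta dataset and the regression specification are chosen only for illustration; ereturn list is the Stata 8 spelling.

```stata
sysuse auto, clear

* Linear regression with robust standard errors
regress price weight mpg, robust

* Individual coefficients and standard errors
display "b[weight] = " _b[weight] ", se = " _se[weight]

* Coefficient vector and covariance matrix of the estimates
matrix b = e(b)
matrix V = e(V)
matrix list b

* Joint test that both slopes are zero
test weight mpg

* Fitted values and residuals
predict double phat
predict double res, residuals

* Everything the command left behind
ereturn list
```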
Stata ,
5 :

ivreg, rreg,
reg3, nl;
( help time): arima; ac pac; arch; (2.31) newey; dfuller;
pperron;
( glm);
( anova; oneway; loneway),
( pca; factor);
( table;
tabulate; epitab);
( xt, , xtreg,
re xtreg, fe  ; xtgls  ; xtlogit xtprobit 
. . 2.7; help xt, [U] 29.13 Panel-data models );
5 .

help .


, ,
(survival time; st; . help st, [U] 29.14
Survival-time (failure time) models );
(survey; svy; . help svy, [U] 30 Overview of survey estimation );
( logit; logistic;
lfit; probit; dprobit  -);
( ttest), ( sdtest) (
signrank; signtest; ranksum; kwallis);
( spearman; ktau);
, ( ml);
Stata 7  cluster;
, .
Stata 500 ( ).
(STB), ( 2000 .)
SSC-IDEAS (. 3.16).
3.10

Stata : (, , , . .); ( , 0
1 uniform() KISS ( 2126 , 232

), , ,
(, ), ( _pi) . help functions [U] 16.3 Functions , [R] functions.
. 3.17.


3.11

Stata , ,
.  , ( .. , ).
Stata , , .. ,
, 6 , , Stata. , Stata (, , , , ado- ..).
Stata ($). ,
, ,
$S_level 95 ( ).
local
` '.
global $. :
. local a sqrt(2)
. local b = `a'
. di `a' _n `b' _n "`a' = `b'"
1.4142136
1.4142136
sqrt(2) = 1.414213562373095
. local i = 345
. local k = 1
. local i1 = 678
. di `i'`k'
3451
. di `i`k''
678
6 , .


;
, , ; ;
, ,
Stata; . . . . .
Stata .
syntax , .
. : [U] 21.3 Macros

Stata
, .
,
.
by () : Stata
Stata . , Stata
(), .
_N
. , , Stata .
bysort () ( ) : Stata
, by,
, .
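A minimal sketch of the by and bysort prefixes, assuming the auto.dta example dataset; the new variable names group_size and price_rank are made up. Inside a by group, _N is the number of observations in the group and _n is the position of the observation within the group.

```stata
sysuse auto, clear

* Group size: _N is evaluated separately within each value of foreign
bysort foreign: generate group_size = _N

* Sort by price within each group, then number the observations
bysort foreign (price): generate price_rank = _n

* The data are now sorted by foreign, so plain by works
by foreign: summarize price

* Cheapest car in each group
list make foreign price if price_rank == 1
```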
for [: for . . . ]: Stata X [Y] [
\ Stata X [Y] . . . ]
: ( numlist ), ( varlist
), ( anylist ).
1 10 : 1(1)10 , 1 2 to 10 ,
1/10 .
, ,
. * : u* , "u".
: [U] 14 Language syntax , help numlist , help varlist .
for .
X (). for , Stata


X Y , ..

Stata
by for,
, ,
, Stata. ,
for by , Stata, quietly ,
: qui for var x1-x5: g lX=log(X) \ lab var lX "log of X"

forvalues foreach.
forvalues macro = range {
    Stata commands
}
foreach macro of local | global | varlist | newlist | numlist list {
    Stata commands
}
foreach macro in list {
    Stata commands
}
forvalues foreach `' . , Stata ,
, .

:
foreach x in sqrt log exp {
forvalues k = 1/3 {
qui g `x'`k' = `x'(`k')
}
}

:
. sum sqrt1 - exp3

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       sqrt1 |         1           1           .          1          1
       sqrt2 |         1    1.414214           .   1.414214   1.414214
       sqrt3 |         1    1.732051           .   1.732051   1.732051
        log1 |         1           0           .          0          0
        log2 |         1    .6931472           .   .6931472   .6931472
-------------+--------------------------------------------------------
        log3 |         1    1.098612           .   1.098612   1.098612
        exp1 |         1    2.718282           .   2.718282   2.718282
        exp2 |         1    7.389056           .   7.389056   7.389056
        exp3 |         1    20.08554           .   20.08554   20.08554
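For comparison, the earlier for one-liner that creates logarithms of x1-x5 can also be written with foreach. This is a sketch: the variables x1-x5 are generated here only so that the example runs on its own.

```stata
clear
set obs 10

* Toy variables x1, ..., x5 (positive, so that logs are defined)
forvalues j = 1/5 {
    generate x`j' = _n + `j'
}

* Loop over the variables: `v' is replaced by x1, x2, ... in turn
foreach v of varlist x1-x5 {
    quietly generate l`v' = log(`v')
    label variable l`v' "log of `v'"
}

describe lx1-lx5
```

Unlike for, the loop body can hold any number of commands on separate lines, which keeps longer loops readable.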

3.12

, Stata.
Stata , .
log using filename, [append | replace]
log on | off | close
, Stata ,
( , append
replace, ). log off ,
log on , log close .
, log-, Stata . Log- Stata, Stata ( , ..).
Stata 7 log-: ( ,
,
) ( ,
). log- cmdlog
using . log- HTML texman  log html log texman .
 outreg (Gallup 2001), : , , R2 .
, , , . [R] stb, help stb.
SSC-IDEAS, . 3.16.
, #review
[ ] .
. : [U] Printing and preserving output .

3.13

: do-

Stata 
 . , , .do, , do-, :
do filename, [nostop]
Stata do-, .
, nostop.
, do run. , Stata
, , .
do- , C, . . /* , */  . ,
, *, . , , , , Stata
log-.
.
for , do X ( -
X) .
do- 7 .

, ,
, , do-.
,
,
.
,
(, ..), do-.
7 Stata Corp., Net Course 151 Stata.


(, reshape, merge

!) () . ,
, (, ).

label data notes.
Do-, , do-, , .
do- . ,
, RegData00, do-,
, cr-RegData00.do (. create), do,  an-RegData00.do .
do- " ", log-, , do-.
, log- do- (,
do- - , log-, Stata
Windows). version ,
.
"" do- :
clear
version 6
set memory 10m
log using income98, replace
use income98
* -
...
log close
exit

Stata Corporation Internet-


Stata. ,
.
. : [U] 19 Do-files

3.14

Stata graph,
.
.
graph varlist, [options]

graph , .
tutorial graphics .
graph ,
. Stata
(bins), , , ,
graph . . . , bin(50).

graph
. . . , norm. , ,
graph . . . , box ( box-whisker,
9 ) | star ( ) | bar ( ) | pie (
). grhist
graph.
graph, : graph
 y  x. ( ), , :

symbol  , ; symbol(.)
, symbol(o)  , symbol([]) 
; symbol([_n])  .
connect  ; connect(.) , ,
connect(l)  ; connect(s) 
. (. 2.8.3).
, ,
8 Stata 7; Stata 8 , .

[G]

 .  ,

version 7 ,

9 (box) ,

, (whiskers)  .


: connect(l[-])  , connect(l[_])  , connect(l[.])  .  connect(l[-.]) - .

sort  , connect, x ( ).
bands  , .
, , .
density  . , .
bands.
xlab ylab  .
xtick ytick   .
xline yline  .
xscale yscale  .
title  . Stata .
grtwoway.
graph , Stata
, ..
y1 , . . . , yn1 , x.
graph, matrix.
Stata .gph, graph . . . , saving( ).
graph using () . Stata  .
help grother. , File , Windows- ( .bmp .wmf),
Windows.
Stata 6 LATEX Encapsulated PostScript (.eps) PDF LATEX graphicx.


, 7- , translate, PostScript Encapsulated PostScript,


SMCL-  .
. : [G]
3.15

, Stata, , .
query
( . . , . set
matsize , level , %,
log-, . .), , set (.
3.2).

about
Stata , :
, exe-, .

memory
, Stata . 1520 % ,
, , .

adopath
, Stata ado- (. . 85 ado-).
Stata (, STB- Internet, . 3.17), ado-.
which
, ado-,
, .
, ,
Stata .


3.16

: Internet-
Stata

Stata  http://www.stata.com/ . ( , Stata  STB,


, - ).  http://ideas.uqam.ca/ .
RePEc (Research Papers in Economics),
. RePEc SSC-IDEAS (Statistical Software Components), Stata.
,
.  statalist@hsphsun2.harvard.edu 10 ,
Stata, ,
(William Gould). ,
 .  ,
SAS.
, Stata , .  ,
.
update
Stata . update
query , ( ,
ado-, wstata.exe).
update ado , update executable update all .
net [from URL]
Stata Internet. (URL)  , , Stata 
, ,
.
webseek
Internet Stata,
. webseek Stata,
STB Stata,
10 ,

subscribe statalist .


majordomo@hsphsun2.harvard.edu

. webseek net search ,



findit .

, Internet, Stata
, , URL . ,

use http://www.stata.com/users/vwiggins/auto.dta
auto.dta, , ,
.
Stata infile, infix, insheet ,
.., do-, .
-
Prefs/General Preferences/Internet Prefs.
. : [U] 32 Using Internet to keep up to date .
3.17

: Stata

Stata  . Stata ado-, update,


. statalist SSC-IDEAS,
( Stata , , statalist).
Stata Stata Journal,
, , Stata.
Stata Technical Bulletin (, , STB). ,
,

net
net cd stb
Help/STB and User-written Programs ado- hlp Stata.
Stata
, SSC-IDEAS
, ,

adopath (. . 106), Stata  install. 6- 7- ,


, , , ,
install from a:.
, Windows UNIX. -, UNIX Windows, .
SSC-IDEAS , .
net Stata ,
 , Windows.
, , ,
 Stata
199 (unrecognized command: xyz not defined by xyz.ado  ; xyz xyz.ado); Stata
, . ,
(.ado .hlp) 
, .
(ado-).
. , , Stata, : http://www.komkon.org/~tacik/stata .
, (tutorials) PDF- .

egen. -
, . ,
, _g .
3.18

,
. .
- ;

, , . , Stata ,
, , estimates list (
regress . 96) results list . . help estimates help
results.
Stata : , ,
, . , do-, ;
 ,
--more-- ( ; Enter,  "", more UNIX);  () ;  (
);  . , [R] error messages Help/Search/ rc . ( =
if, -
, , , ..). , ,
 ,
  ( no room to add more variables , .
set memory).
Stata 7 : , URL Stata.
, ,
Stata , . , ,
, ,
.
. : [U] 11 Error messages and return codes


3.19

,
.

Stata , , , GAUSS. , , :
, , , ,
.   (
, , , ). Stata
help matrix.
. : [U] 17 Matrix expressions
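A few matrix commands, as a rough sketch of the facilities documented in help matrix. The matrix A and the use of auto.dta are illustrative assumptions.

```stata
* Define a 2x2 matrix: commas separate columns, backslashes separate rows
matrix A = (2, 1 \ 1, 3)

* Inverse and a check that A*inv(A) is the identity matrix
matrix B = inv(A)
matrix C = A*B
matrix list C

* Scalar-valued matrix functions
display det(A)
display trace(A)

* Cross-product matrix X'X of data variables (a constant is added by default)
sysuse auto, clear
matrix accum XX = weight mpg
matrix list XX
```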

, Stata ,
. , , ,
, -,
.
, Stata .
,
, ,
. , (
, ) .
(. . 3.3) - .

Stata  . [P] . program.  program define , ( end).

,  program drop _all


. program
define version
() syntax,
. . [P] syntax.
, , ,
set trace on. ,
Stata.
,
Stata. , . . [w]help
usersite, [R] net.
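A minimal sketch of a user-written program built from the pieces mentioned above: program define, version, and syntax. The program name mymean and everything it does are hypothetical and serve only to show the structure.

```stata
* Drop an earlier copy of the program, if any, so the file can be rerun
capture program drop mymean

program define mymean, rclass
    version 8
    * Parse the command line: one variable, optional if and in qualifiers
    syntax varname [if] [in]
    * touse marks the observations selected by if/in with nonmissing data
    marksample touse
    quietly summarize `varlist' if `touse'
    display "Mean of `varlist': " r(mean)
    * Leave the result behind in r(mean) for the caller
    return scalar mean = r(mean)
end

sysuse auto, clear
mymean price
mymean price if foreign
```

When a program misbehaves, set trace on shows each line as it is executed, with macros expanded.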
3.20

 ,
. ,
 (. 3.5,
help, winhelp).
 - tutorials. tutorial
Stata  Stata , -
. - tutorial intro ,
Stata. - 
, ,
,
, Stata, , .
, . http://www.komkon.org/~tacik/stata ,
- Stata (. 3.16), :

. net from http://www.komkon.org/~tacik/stata


. net get aboutreg
, , ,
, , . , ,
,

, - ,
Stata Corp., .
!


( -) RLMS (Russia Longitudinal Monitoring Survey, ;


. Mroz et. al (1999), Swafford (1996)).  , (-), , , , , . 2001 . .
19921993 ., . 1994 . ,
( 1997 .,
). 1 ftp- , ( RLMS)
http://www.cpc.unc.edu/rlms/ .
RLMS , ..
. RLMS ,
RLMS
. , ,

1989 ., . 1 SAS Transport files. StatTransfer, Professional Stata.


4.1: RLMS

5 6 7
4718

3973
11284

3781
10648

3750
10465

8
3831
10677

38

4.1.
RLMS 2 , .. . , .. , 1  , , .-.
(PSU)
. - ,
;
4.4% .
(PSU) 3 . (SSU, secondary sampling unit)
, ( ) 4 . , .
, RLMS . , RLMS
. , 89 ,
, ,  . ,
 , 
.
: RLMS , RLMS  .
2 . . 23.
3 PSU. , ,
PSU . . RLMS
.


: , . ,
. . , , ( , community data).
,

.
, , . ,
RLMS ,
. :

r#hh*  ;
r#he*  ;
r#in*  ;
r#*  ( , ,
..)
# ( j),
*  . ,
r7hhincm. 
. , pdf-
( ).
/ , merge. site# ( ), censusd# ( 
, , ), family# (
) person# (  ), #  - . , , ; ,
6- site6, censusd6, family6, person6 site, census, family, person ,
, . 5 aid, bid, cid did,
5 , ,
, , , -



. , ,
;
.
(,
svy* ) (, ,
). psu psu#,
.
RLMS . , RLMS 
() 6 .
( ,
, ). StatTransfer
.
RLMS ( ),
 :
1. , do-
, , .
, , , .
2. , 3, ( label data ) ( label variable ) , ( notes ).
Stata ,
( ) ...
(, ,
; ., (2000)). RLMS , , , RLMS
(.., , 2000 . ),
(.. , ,
), , , .

6   (Russian Economic

Trends); 1992 .


, , ,
RLMS - .


. , :
.
, ,
, , ,
.
, , , (1998) Greene (1997).
, , .
, , ,
,
.
, , -
.


6

.
,
.
, . ,
. , ,
.
(
 . ., . ., . . .
. ., , 1997) ( .., .. . .,
, 1999), .
, , ,
.
1. (, , , 1997) -, ?
?
? , ( )
( ).
2. x x2 , x3 , . . . ,

x , , . .
3. . .
Stata, .
1. regress Stata?
2. , ,
?
3. , - . ,
?
4. R2 , : 0.7315, 0.0082,
0.1041, 0.9989, 0.9305, 0.5000?
5. auto.dta . 2.32.8.
6. RLMS .
? ?

RLMS Stata, ,
, , .
.

RLMS , -

. ,
? ,
-
/ - ?

 .
, , , :
, ,
, ..


. ., . . , . . . .
. ., , 1983.
. ., . . . .
,
2000.
. ., . . . . .,
, 1998.
. . . ., , 1981.
. ., . . . ., , 1973.
., . . , . . . . . .,
, 1997.
. .,  , 1984.
. . ., , 1980.
. / . . . . .
/ . . . ., , 1989.
, . ., . . . . .,
-, 1998.
. . ., , 1993.
. . ., , 1984.
. . ., , 1980.


. . ., ,
1988.
Handbook of statistics. Volume 11: Econometrics. G. S. Maddala, C. R. Rao, H. D. Vinod (eds.). North-Holland, 1993.
Handbook of econometrics, vol. 1 (ed. Z. Griliches, M. Intriligator, 1983), vol. 2 (ed. Z. Griliches, M. Intriligator, 1984), vol. 3 (ed. Z. Griliches, M. Intriligator, 1986), vol. 4 (ed. R. Engle, D. McFadden, 1994). Elsevier.
Baltagi, B. H. Econometric Analysis of Panel Data. John Wiley & Sons, 1995.
Dempster, A. P., N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. Royal Statist. Society, B39, 1-38 (1977).
Draper, N., H. Smith. Applied regression analysis. 3rd edition. Wiley, 1998 (the 1st and 2nd editions are available in Russian translation).
Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Stat., 7, 1-26, 1979.
Fox, J. Applied regression analysis, linear models, and related methods. SAGE, 1997.
Gallup, J. outreg - Formatting regression output. Stata Technical Bulletin, 46 (1998), 48 (1999), 58 (2000), 59 (2001).
Gould, W., W. Sribney. Maximum Likelihood Estimation with Stata. Stata Press, 1999.
Greene, W. H. Econometric Analysis. 3rd edition. Prentice Hall, 1997.
Hardin, J., Hilbe, J. Generalized Linear Models and Extensions. Stata Press, 2001.
Hausman, J. Specification Tests in Econometrics. Econometrica, 46, 1251-1271, 1978.
Kolenikov, S. Review of Stata 7. J. of Applied Econometrics, 16 (5), 637-646, 2001.
Konishi, S., and G. Kitagawa. Generalized information criteria in model selection. Biometrika, 83 (4), 875-890, 1996.
Little, R. J. A., and D. B. Rubin. Statistical Analysis with Missing Data. Wiley (1987).
Maddala, G. Limited Dependent and Qualitative Variables in Econometrics. Cambridge Univ. Press, 1983.
Maddala, G. The Econometrics of Panel Data. Brookfield, 1993.
Mander, A., and D. Clayton. Hotdeck imputation. Stata Technical Bulletin, 51 (1999), 54 (2000).
Matyas, L., ed. Generalized method of moments estimation. Cambridge University Press, 1999.
McFadden, D. The Measurement of Urban Travel Demand. J. of Public Economics, 3, 303-328, 1974.
Nelder, J. A., McCullagh, P. Generalized Linear Models. CRC Press, 1989.
Mroz, T., D. Mancini, B. Popkin. Monitoring Economic Conditions in the Russian Federation. The Russia Longitudinal Monitoring Survey 1992-98. Report submitted to the USAID. Carolina Population Center, University of North Carolina at Chapel Hill, 1999.
Newey, W. K., K. D. West. A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica, 55, 703-708, 1987.
Neyman, J., and E. S. Pearson. On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika, 20-A: 175-247, 264-299 (1928).
Rabe-Hesketh, S., Skrondal, A., Pickles, A. Reliable estimation of generalized linear mixed models using adaptive quadrature. Stata Journal, 2 (1), 1-21, 2002.
Rubin, D. B. Inference and missing data. Biometrika, 63, 581-592 (1976).
Rubin, D. B. Multiple imputations in sample surveys - a phenomenological Bayesian approach to nonresponse. Imputation and Editing of Faulty or Missing Survey Data. U.S. Department of Commerce, pp. 1-23 (1978).
Smith, R., and K. Young. Linear Regression. Oxford University Press (2001).
StataCorp. Stata Statistical Software. Release 6 (1999). Release 7 (2001).
Swafford, M. Sample of the Russian Federation. Rounds V and VI of the Russian Longitudinal Monitoring Survey. Technical Report. Paragon Research International, 1996.
Wessie, J. mmerge - Safe and easy matched merging. Stata Technical Bulletin, 53 (1999).
