Author: 杨柳 (Yang Liu), Northwest University (西北大学)
E-Mail: philoyl@163.com
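When enough valid instruments are available, the parameters of the equation are identified, and 2SLS then yields a unique estimate. In econometric analysis, an equation is said to be identified when its parameters are identified. Consider the IV estimator:

$$\hat{\beta}_{IV} = (Z'X)^{-1}Z'y$$

where $X$ is the $N \times K$ matrix of regressors and $Z$ is the $N \times L$ matrix of instruments. $\beta$ is identified only when both of the following conditions hold: (1) the instruments are exogenous, $E(z'u) = 0$; (2) the rank condition $\mathrm{rank}\,E(z'x) = K$ is satisfied.

If the rank of $Z$ (the number of linearly independent instruments, $L$) is less than $K$, the equation is under-identified, and no econometric method can deliver a consistent estimate. If it equals $K$, the equation is exactly identified. If it is greater than $K$, the equation is over-identified.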
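The counting rule above can be sketched in a few lines of Stata. This is only an illustration: the macro names `endog` and `excl` are hypothetical, and the variable names are borrowed from the wage example used later in this article. With one endogenous regressor and two excluded instruments, the equation is over-identified with one surplus instrument.

```stata
* Hypothetical macros listing the endogenous regressors and the excluded instruments
local endog "educ"
local excl  "motheduc fatheduc"

* Count both lists and take the difference (number of surplus instruments)
local G  : word count `endog'
local L2 : word count `excl'
local q = `L2' - `G'

if `q' < 0 {
    display "under-identified: `G' endogenous regressor(s), only `L2' instrument(s)"
}
else if `q' == 0 {
    display "exactly identified"
}
else {
    display "over-identified: q = `q' overidentifying restriction(s)"
}
```

The included exogenous regressors act as their own instruments, so only the excluded instruments and the endogenous regressors need to be counted.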
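The test of overidentifying restrictions is a test of instrument exogeneity. In the exactly identified case, the exogeneity of the instruments cannot be tested directly. In the over-identified case, however, we can test whether the surplus instruments are correlated with the error term $u_1$. Write the model as

$$y_1 = z_1\delta_1 + y_2\alpha_1 + u_1$$

where $z_1$ is the $1 \times L_1$ vector of included exogenous variables, $y_2$ is the $1 \times G_1$ vector of endogenous regressors, and $z = (z_1, z_2)$ is the full vector of exogenous variables, with $z_2$ the $1 \times L_2$ vector of excluded instruments. The equation is over-identified when $L_2 > G_1$.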
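Hausman (1978) proposed comparing the 2SLS estimate obtained with a subset of the instruments that exactly identifies the equation against the 2SLS estimate obtained with all instruments. If all instruments are valid, the two estimates should differ only by sampling error. As with testing whether a variable is endogenous, constructing the original Hausman statistic is computationally cumbersome, but a simple regression-based procedure can be used instead. The steps are as follows.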
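Under homoskedasticity:
(1) Estimate the equation by 2SLS using all instruments $z$ and obtain the residuals $\hat{u}_1$.
(2) Regress $\hat{u}_1$ on $z$ by OLS (including a constant) and obtain $R_u^2$ (this assumes that $z_1$ and $z$ contain a constant; otherwise use the uncentered $R^2$).
(3) Under the null hypothesis $H_0: E(z'u_1) = 0$ and the homoskedasticity assumption (Assumption 2SLS.3), $N R_u^2 \overset{a}{\sim} \chi^2_{Q_1}$, where $Q_1 \equiv L_2 - G_1$ is the number of overidentifying restrictions (surplus instruments). $N R_u^2$ is the Sargan statistic.
(4) If we reject the null, we must re-examine the choice of instruments. If we cannot reject it, we can have some confidence in the overall validity of the instruments. Note, however, that the test has low power for detecting the endogeneity of individual instruments.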
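The steps above can be sketched on simulated data. This is a minimal illustration rather than the article's mroz example: the seed, the variable names (`z1`, `m`, `f`, `y1`, `y2`) and all coefficients are assumptions made for the sketch. It compares the hand-computed $N R_u^2$ with the Sargan statistic that `estat overid` leaves in `r(sargan)`.

```stata
clear
set seed 20240101
set obs 1000

* Simulated data (names and coefficients are illustrative assumptions)
generate z1 = rnormal()              // included exogenous regressor
generate m  = rnormal()              // excluded instrument 1
generate f  = rnormal()              // excluded instrument 2
generate v  = rnormal()
generate u  = 0.5*v + rnormal()      // structural error, correlated with v
generate y2 = 0.5*m + 0.5*f + 0.3*z1 + v
generate y1 = 1 + 0.1*y2 + 0.2*z1 + u

* Step (1): 2SLS with all instruments; keep Stata's own Sargan statistic
ivregress 2sls y1 z1 (y2 = m f)
estat overid
scalar sargan_auto = r(sargan)
predict double uhat, residuals

* Step (2): regress the 2SLS residuals on all exogenous variables
regress uhat z1 m f

* Step (3): N*R^2 is asymptotically chi2(q); here q = 2 - 1 = 1
scalar sargan_manual = e(N)*e(r2)
display "manual N*R2 = " sargan_manual "    estat overid = " sargan_auto
display "p-value     = " chi2tail(1, sargan_manual)
```

Because both instruments are valid by construction, the test should reject only at roughly the nominal rate across repeated draws.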
The calculation is somewhat more involved under heteroskedasticity.
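Under heteroskedasticity:
(1) Obtain the fitted values $\hat{y}_2$ from the first stage of 2SLS.
(2) Choose any subset $h_2$ of $z_2$ with dimension $1 \times Q_1$ (it does not matter which elements are chosen, as long as there are $Q_1$ of them).
(3) Regress each element of $h_2$ on $z_1$ and $\hat{y}_2$ and save the residuals $\hat{r}_2$, a $1 \times Q_1$ vector, that is, $Q_1$ elements.
(4) Regress 1 on $\hat{u}_1\hat{r}_2$ (without a constant) and compute the sum of squared residuals $SSR_0$ and $N - SSR_0$.
(5) $N - SSR_0$ is asymptotically distributed as $\chi^2_{Q_1}$; use it to decide whether to reject the null. As in the homoskedastic case, rejecting the null means that at least some instruments are not exogenous, while failing to reject means the data provide no evidence against the exogeneity of the instruments as a whole.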
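The heteroskedasticity-robust steps can be sketched in the same way on simulated data. All names, the seed, and the coefficients are assumptions made for the illustration. With two excluded instruments and one endogenous regressor, $Q_1 = 1$, so $h_2$ contains a single instrument (here `m`). The robust score statistic from `estat overid` is displayed next to the manual value for comparison only.

```stata
clear
set seed 20240102
set obs 1000

* Simulated data with a heteroskedastic structural error (illustrative assumptions)
generate z1 = rnormal()
generate m  = rnormal()
generate f  = rnormal()
generate v  = rnormal()
generate u  = (0.5*v + rnormal())*exp(0.3*z1)
generate y2 = 0.5*m + 0.5*f + 0.3*z1 + v
generate y1 = 1 + 0.1*y2 + 0.2*z1 + u

* Step (1): first-stage fitted values
regress y2 z1 m f
predict double y2hat, xb

* Steps (2)-(3): h2 = m; residual from regressing m on z1 and y2hat
regress m z1 y2hat
predict double r, residuals

* 2SLS residuals (the robust option changes standard errors, not residuals)
ivregress 2sls y1 z1 (y2 = m f), vce(robust)
estat overid
scalar score_auto = r(score)
predict double uhat, residuals

* Step (4): regress 1 on uhat*r without a constant
generate double ur = uhat*r
generate byte one = 1
regress one ur, noconstant

* Step (5): N - SSR0 is asymptotically chi2(1)
scalar stat = e(N) - e(rss)
display "manual N - SSR0 = " stat "    estat overid score = " score_auto
display "p-value         = " chi2tail(1, stat)
```

Using `e(rss)` avoids summing squared residuals by hand; the two routes give the same $SSR_0$.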
In Stata, the test can be run automatically: after the 2SLS regression, the command estat overid tests the exogeneity of the instruments. We illustrate with the data from Case 1.
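We want to test whether father's and mother's years of education (fatheduc and motheduc), used as instruments for educ, are exogenous.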
. use "D:\stata15\ado\personal\IV_2SLS\Data\mroz.dta", clear
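. *-Test of overidentifying restrictions (homoskedastic case, manual calculation)
. *-Exogenous covariates (assumed to be defined as in Case 1; they match the regressors in the output below)
. global aa "exper expersq"
. *-2SLS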
. ivregress 2sls lwage $aa (educ = motheduc fatheduc)
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 24.65
Prob > chi2 = 0.0000
R-squared = 0.1357
Root MSE = .67155
------------------------------------------------------------------------------
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0613966 .0312895 1.96 0.050 .0000704 .1227228
exper | .0441704 .0133696 3.30 0.001 .0179665 .0703742
expersq | -.000899 .0003998 -2.25 0.025 -.0016826 -.0001154
_cons | .0481003 .398453 0.12 0.904 -.7328532 .8290538
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc
. *-1. Compute the residuals (uhat, the estimate of u)
. cap drop uhat
. predict uhat, residual
(325 missing values generated)
. *-2. Regress the residuals on all exogenous variables and instruments to obtain R2
. reg uhat $aa motheduc fatheduc
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(4, 423) = 0.09
Model | .170502977 4 .042625744 Prob > F = 0.9845
Residual | 192.849519 423 .455909028 R-squared = 0.0009
-------------+---------------------------------- Adj R-squared = -0.0086
Total | 193.020022 427 .452037522 Root MSE = .67521
------------------------------------------------------------------------------
uhat | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | -.0000183 .0133291 -0.00 0.999 -.0262179 .0261813
expersq | 7.34e-07 .0003985 0.00 0.999 -.0007825 .000784
motheduc | -.0066065 .0118864 -0.56 0.579 -.0299704 .0167573
fatheduc | .0057823 .0111786 0.52 0.605 -.0161902 .0277547
_cons | .0109641 .1412571 0.08 0.938 -.2666892 .2886173
------------------------------------------------------------------------------
. *-3. Compute the statistic NR2, i.e., the Sargan statistic
. gen sargan = e(N)*e(r2)
. *-Or store it in a scalar and display it on screen
. scalar Sargan = e(N)*e(r2)
. dis "Sargan = " Sargan
Sargan = .37807101
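. *-4. Decide whether to reject H0: all exogenous variables (including the instruments) are uncorrelated with the structural error u
. *- NR2 is asymptotically chi2(q), where q is the number of overidentifying restrictions (surplus instruments)
. *- Here there is one endogenous variable and two instruments (motheduc and fatheduc), so q = 1
. *- The 5% critical value of chi2(1) is 3.84, while NR2 = 0.378071
. *- H0 therefore cannot be rejected: we find no evidence against the exogeneity of the instruments as a whole
. scalar pvalue = chiprob(1, Sargan) // equivalent to the next line
. * scalar pvalue = 1-chi2(1, Sargan)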
. dis "p-value = " pvalue
p-value = .53863741
. *-Test of overidentifying restrictions (homoskedastic case, computed by Stata)
. *-1. Run the 2SLS regression
. ivregress 2sls lwage $aa (educ = motheduc fatheduc)
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 24.65
Prob > chi2 = 0.0000
R-squared = 0.1357
Root MSE = .67155
------------------------------------------------------------------------------
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0613966 .0312895 1.96 0.050 .0000704 .1227228
exper | .0441704 .0133696 3.30 0.001 .0179665 .0703742
expersq | -.000899 .0003998 -2.25 0.025 -.0016826 -.0001154
_cons | .0481003 .398453 0.12 0.904 -.7328532 .8290538
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc
. *-2. Test of overidentifying restrictions
. estat overid
Tests of overidentifying restrictions:
Sargan (score) chi2(1) = .378071 (p = 0.5386)
Basmann chi2(1) = .373985 (p = 0.5408)
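(2) In the heteroskedastic case, add the robust option to the 2SLS regression in the first step and then run estat overid. The Stata commands and results are shown below. Note that the manual calculation additionally uses huseduc as an instrument, which gives two overidentifying restrictions and a $\chi^2_2$ reference distribution, whereas the automatic run uses only motheduc and fatheduc ($\chi^2_1$), so the two statistics are not expected to coincide. The manual first stage is also estimated on all 753 observations rather than on the 428 observations used by 2SLS.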
. *-Test of overidentifying restrictions (heteroskedastic case, manual calculation)
. *-Compute educ_hat
. reg educ motheduc fatheduc huseduc $aa
Source | SS df MS Number of obs = 753
-------------+---------------------------------- F(5, 747) = 130.16
Model | 1820.49038 5 364.098077 Prob > F = 0.0000
Residual | 2089.54946 747 2.79725496 R-squared = 0.4656
-------------+---------------------------------- Adj R-squared = 0.4620
Total | 3910.03984 752 5.19952106 Root MSE = 1.6725
------------------------------------------------------------------------------
educ | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
motheduc | .130004 .0223789 5.81 0.000 .086071 .1739371
fatheduc | .1013613 .0214423 4.73 0.000 .059267 .1434556
huseduc | .3715645 .0220465 16.85 0.000 .3282839 .414845
exper | .0532406 .0218443 2.44 0.015 .0103571 .0961241
expersq | -.0007403 .000708 -1.05 0.296 -.0021303 .0006497
_cons | 5.115778 .298017 17.17 0.000 4.530727 5.700828
------------------------------------------------------------------------------
. predict educ_hat, xb
. *-Compute the residuals r1 and r2
. reg motheduc $aa educ_hat
Source | SS df MS Number of obs = 753
-------------+---------------------------------- F(3, 749) = 187.97
Model | 3662.73295 3 1220.91098 Prob > F = 0.0000
Residual | 4864.82881 749 6.49509854 R-squared = 0.4295
-------------+---------------------------------- Adj R-squared = 0.4272
Total | 8527.56175 752 11.3398428 Root MSE = 2.5485
------------------------------------------------------------------------------
motheduc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | -.1051582 .0337061 -3.12 0.002 -.1713279 -.0389885
expersq | .0015231 .0010851 1.40 0.161 -.000607 .0036532
educ_hat | 1.425138 .0608066 23.44 0.000 1.305767 1.54451
_cons | -7.412718 .7419038 -9.99 0.000 -8.869176 -5.956259
------------------------------------------------------------------------------
. predict r1, res
. reg fatheduc $aa educ_hat
Source | SS df MS Number of obs = 753
-------------+---------------------------------- F(3, 749) = 197.80
Model | 4242.00749 3 1414.0025 Prob > F = 0.0000
Residual | 5354.45466 749 7.14880462 R-squared = 0.4420
-------------+---------------------------------- Adj R-squared = 0.4398
Total | 9596.46215 752 12.7612529 Root MSE = 2.6737
------------------------------------------------------------------------------
fatheduc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | -.1068543 .0353617 -3.02 0.003 -.1762741 -.0374346
expersq | .0014908 .0011384 1.31 0.191 -.0007439 .0037256
educ_hat | 1.534126 .0637932 24.05 0.000 1.408891 1.65936
_cons | -9.170285 .7783437 -11.78 0.000 -10.69828 -7.64229
------------------------------------------------------------------------------
. predict r2, res
. *-Compute the 2SLS residuals uhat
. ivregress 2sls lwage $aa (educ = motheduc fatheduc huseduc)
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 34.90
Prob > chi2 = 0.0000
R-squared = 0.1495
Root MSE = .66616
------------------------------------------------------------------------------
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0803918 .021672 3.71 0.000 .0379155 .1228681
exper | .0430973 .0132027 3.26 0.001 .0172204 .0689742
expersq | -.0008628 .0003943 -2.19 0.029 -.0016357 -.0000899
_cons | -.1868574 .2840591 -0.66 0.511 -.743603 .3698883
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc huseduc
. cap drop uhat
. predict uhat, res
(325 missing values generated)
. *-Generate the new variables uhat*r1 and uhat*r2
. gen uhat_r1 = uhat * r1
(325 missing values generated)
. gen uhat_r2 = uhat * r2
(325 missing values generated)
. *-Compute SSR
. gen one = 1
. reg one uhat_r1 uhat_r2, noconstant
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(2, 426) = 0.51
Model | 1.018745 2 .509372498 Prob > F = 0.6019
Residual | 426.981255 426 1.00230342 R-squared = 0.0024
-------------+---------------------------------- Adj R-squared = -0.0023
Total | 428 428 1 Root MSE = 1.0012
------------------------------------------------------------------------------
one | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uhat_r1 | -.0270098 .028959 -0.93 0.352 -.0839302 .0299106
uhat_r2 | -.0004977 .0307894 -0.02 0.987 -.0610157 .0600203
------------------------------------------------------------------------------
. predict e, res
(325 missing values generated)
. gen e2 = e*e
(325 missing values generated)
. sum e2
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
e2 | 428 .9976198 .0942201 .3649098 1.558913
. scalar SSR = r(N)*r(mean)
. dis "SSR = " SSR
SSR = 426.98126
. *-Compute N-SSR and the p-value
. *-Result: H0 cannot be rejected, i.e., there is no evidence against the exogeneity of the instruments
. scalar N_SSR = r(N)- SSR
. scalar pvalue = chiprob(2, N_SSR)
. dis "N_SSR = " N_SSR
N_SSR = 1.018745
. dis "p-value = " pvalue
p-value = .60087251
. *-Test of overidentifying restrictions (heteroskedastic case, computed by Stata)
. *-1. Run the 2SLS regression with the robust option
. ivregress 2sls lwage $aa (educ = motheduc fatheduc), robust
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 18.61
Prob > chi2 = 0.0003
R-squared = 0.1357
Root MSE = .67155
------------------------------------------------------------------------------
| Robust
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0613966 .0331824 1.85 0.064 -.0036397 .126433
exper | .0441704 .0154736 2.85 0.004 .0138428 .074498
expersq | -.000899 .0004281 -2.10 0.036 -.001738 -.00006
_cons | .0481003 .4277846 0.11 0.910 -.7903421 .8865427
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc
. *-2. Test of overidentifying restrictions
. estat overid
Test of overidentifying restrictions:
Score chi2(1) = .443461 (p = 0.5055)
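Under both homoskedasticity and heteroskedasticity, the tests of overidentifying restrictions fail to reject the null hypothesis, so we find no evidence against the exogeneity of the instruments as a whole (the parents' years of education).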
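As a quick check, the p-values reported in the logs above can be recomputed from the test statistics alone with `chi2tail()`. The statistics below are copied from the output shown earlier; nothing else is assumed. The 5% critical values are displayed for reference.

```stata
* p-values implied by the statistics reported in the logs above
scalar p_sargan = chi2tail(1, .378071)    // Sargan, homoskedastic, q = 1
scalar p_manual = chi2tail(2, 1.018745)   // manual robust test with huseduc, q = 2
scalar p_score  = chi2tail(1, .443461)    // robust score test, q = 1

display "Sargan chi2(1):        p = " p_sargan
display "Manual robust chi2(2): p = " p_manual
display "Robust score chi2(1):  p = " p_score

* 5% critical values of chi2(1) and chi2(2)
display "critical values: " invchi2tail(1, .05) "  " invchi2tail(2, .05)
```

All three statistics lie far below their 5% critical values, which is why none of the tests rejects.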
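We start with a linear model containing a single possibly endogenous variable. For notational clarity, write

$$y_1 = z_1\delta_1 + \alpha_1 y_2 + u_1$$

where $z_1$ is a $1 \times L_1$ subvector of the $1 \times L$ vector of exogenous variables $z$ (including a constant), and the exogenous variables satisfy $E(z'u_1) = 0$.

It is important to remember that this assumption is maintained throughout the section. We also assume that the equation is identified when $E(y_2u_1) \neq 0$, which requires $z$ to contain at least one element that is not in $z_1$.

Hausman (1978) proposed comparing the OLS and 2SLS estimators of $\beta_1 \equiv (\delta_1', \alpha_1)'$ as a formal test of endogeneity: if $y_2$ is uncorrelated with $u_1$, the two estimators should differ only by sampling error.

To derive the regression-based test, write the linear projection of $y_2$ on $z$ in error form:

$$y_2 = z\pi_2 + v_2, \qquad E(z'v_2) = 0$$

where $\pi_2$ is $L \times 1$. Because $u_1$ is uncorrelated with $z$, $y_2$ is endogenous if and only if $E(u_1v_2) \neq 0$. Writing the linear projection of $u_1$ on $v_2$ as $u_1 = \rho_1v_2 + e_1$ and substituting into the structural equation gives

$$y_1 = z_1\delta_1 + \alpha_1 y_2 + \rho_1 v_2 + e_1$$

where $\rho_1 = E(v_2u_1)/E(v_2^2)$.

The key point about this equation is that, by construction, $e_1$ is uncorrelated with $z_1$, $y_2$, and $v_2$, so $H_0: \rho_1 = 0$ can be tested with a standard $t$ test. Since $v_2$ is not observed, it is replaced by the OLS residuals $\hat{v}_2$ from the first-stage regression of $y_2$ on $z$.

The OLS estimates of $\delta_1$ and $\alpha_1$ obtained from the resulting regression are numerically identical to the 2SLS estimates.

The OLS standard errors obtained from that regression are not valid (unless $\rho_1 = 0$). The usual $t$ statistic on $\hat{v}_2$ is nevertheless a valid test of $H_0$ under homoskedasticity, and a heteroskedasticity-robust $t$ statistic can be used otherwise.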
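The regression-based test can be sketched on simulated data as follows. This is an illustration under stated assumptions: the seed, variable names, and coefficients are invented for the sketch, and `y2` is made endogenous by construction. The squared $t$ statistic on the first-stage residual is compared with the Wu-Hausman $F$ that `estat endogenous` leaves in `r(wu)`.

```stata
clear
set seed 20240103
set obs 1000

* Simulated data: u1 is correlated with v2, so y2 is endogenous by construction
generate z1 = rnormal()
generate m  = rnormal()
generate f  = rnormal()
generate v2 = rnormal()
generate u1 = 0.5*v2 + rnormal()
generate y2 = 0.5*m + 0.5*f + 0.3*z1 + v2
generate y1 = 1 + 0.1*y2 + 0.2*z1 + u1

* Stata's built-in endogeneity test after 2SLS
ivregress 2sls y1 z1 (y2 = m f)
estat endogenous
scalar wu_auto = r(wu)

* Regression-based version: first-stage residuals added to the structural equation
regress y2 z1 m f
predict double v2hat, residuals
regress y1 z1 y2 v2hat

* t statistic on v2hat tests H0: rho1 = 0 (y2 exogenous)
scalar t_rho = _b[v2hat]/_se[v2hat]
display "t^2 on v2hat = " t_rho^2 "    Wu-Hausman F = " wu_auto
```

The coefficients on `y2` and `z1` in the last regression reproduce the 2SLS point estimates, which is a convenient check that the augmented regression was set up correctly.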
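In Stata, this test can also be run automatically: after the 2SLS regression, the command estat endog tests whether the instrumented variable is in fact endogenous. We illustrate with an example.
We want to test whether educ is endogenous in the wage equation.
(1) Under homoskedasticity, following the steps above, the Stata commands and results are: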
. *************** Single endogenous variable ***************
. *-Hausman test (homoskedastic case, manual calculation)
. *-1. Run the 2SLS regression
. ivregress 2sls lwage $aa (educ = motheduc fatheduc)
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 24.65
Prob > chi2 = 0.0000
R-squared = 0.1357
Root MSE = .67155
------------------------------------------------------------------------------
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0613966 .0312895 1.96 0.050 .0000704 .1227228
exper | .0441704 .0133696 3.30 0.001 .0179665 .0703742
expersq | -.000899 .0003998 -2.25 0.025 -.0016826 -.0001154
_cons | .0481003 .398453 0.12 0.904 -.7328532 .8290538
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc
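. *-2. Flag the estimation sample: sample_2sls equals 1 for observations used by 2SLS
. gen sample_2sls = e(sample)
. *-3. Estimate the reduced form: endogenous variable on all exogenous variables and instruments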
. reg educ $aa motheduc fatheduc if sample_2sls == 1
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(4, 423) = 28.36
Model | 471.620998 4 117.90525 Prob > F = 0.0000
Residual | 1758.57526 423 4.15738833 R-squared = 0.2115
-------------+---------------------------------- Adj R-squared = 0.2040
Total | 2230.19626 427 5.22294206 Root MSE = 2.039
------------------------------------------------------------------------------
educ | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | .0452254 .0402507 1.12 0.262 -.0338909 .1243417
expersq | -.0010091 .0012033 -0.84 0.402 -.0033744 .0013562
motheduc | .157597 .0358941 4.39 0.000 .087044 .2281501
fatheduc | .1895484 .0337565 5.62 0.000 .1231971 .2558997
_cons | 9.10264 .4265614 21.34 0.000 8.264196 9.941084
------------------------------------------------------------------------------
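. *-4. Compute the residuals of the reduced form
. predict vhat_reducedeq, res
. *-5. Add vhat_reducedeq to the structural equation and run the regression
. *- H0: beta_vhat_reducedeq = 0 (educ is exogenous)
. *- If beta_vhat_reducedeq differs significantly from 0, reject H0: educ is endogenous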
. reg lwage $aa educ vhat_reducedeq if sample_2sls == 1
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(4, 423) = 20.50
Model | 36.2573159 4 9.06432898 Prob > F = 0.0000
Residual | 187.070135 423 .442246183 R-squared = 0.1624
-------------+---------------------------------- Adj R-squared = 0.1544
Total | 223.327451 427 .523015108 Root MSE = .66502
--------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------------+----------------------------------------------------------------
exper | .0441704 .0132394 3.34 0.001 .0181471 .0701937
expersq | -.000899 .0003959 -2.27 0.024 -.0016772 -.0001208
educ | .0613966 .0309849 1.98 0.048 .000493 .1223003
vhat_reducedeq | .0581666 .0348073 1.67 0.095 -.0102501 .1265834
_cons | .0481003 .3945753 0.12 0.903 -.7274721 .8236727
--------------------------------------------------------------------------------
. *-Hausman test (homoskedastic case, computed by Stata)
. *-1. Run the 2SLS regression
. ivregress 2sls lwage $aa (educ = motheduc fatheduc)
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 24.65
Prob > chi2 = 0.0000
R-squared = 0.1357
Root MSE = .67155
------------------------------------------------------------------------------
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0613966 .0312895 1.96 0.050 .0000704 .1227228
exper | .0441704 .0133696 3.30 0.001 .0179665 .0703742
expersq | -.000899 .0003998 -2.25 0.025 -.0016826 -.0001154
_cons | .0481003 .398453 0.12 0.904 -.7328532 .8290538
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc
. *-2. Hausman test
. estat endog
Tests of endogeneity
Ho: variables are exogenous
Durbin (score) chi2(1) = 2.80707 (p = 0.0938)
Wu-Hausman F(1,423) = 2.79259 (p = 0.0954)
(2) Under heteroskedasticity, following the steps above, the Stata commands and results are:
. *-Hausman test (heteroskedastic case, manual calculation)
. *-1. Run the 2SLS regression
. ivregress 2sls lwage $aa (educ = motheduc fatheduc)
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 24.65
Prob > chi2 = 0.0000
R-squared = 0.1357
Root MSE = .67155
------------------------------------------------------------------------------
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0613966 .0312895 1.96 0.050 .0000704 .1227228
exper | .0441704 .0133696 3.30 0.001 .0179665 .0703742
expersq | -.000899 .0003998 -2.25 0.025 -.0016826 -.0001154
_cons | .0481003 .398453 0.12 0.904 -.7328532 .8290538
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc
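. *-2. Flag the estimation sample: sample_2sls equals 1 for observations used by 2SLS
. cap drop sample_2sls
. gen sample_2sls = e(sample)
. *-3. Estimate the reduced form: endogenous variable on all exogenous variables and instruments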
. reg educ $aa motheduc fatheduc if sample_2sls == 1
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(4, 423) = 28.36
Model | 471.620998 4 117.90525 Prob > F = 0.0000
Residual | 1758.57526 423 4.15738833 R-squared = 0.2115
-------------+---------------------------------- Adj R-squared = 0.2040
Total | 2230.19626 427 5.22294206 Root MSE = 2.039
------------------------------------------------------------------------------
educ | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | .0452254 .0402507 1.12 0.262 -.0338909 .1243417
expersq | -.0010091 .0012033 -0.84 0.402 -.0033744 .0013562
motheduc | .157597 .0358941 4.39 0.000 .087044 .2281501
fatheduc | .1895484 .0337565 5.62 0.000 .1231971 .2558997
_cons | 9.10264 .4265614 21.34 0.000 8.264196 9.941084
------------------------------------------------------------------------------
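. *-4. Compute the residuals of the reduced form
. cap drop vhat_reducedeq
. predict vhat_reducedeq, res
. *-5. Add vhat_reducedeq to the structural equation and run the regression
. *- H0: beta_vhat_reducedeq = 0 (educ is exogenous)
. *- Use robust standard errors to compute the t statistic
. *- If beta_vhat_reducedeq differs significantly from 0, reject H0: educ is endogenous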
. reg lwage $aa educ vhat_reducedeq if sample_2sls == 1, robust
Linear regression Number of obs = 428
F(4, 423) = 21.52
Prob > F = 0.0000
R-squared = 0.1624
Root MSE = .66502
--------------------------------------------------------------------------------
| Robust
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------------+----------------------------------------------------------------
exper | .0441704 .0151219 2.92 0.004 .0144469 .0738939
expersq | -.000899 .0004152 -2.16 0.031 -.0017152 -.0000828
educ | .0613966 .0326667 1.88 0.061 -.0028127 .125606
vhat_reducedeq | .0581666 .0364135 1.60 0.111 -.0134073 .1297406
_cons | .0481003 .4221019 0.11 0.909 -.7815781 .8777787
--------------------------------------------------------------------------------
. *-Hausman test (heteroskedastic case, computed by Stata)
. *-1. Run the 2SLS regression with the robust option
. ivregress 2sls lwage $aa (educ = motheduc fatheduc), robust
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(3) = 18.61
Prob > chi2 = 0.0003
R-squared = 0.1357
Root MSE = .67155
------------------------------------------------------------------------------
| Robust
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0613966 .0331824 1.85 0.064 -.0036397 .126433
exper | .0441704 .0154736 2.85 0.004 .0138428 .074498
expersq | -.000899 .0004281 -2.10 0.036 -.001738 -.00006
_cons | .0481003 .4277846 0.11 0.910 -.7903421 .8865427
------------------------------------------------------------------------------
Instrumented: educ
Instruments: exper expersq motheduc fatheduc
. *-2. Hausman test
. estat endog
Tests of endogeneity
Ho: variables are exogenous
Robust score chi2(1) = 2.52857 (p = 0.1118)
Robust regression F(1,423) = 2.55166 (p = 0.1109)
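Under the homoskedasticity assumption, the endogeneity test rejects the null hypothesis at the 10% significance level (Durbin p = 0.0938, Wu-Hausman p = 0.0954), indicating that educ is endogenous at the 10% level. With the heteroskedasticity-robust statistics (p ≈ 0.11), the null cannot be rejected at the 10% level.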
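The manual and automatic results above are linked: squaring the $t$ statistic on vhat_reducedeq reproduces the $F$ statistics reported by estat endog. The sketch below uses only the coefficient and standard errors printed in the logs, so small differences are due to rounding in the displayed output.

```stata
* Coefficient and standard errors of vhat_reducedeq, copied from the logs above
scalar t_homo = .0581666/.0348073    // conventional standard error
scalar t_rob  = .0581666/.0364135    // robust standard error

* Squared t statistics; compare with Wu-Hausman F = 2.79259 and robust regression F = 2.55166
display "t^2 (conventional) = " t_homo^2
display "t^2 (robust)       = " t_rob^2

* p-values from the F(1, 423) distribution
display "p (conventional) = " Ftail(1, 423, t_homo^2)
display "p (robust)       = " Ftail(1, 423, t_rob^2)
```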
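We now extend the regression-based Hausman test to the case of several endogenous variables. Write

$$y_1 = z_1\delta_1 + y_2\alpha_1 + u_1$$

where $y_2$ is now a $1 \times G_1$ vector of possibly endogenous variables. Estimate the reduced form of each element of $y_2$ on $z$ by OLS, collect the residuals in $\hat{v}_2$, run the OLS regression of $y_1$ on $z_1$, $y_2$, and $\hat{v}_2$,
and carry out a standard $F$ test of the joint significance of $\hat{v}_2$ ($G_1$ restrictions).
We can also use a heteroskedasticity-robust version of this joint test.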
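The multi-variable version can be sketched on simulated data as follows. Everything in the block is an assumption made for illustration: `d` plays the role of a binary group indicator, `s` the endogenous regressor, `z` the instrument, and the interactions `ds` and `dz` the second endogenous regressor and its instrument. The sketch runs the joint test with conventional and robust standard errors and then displays the output of `estat endogenous` for comparison.

```stata
clear
set seed 20240104
set obs 2000

* Simulated data (names and coefficients are illustrative assumptions)
generate d  = runiform() < 0.3       // binary group indicator
generate z  = rnormal()              // instrument for s
generate dz = d*z                    // instrument for the interaction d*s
generate v  = rnormal()
generate u  = 0.5*v + rnormal()      // structural error, correlated with v
generate s  = 0.8*z + 0.2*d + v      // endogenous regressor
generate ds = d*s                    // second endogenous regressor
generate y  = 1 + 0.1*s + 0.05*ds - 0.2*d + u

* Reduced forms for both endogenous regressors
regress s d z dz
predict double v1hat, residuals
regress ds d z dz
predict double v2hat, residuals

* Joint F test under homoskedasticity
regress y s ds d v1hat v2hat
test v1hat v2hat
scalar F_homo = r(F)

* Heteroskedasticity-robust joint test
regress y s ds d v1hat v2hat, vce(robust)
test v1hat v2hat

* Built-in test for comparison
ivregress 2sls y d (s ds = z dz)
estat endogenous
display "joint F from the augmented regression = " F_homo
```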
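We use the card.dta data from Case 2. In the wage equation we add the interaction term $black \cdot educ$:

$$\log(wage) = \alpha_1\, educ + \alpha_2\, black \cdot educ + z_1\delta_1 + u_1$$

where $z_1$ contains a constant, exper, expersq, black, and the regional and metropolitan-area indicators listed in the global cc below.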
. *************** Multiple endogenous variables ***************
. use "D:\stata15\ado\personal\IV_2SLS\Data\card.dta", clear
. *-Manual calculation
. *-Covariates setup
. global cc "exper expersq south smsa reg661 reg662 reg663 reg664 reg665 reg666 reg667 reg668 smsa66"
. *-OLS
. reg lwage educ i.black##c.educ $cc
note: educ omitted because of collinearity
Source | SS df MS Number of obs = 3,010
-------------+---------------------------------- F(16, 2993) = 80.83
Model | 178.817017 16 11.1760636 Prob > F = 0.0000
Residual | 413.824594 2,993 .138264148 R-squared = 0.3017
-------------+---------------------------------- Adj R-squared = 0.2980
Total | 592.641611 3,009 .196956335 Root MSE = .37184
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0707788 .0037548 18.85 0.000 .0634165 .0781411
1.black | -.4191076 .0794021 -5.28 0.000 -.5747958 -.2634194
educ | 0 (omitted)
|
black#c.educ |
1 | .0178595 .006271 2.85 0.004 .0055636 .0301554
|
exper | .0821556 .0066828 12.29 0.000 .0690522 .0952589
expersq | -.0021349 .0003207 -6.66 0.000 -.0027638 -.001506
south | -.1441927 .0259827 -5.55 0.000 -.1951384 -.093247
smsa | .1340695 .0200931 6.67 0.000 .0946718 .1734671
reg661 | -.1221745 .0388047 -3.15 0.002 -.1982611 -.046088
reg662 | -.0232881 .0282266 -0.83 0.409 -.0786336 .0320574
reg663 | .0230953 .0273506 0.84 0.399 -.0305325 .0767231
reg664 | -.0666851 .0356556 -1.87 0.062 -.1365971 .0032269
reg665 | .0032644 .03614 0.09 0.928 -.0675974 .0741261
reg666 | .0151249 .0401224 0.38 0.706 -.0635454 .0937952
reg667 | -.0074966 .0394073 -0.19 0.849 -.0847648 .0697716
reg668 | -.1757195 .0462851 -3.80 0.000 -.2664733 -.0849657
smsa66 | .0249824 .0194297 1.29 0.199 -.0131144 .0630793
_cons | 4.80677 .0752604 63.87 0.000 4.659202 4.954337
------------------------------------------------------------------------------
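The OLS results indicate that the return to education for Black individuals is about 1.79 percentage points higher than for non-Black individuals (interaction coefficient 0.0179).
To test whether educ and black·educ are endogenous, we use nearc4 and black·nearc4 as instruments: each endogenous variable is regressed on all exogenous variables and the instruments, and the reduced-form residuals v21 and v22 are added to the structural equation.
We then carry out the joint F test of v21 and v22.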
. *-Regress each endogenous variable on all exogenous variables, including the instruments, by OLS and save the residuals v21 and v22
. reg educ $cc nearc4 i.black##i.nearc4
note: 1.nearc4 omitted because of collinearity
Source | SS df MS Number of obs = 3,010
-------------+---------------------------------- F(16, 2993) = 170.69
Model | 10287.619 16 642.976186 Prob > F = 0.0000
Residual | 11274.4611 2,993 3.76694323 R-squared = 0.4771
-------------+---------------------------------- Adj R-squared = 0.4743
Total | 21562.0801 3,009 7.16586243 Root MSE = 1.9409
------------------------------------------------------------------------------
educ | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | -.4125542 .033728 -12.23 0.000 -.4786866 -.3464218
expersq | .0008699 .0016525 0.53 0.599 -.0023703 .0041101
south | -.0517208 .1356037 -0.38 0.703 -.3176067 .2141651
smsa | .4021227 .104889 3.83 0.000 .1964609 .6077845
reg661 | -.2102379 .2025002 -1.04 0.299 -.6072915 .1868158
reg662 | -.2888672 .1473834 -1.96 0.050 -.5778502 .0001158
reg663 | -.2382962 .1427517 -1.67 0.095 -.5181975 .0416051
reg664 | -.0932447 .1862439 -0.50 0.617 -.4584237 .2719343
reg665 | -.4828321 .1882474 -2.56 0.010 -.8519394 -.1137248
reg666 | -.5129027 .2099523 -2.44 0.015 -.924568 -.1012373
reg667 | -.427108 .2056584 -2.08 0.038 -.8303541 -.023862
reg668 | .3135707 .2417323 1.30 0.195 -.1604075 .787549
smsa66 | .0254418 .1058119 0.24 0.810 -.1820295 .2329132
nearc4 | .3191761 .0978211 3.26 0.001 .1273726 .5109796
1.black | -.9374537 .147931 -6.34 0.000 -1.22751 -.6473969
1.nearc4 | 0 (omitted)
|
black#nearc4 |
1 1 | .0029741 .1767953 0.02 0.987 -.3436786 .3496267
|
_cons | 16.8492 .2149486 78.39 0.000 16.42774 17.27066
------------------------------------------------------------------------------
. predict v21, res
. gen black_educ = black*educ // interaction of black and educ
. reg black_educ $cc nearc4 i.black##i.nearc4
note: 1.nearc4 omitted because of collinearity
Source | SS df MS Number of obs = 3,010
-------------+---------------------------------- F(16, 2993) = 3680.14
Model | 77916.1435 16 4869.75897 Prob > F = 0.0000
Residual | 3960.4957 2,993 1.32325282 R-squared = 0.9516
-------------+---------------------------------- Adj R-squared = 0.9514
Total | 81876.6392 3,009 27.2105813 Root MSE = 1.1503
------------------------------------------------------------------------------
black_educ | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | .0533248 .0199902 2.67 0.008 .0141289 .0925207
expersq | -.007937 .0009794 -8.10 0.000 -.0098574 -.0060166
south | -.252799 .0803708 -3.15 0.002 -.4103867 -.0952114
smsa | .1952868 .0621665 3.14 0.002 .0733934 .3171803
reg661 | .162124 .1200196 1.35 0.177 -.0732053 .3974534
reg662 | .0056958 .0873525 0.07 0.948 -.1655812 .1769729
reg663 | .0860648 .0846073 1.02 0.309 -.0798296 .2519592
reg664 | .113297 .1103847 1.03 0.305 -.1031406 .3297345
reg665 | .2615297 .1115721 2.34 0.019 .0427638 .4802956
reg666 | .3347247 .1244364 2.69 0.007 .0907352 .5787143
reg667 | .2962538 .1218915 2.43 0.015 .0572543 .5352533
reg668 | .0995837 .1432721 0.70 0.487 -.181338 .3805054
smsa66 | .0469365 .0627135 0.75 0.454 -.0760295 .1699025
nearc4 | -.0908895 .0579775 -1.57 0.117 -.2045693 .0227903
1.black | 11.5499 .0876771 131.73 0.000 11.37799 11.72182
1.nearc4 | 0 (omitted)
|
black#nearc4 |
1 1 | .874705 .1047846 8.35 0.000 .6692478 1.080162
|
_cons | .0948535 .1273977 0.74 0.457 -.1549425 .3446494
------------------------------------------------------------------------------
. predict v22, res
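. *-Homoskedastic case
. *-Add the residuals v21 and v22 to the structural equation and estimate by OLS; store the results as m1
. *-Estimate the structural equation by OLS without v21 and v22; store the results as m2
. *-Use ftest for the F test of the joint significance of v21 and v22
. *-ftest (a user-written command, not part of official Stata) is only appropriate after regressions with the default vce
. *-If the F test rejects, the coefficients on v21 and v22 are jointly different from 0: educ and educ*black are endogenous
. *-If the F test does not reject, we cannot reject that the coefficients on v21 and v22 are jointly 0: no evidence that educ and educ*black are endogenous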
. reg lwage educ i.black##c.educ $cc v21 v22
note: educ omitted because of collinearity
Source | SS df MS Number of obs = 3,010
-------------+---------------------------------- F(18, 2991) = 71.89
Model | 178.967071 18 9.94261506 Prob > F = 0.0000
Residual | 413.67454 2,991 .138306433 R-squared = 0.3020
-------------+---------------------------------- Adj R-squared = 0.2978
Total | 592.641611 3,009 .196956335 Root MSE = .3719
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .1273556 .0547317 2.33 0.020 .02004 .2346712
1.black | -.282765 .4866263 -0.58 0.561 -1.236921 .6713912
educ | 0 (omitted)
|
black#c.educ |
1 | .0109036 .0387795 0.28 0.779 -.0651337 .0869408
|
exper | .1059116 .0241963 4.38 0.000 .0584685 .1533547
expersq | -.0022406 .0004635 -4.83 0.000 -.0031493 -.0013318
south | -.1424762 .0272675 -5.23 0.000 -.1959412 -.0890112
smsa | .1111556 .0304028 3.66 0.000 .0515431 .1707681
reg661 | -.1103479 .0410557 -2.69 0.007 -.1908481 -.0298477
reg662 | -.0081783 .0317789 -0.26 0.797 -.070489 .0541325
reg663 | .0382414 .0314436 1.22 0.224 -.0234119 .0998946
reg664 | -.0600379 .0368007 -1.63 0.103 -.1321951 .0121194
reg665 | .0337805 .0479745 0.70 0.481 -.060286 .1278469
reg666 | .0498975 .0537534 0.93 0.353 -.0554998 .1552948
reg667 | .0216942 .0501526 0.43 0.665 -.0766428 .1200312
reg668 | -.1908353 .0485659 -3.93 0.000 -.2860613 -.0956092
smsa66 | .0180009 .0207769 0.87 0.386 -.0227375 .0587393
v21 | -.0568274 .0548612 -1.04 0.300 -.1643969 .0507422
v22 | .0070106 .0392971 0.18 0.858 -.0700415 .0840627
_cons | 3.844991 .9314527 4.13 0.000 2.018638 5.671343
------------------------------------------------------------------------------
. est store m1
. reg lwage educ i.black##c.educ $cc
note: educ omitted because of collinearity
Source | SS df MS Number of obs = 3,010
-------------+---------------------------------- F(16, 2993) = 80.83
Model | 178.817017 16 11.1760636 Prob > F = 0.0000
Residual | 413.824594 2,993 .138264148 R-squared = 0.3017
-------------+---------------------------------- Adj R-squared = 0.2980
Total | 592.641611 3,009 .196956335 Root MSE = .37184
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0707788 .0037548 18.85 0.000 .0634165 .0781411
1.black | -.4191076 .0794021 -5.28 0.000 -.5747958 -.2634194
educ | 0 (omitted)
|
black#c.educ |
1 | .0178595 .006271 2.85 0.004 .0055636 .0301554
|
exper | .0821556 .0066828 12.29 0.000 .0690522 .0952589
expersq | -.0021349 .0003207 -6.66 0.000 -.0027638 -.001506
south | -.1441927 .0259827 -5.55 0.000 -.1951384 -.093247
smsa | .1340695 .0200931 6.67 0.000 .0946718 .1734671
reg661 | -.1221745 .0388047 -3.15 0.002 -.1982611 -.046088
reg662 | -.0232881 .0282266 -0.83 0.409 -.0786336 .0320574
reg663 | .0230953 .0273506 0.84 0.399 -.0305325 .0767231
reg664 | -.0666851 .0356556 -1.87 0.062 -.1365971 .0032269
reg665 | .0032644 .03614 0.09 0.928 -.0675974 .0741261
reg666 | .0151249 .0401224 0.38 0.706 -.0635454 .0937952
reg667 | -.0074966 .0394073 -0.19 0.849 -.0847648 .0697716
reg668 | -.1757195 .0462851 -3.80 0.000 -.2664733 -.0849657
smsa66 | .0249824 .0194297 1.29 0.199 -.0131144 .0630793
_cons | 4.80677 .0752604 63.87 0.000 4.659202 4.954337
------------------------------------------------------------------------------
. est store m2
. ftest m1 m2
Assumption: m2 nested in m1
F( 2, 2991) = 0.54
prob > F = 0.5814
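The F statistic reported by ftest can be reproduced from the residual sums of squares of the two stored models. The sketch below uses only numbers printed in the logs above (the Residual SS of m1 and m2 and the residual degrees of freedom of m1).

```stata
* Residual sums of squares copied from the regression output above
scalar rss_r = 413.824594    // restricted model m2 (without v21 and v22)
scalar rss_u = 413.67454     // unrestricted model m1 (with v21 and v22)
scalar q     = 2             // number of restrictions
scalar df_u  = 2991          // residual degrees of freedom of m1

* F = [(RSS_r - RSS_u)/q] / [RSS_u/df_u]
scalar F = ((rss_r - rss_u)/q)/(rss_u/df_u)
display "F(2, 2991) = " F
display "p-value    = " Ftail(q, df_u, F)
```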
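. *-Heteroskedastic case
. *-Add the residuals v21 and v22 to the structural equation and estimate by OLS with the robust option
. *-test serves the same purpose as ftest but can be used after a regression with any vce option
. *-Use test for the joint F test of v21 and v22
. *-If the F test rejects, the coefficients on v21 and v22 are jointly different from 0: educ and educ*black are endogenous
. *-If the F test does not reject, we cannot reject that the coefficients on v21 and v22 are jointly 0: no evidence that educ and educ*black are endogenous
. *-NOTE: the global $ww expands to nothing here (the output below shows no control variables),
. *- so the controls in $cc are dropped and the result is not comparable with the homoskedastic test.
. *- Replace $ww with $cc to run the intended regression.
. reg lwage educ i.black##c.educ $ww v21 v22, robust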
note: educ omitted because of collinearity
Linear regression Number of obs = 3,010
F(5, 3004) = 166.95
Prob > F = 0.0000
R-squared = 0.2174
Root MSE = .39292
------------------------------------------------------------------------------
| Robust
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | -.0598108 .0081385 -7.35 0.000 -.0757685 -.0438531
1.black | -3.005263 .3148406 -9.55 0.000 -3.622588 -2.387938
educ | 0 (omitted)
|
black#c.educ |
1 | .2162143 .0253083 8.54 0.000 .1665909 .2658377
|
v21 | .130339 .0091265 14.28 0.000 .1124442 .1482339
v22 | -.1983001 .02635 -7.53 0.000 -.249966 -.1466343
_cons | 7.153203 .1111008 64.38 0.000 6.935362 7.371044
------------------------------------------------------------------------------
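. test v21 v22
( 1) v21 = 0
( 2) v22 = 0
F( 2, 3004) = 117.35
Prob > F = 0.0000
Note: the regression above reports F(5, 3004) and lists only educ, black, black#c.educ, v21 and v22, so the controls in $ww were not included (the macro was evidently empty when the command ran). This F statistic therefore does not test the same specification as the homoskedastic ftest above (F(2, 2991) = 0.54), and it is not comparable with the estat endog results reported below, which do not reject exogeneity.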
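A minimal sketch of the robust regression-based test with the control variables written out explicitly instead of passed through a global macro. It assumes the Card data used in this example are already in memory; the control list is copied from the instrument list in the ivregress output, while the names `be`, `bn`, `v1hat` and `v2hat` are hypothetical and chosen only to avoid clashing with existing variables. The specification follows the ivregress command (no main effect of black), which is an assumption about what the manual test is meant to reproduce.

```stata
* Assumes the Card data (lwage educ black nearc4 exper ... smsa66) are in memory.
* Hypothetical helper variables: interaction terms under new names.
generate double be = black*educ      // endogenous interaction
generate double bn = black*nearc4    // instrument for the interaction

* Controls written out in full so that an empty macro cannot drop them silently.
local ctrl exper expersq south smsa reg661 reg662 reg663 reg664 ///
           reg665 reg666 reg667 reg668 smsa66

* First stages: regress each endogenous variable on all exogenous variables.
regress educ nearc4 bn `ctrl'
predict double v1hat, residuals
regress be nearc4 bn `ctrl'
predict double v2hat, residuals

* Augmented structural equation with heteroskedasticity-robust standard errors.
regress lwage educ be `ctrl' v1hat v2hat, vce(robust)

* Joint test of the two residuals; rejection indicates endogeneity.
test v1hat v2hat
```

If the controls are included in this way, the joint test should be comparable with the robust regression-based statistic that `estat endog` reports after `ivregress ..., robust`.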
. *-Computed automatically by Stata
. gen black_nearc4 = black*nearc4
. *-Homoskedastic case
. ivregress 2sls lwage $cc (educ black_educ = nearc4 black_nearc4)
Instrumental variables (2SLS) regression Number of obs = 3,010
Wald chi2(15) = 677.65
Prob > chi2 = 0.0000
R-squared = 0.2336
Root MSE = .38846
------------------------------------------------------------------------------
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .1346034 .0537559 2.50 0.012 .0292437 .2399631
black_educ | -.0111829 .0043351 -2.58 0.010 -.0196794 -.0026863
exper | .110093 .022998 4.79 0.000 .0650178 .1551682
expersq | -.0024288 .0003313 -7.33 0.000 -.0030782 -.0017795
south | -.1465937 .0274104 -5.35 0.000 -.2003172 -.0928702
smsa | .1123811 .0318409 3.53 0.000 .0499741 .1747882
reg661 | -.1051369 .0416016 -2.53 0.011 -.1866746 -.0235993
reg662 | -.0062918 .0327826 -0.19 0.848 -.0705446 .057961
reg663 | .0419279 .0315592 1.33 0.184 -.0199269 .1037827
reg664 | -.0557743 .0375173 -1.49 0.137 -.1293069 .0177582
reg665 | .0391725 .0471075 0.83 0.406 -.0531565 .1315014
reg666 | .0558905 .0528237 1.06 0.290 -.0476421 .1594231
reg667 | .0283201 .0487958 0.58 0.562 -.0673178 .1239581
reg668 | -.190215 .05078 -3.75 0.000 -.289742 -.090688
smsa66 | .019601 .021732 0.90 0.367 -.022993 .062195
_cons | 3.721227 .9143942 4.07 0.000 1.929047 5.513406
------------------------------------------------------------------------------
Instrumented: educ black_educ
Instruments: exper expersq south smsa reg661 reg662 reg663 reg664 reg665
reg666 reg667 reg668 smsa66 nearc4 black_nearc4
. estat endog
Tests of endogeneity
Ho: variables are exogenous
Durbin (score) chi2(2) = 1.75065 (p = 0.4167)
Wu-Hausman F(2,2992) = .870599 (p = 0.4188)
. *-Heteroskedastic case
. ivregress 2sls lwage $cc (educ black_educ = nearc4 black_nearc4), robust
Instrumental variables (2SLS) regression Number of obs = 3,010
Wald chi2(15) = 752.86
Prob > chi2 = 0.0000
R-squared = 0.2336
Root MSE = .38846
------------------------------------------------------------------------------
| Robust
lwage | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .1346034 .0530173 2.54 0.011 .0306913 .2385155
black_educ | -.0111829 .0042026 -2.66 0.008 -.0194199 -.0029459
exper | .110093 .0227831 4.83 0.000 .0654388 .1547471
expersq | -.0024288 .0003479 -6.98 0.000 -.0031107 -.001747
south | -.1465937 .0290684 -5.04 0.000 -.2035668 -.0896207
smsa | .1123811 .0313138 3.59 0.000 .0510072 .173755
reg661 | -.1051369 .0409012 -2.57 0.010 -.1853019 -.024972
reg662 | -.0062918 .0337161 -0.19 0.852 -.0723742 .0597906
reg663 | .0419279 .0324651 1.29 0.197 -.0217024 .1055583
reg664 | -.0557743 .0392737 -1.42 0.156 -.1327493 .0212007
reg665 | .0391725 .049584 0.79 0.430 -.0580105 .1363554
reg666 | .0558905 .0526598 1.06 0.289 -.0473208 .1591019
reg667 | .0283201 .0499725 0.57 0.571 -.0696242 .1262645
reg668 | -.190215 .0509013 -3.74 0.000 -.2899798 -.0904503
smsa66 | .019601 .0206926 0.95 0.344 -.0209557 .0601577
_cons | 3.721227 .9006163 4.13 0.000 1.956051 5.486402
------------------------------------------------------------------------------
Instrumented: educ black_educ
Instruments: exper expersq south smsa reg661 reg662 reg663 reg664 reg665
reg666 reg667 reg668 smsa66 nearc4 black_nearc4
. estat endog
Tests of endogeneity
Ho: variables are exogenous
Robust score chi2(2) = 1.79498 (p = 0.4076)
Robust regression F(2,2992) = .892218 (p = 0.4099)
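(1) Argue on theoretical grounds whether an endogeneity problem exists; if it does, explain where it comes from.
(2) Drawing on the earlier literature and your own analysis, choose suitable instruments (this is not easy).
(3) Run an endogeneity test to confirm that endogeneity is present; the result depends on the instruments you chose.
(4) Run an overidentification test to check the validity of the instruments.
(5) After step (4), you may need to rerun step (3).
(6) Run the first-stage regression to check for weak instruments.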
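Steps (3), (4) and (6) map onto Stata's postestimation commands for `ivregress`. The sketch below uses the overidentified Mroz specification from Case 1 and assumes that mroz.dta is already in memory; the controls `exper` and `expersq` are an assumption about what the post's `$aa` macro contains.

```stata
* Assumes mroz.dta (Case 1) is already in memory.
* 2SLS with educ instrumented by motheduc and fatheduc (one overidentifying restriction).
* exper and expersq are assumed controls, standing in for the post's $aa macro.
ivregress 2sls lwage exper expersq (educ = motheduc fatheduc)

* Step (3): endogeneity test, H0: educ is exogenous.
estat endogenous

* Step (4): overidentification test, H0: the instruments are valid.
* Only available when there are more instruments than endogenous regressors.
estat overid

* Step (6): first-stage statistics for judging instrument strength.
estat firststage
```

Adding `vce(robust)` to the `ivregress` call makes `estat endogenous` and `estat overid` report their heteroskedasticity-robust versions.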