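Author: Haiming Wen (文海铭), Guangxi University
Email: hming_wen@sina.com
Editor's note: This article is largely a translated excerpt of the post cited below, with thanks to its author.
Source: Using the margins command with different functional forms: Proportional versus natural logarithm changes.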
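The margins command computes marginal predictions and marginal effects, and with the expression() option it can work with any functional form of the estimated parameters. In this article, we show how to compute, in both linear and nonlinear models, the proportional change in the outcome caused by a change in a covariate (a semielasticity).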
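Suppose we have fit a linear model with two covariates:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$

We can use margins, eydx(x) to estimate the proportional change in $y$ for a one-unit change in a covariate $x$. For a continuous covariate, this is $\frac{1}{\hat y}\frac{\partial \hat y}{\partial x}$.
If $x$ is binary, the proportional change from $x = 0$ to $x = 1$ is

$$\frac{\hat y_1 - \hat y_0}{\hat y_0}$$

where $\hat y_1$ and $\hat y_0$ are the predictions with $x$ set to 1 and 0. However, this is not the formula that margins, eydx(x) uses. Instead, margins reports the discrete change in natural logarithms:

$$\ln \hat y_1 - \ln \hat y_0$$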
For example, we regress infant birthweight (bwt) on the mother's age (age) and an indicator for the presence of uterine irritability (ui), where age is continuous and ui is binary.
. webuse lbw, clear
. regress bwt age i.ui
Source | SS df MS Number of obs = 189
-------------+---------------------------------- F(2, 186) = 8.63
Model | 8484309.72 2 4242154.86 Prob > F = 0.0003
Residual | 91430988.9 186 491564.456 R-squared = 0.0849
-------------+---------------------------------- Adj R-squared = 0.0751
Total | 99915298.6 188 531464.354 Root MSE = 701.12
------------------------------------------------------------------------------
bwt | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
age | 9.439 9.678 0.98 0.331 -9.653 28.531
1.ui | -569.193 143.966 -3.95 0.000 -853.208 -285.177
_cons | 2809.271 233.138 12.05 0.000 2349.337 3269.205
------------------------------------------------------------------------------
. margins, eydx(age ui)
Average marginal effects Number of obs = 189
Model VCE: OLS
Expression: Linear prediction, predict()
ey/dx wrt: age 1.ui
------------------------------------------------------------------------------
| Delta-method
| ey/dx std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
age | 0.003 0.003 0.97 0.331 -0.003 0.010
1.ui | -0.208 0.057 -3.65 0.000 -0.321 -0.096
------------------------------------------------------------------------------
Note: ey/dx for factor levels is the discrete change from the base level.
We can also compute by hand the proportional change in bwt associated with age, as follows:
. gen propage = _b[age]/(_b[_cons] + _b[age]*age + _b[1.ui]*ui)
. sum propage
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
propage | 189 .003225 .0002682 .0029186 .0039631
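As a cross-check, the same average can be requested from margins itself by passing the ratio to expression(). This is a minimal sketch, assuming the lbw data and the regression above; the matrix and scalar names (b_eydx, b_expr, s_eydx, s_expr) are illustrative and do not come from the original post.

```stata
webuse lbw, clear
quietly regress bwt age i.ui

* Semielasticity of bwt with respect to age, as reported by margins
quietly margins, eydx(age)
matrix b_eydx = r(b)
scalar s_eydx = b_eydx[1,1]

* Same quantity written out: coefficient on age over the linear prediction
quietly margins, expression(_b[age]/predict(xb))
matrix b_expr = r(b)
scalar s_expr = b_expr[1,1]

display "eydx(age) = " s_eydx "   expression() = " s_expr
```

Both numbers should agree with the mean of propage, since each averages the coefficient on age divided by the fitted value over the 189 observations.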
The change in bwt associated with ui is computed as follows:
. preserve
. replace ui = 0
. predict bwthat0
. replace ui = 1
. predict bwthat1
. gen propui = (bwthat1 - bwthat0)/bwthat0 // simple proportional change: not what margins, eydx() reports
. gen lnbwt0 = ln(bwthat0)
. gen lnbwt1 = ln(bwthat1)
. gen propui2 = lnbwt1 - lnbwt0 // log difference: matches margins, eydx()
. sum propui propui2
. restore
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
propui | 189 -.187989 .0030706 -.1935099 -.1760018
propui2 | 189 -.2082484 .0037771 -.2150636 -.1935868
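The two measures are linked observation by observation: if p is the simple proportional change, the log difference is ln(1 + p). A quick check plugs the reported means into that identity. Treat it as an approximation, because the identity holds for each observation rather than for the averages.

```stata
* Log difference implied by the mean proportional change
display ln(1 + (-.187989))

* Proportional change implied by the mean log difference
display exp(-.2082484) - 1
```

The results, roughly -0.208 and -0.188, land close to the reported means of propui2 and propui.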
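margins, eydx() uses the logarithmic approach because it has better numerical properties. It also agrees with the case where the outcome already enters the model in logs, for example

$$\ln y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$

in which case the coefficient on a binary covariate is itself a difference in logs: $\beta_2 = \ln \hat y_1 - \ln \hat y_0$.
When the simple proportional change is wanted instead, it can be written out in expression(). Next, I replace ui in the model with the race variable race, where race = 1 denotes White, race = 2 Black, and race = 3 other races.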
. reg bwt age i.race
Source | SS df MS Number of obs = 189
-------------+---------------------------------- F(3, 185) = 3.41
Model | 5239114.38 3 1746371.46 Prob > F = 0.0186
Residual | 94676184.2 185 511763.158 R-squared = 0.0524
-------------+---------------------------------- Adj R-squared = 0.0371
Total | 99915298.6 188 531464.354 Root MSE = 715.38
------------------------------------------------------------------------------
bwt | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
age | 6.147 10.069 0.61 0.542 -13.717 26.011
race |
Black | -366.394 160.569 -2.28 0.024 -683.176 -49.612
Other | -287.294 115.484 -2.49 0.014 -515.128 -59.460
_cons | 2953.688 255.247 11.57 0.000 2450.119 3457.257
------------------------------------------------------------------------------
. margins, expression((_b[2.race]*2.race + _b[3.race]*3.race)/(_b[_cons]+_b[age]*age)) dydx(race)
Average marginal effects Number of obs = 189
Model VCE: OLS
Expression: (_b[2.race]*2.race + _b[3.race]*3.race)/(_b[_cons]+_b[age]*age)
dy/dx wrt: 2.race 3.race
------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
race |
Black | -0.118 0.051 -2.34 0.019 -0.217 -0.019
Other | -0.093 0.036 -2.58 0.010 -0.163 -0.022
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
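To see what dydx(race) returns here, it helps to write the expression as a function of race. The notation below ($g$ for the expression, $\hat\beta$ for the estimated coefficients) is ours, a sketch of the arithmetic rather than a formula taken from the original post:

$$
g(\text{race}) = \frac{\hat\beta_{\text{Black}}\,1[\text{race}=2] + \hat\beta_{\text{Other}}\,1[\text{race}=3]}{\hat\beta_0 + \hat\beta_{\text{age}}\,\text{age}}
$$

$$
g(2) - g(1) = \frac{\hat\beta_{\text{Black}}}{\hat\beta_0 + \hat\beta_{\text{age}}\,\text{age}},
\qquad
g(3) - g(1) = \frac{\hat\beta_{\text{Other}}}{\hat\beta_0 + \hat\beta_{\text{age}}\,\text{age}}
$$

Because $g(1) = 0$ and the denominator is the prediction at the base level (White), each discrete change is the proportional change relative to the base prediction, and margins averages it over the sample.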
We can also reproduce these numbers by hand, as follows:
. gen double propblack = _b[2.race]/(_b[_cons] + _b[age]*age)
. gen double propother = _b[3.race]/(_b[_cons] + _b[age]*age)
. sum propblack propother
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
propblack | 189 -.1183368 .0012359 -.1205344 -.1134239
propother | 189 -.0927893 .0009691 -.0945124 -.0889371
The same calculation extends to nonlinear models; the main difference is the estimation command. Below, we fit models with probit and poisson.
. probit low age i.race
Probit regression Number of obs = 189
LR chi2(3) = 6.62
Prob > chi2 = 0.0850
Log likelihood = -114.02581 Pseudo R2 = 0.0282
------------------------------------------------------------------------------
low | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
age | -0.025 0.020 -1.26 0.209 -0.063 0.014
race |
Black | 0.454 0.288 1.58 0.114 -0.110 1.019
Other | 0.343 0.213 1.61 0.107 -0.074 0.760
_cons | -0.120 0.486 -0.25 0.805 -1.072 0.832
------------------------------------------------------------------------------
. margins, expression((normal(_b[_cons]+_b[age]*age+_b[2.race]*2.race + _b[3.race]*3.race) ///
> -normal(_b[_cons] + _b[age]*age))/normal(_b[_cons] + _b[age]*age)) dydx(race)
Average marginal effects Number of obs = 189
Model VCE: OIM
Expression: (normal(_b[_cons]+_b[age]*age+_b[2.race]*2.race + _b[3.race]*3.race)
-normal(_b[_cons] + _b[age]*age))/normal(_b[_cons] + _b[age]*age)
dy/dx wrt: 2.race 3.race
------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
race |
Black | 0.663 0.502 1.32 0.187 -0.321 1.646
Other | 0.487 0.365 1.33 0.182 -0.229 1.203
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
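In the probit regression, we model the probability of a low-birthweight baby using the mother's age and race category. The output gives the proportional change in the predicted probability for Black versus White mothers and for mothers of other races versus White mothers.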
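As in the linear case, the probit result can be reproduced by hand. This is a minimal sketch, assuming the same data and model; the variable names p_white, prop_black, and prop_other are illustrative and not from the original post.

```stata
webuse lbw, clear
quietly probit low age i.race

* Predicted probability with race set to the base level (White)
gen double p_white = normal(_b[_cons] + _b[age]*age)

* Proportional change in the predicted probability relative to the base level
gen double prop_black = (normal(_b[_cons] + _b[age]*age + _b[2.race]) - p_white)/p_white
gen double prop_other = (normal(_b[_cons] + _b[age]*age + _b[3.race]) - p_white)/p_white

summarize prop_black prop_other
```

The two means should match the Black and Other rows of the margins table above (0.663 and 0.487), since margins averages the same discrete change over all observations.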
. webuse dollhill3, clear
. poisson deaths smokes i.agecat, exposure(pyears)
Poisson regression Number of obs = 10
LR chi2(5) = 922.93
Prob > chi2 = 0.0000
Log likelihood = -33.600153 Pseudo R2 = 0.9321
------------------------------------------------------------------------------
deaths | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
smokes | 0.355 0.107 3.30 0.001 0.144 0.565
agecat |
45–54 | 1.484 0.195 7.61 0.000 1.102 1.866
55–64 | 2.628 0.184 14.30 0.000 2.267 2.988
65–74 | 3.350 0.185 18.13 0.000 2.988 3.713
75–84 | 3.700 0.192 19.25 0.000 3.323 4.077
_cons | -7.919 0.192 -41.30 0.000 -8.295 -7.543
ln(pyears) | 1.000 (exposure)
------------------------------------------------------------------------------
. margins, expression((exp(_b[smokes]*smokes+_b[2.agecat]*2.agecat+_b[3.agecat]*3.agecat ///
> +_b[4.agecat]*4.agecat+_b[5.agecat]*5.agecat+_b[_cons])*pyears ///
> -exp(_b[smokes]*smokes+_b[_cons])*pyears)/(exp(_b[smokes]*smokes+_b[_cons])*pyears)) ///
> dydx(agecat)
Average marginal effects Number of obs = 10
Model VCE: OIM
Expression: (exp(_b[smokes]*smokes+_b[2.agecat]*2.agecat+_b[3.agecat]*3.agecat
+_b[4.agecat]*4.agecat+_b[5.agecat]*5.agecat+_b[_cons])*pyears
-exp(_b[smokes]*smokes+_b[_cons])*pyears)/(exp(_b[smokes]*smokes+_b[_cons])*pyears)
dy/dx wrt: 2.agecat 3.agecat 4.agecat 5.agecat
------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
agecat |
45–54 | 3.411 0.861 3.96 0.000 1.724 5.097
55–64 | 12.839 2.543 5.05 0.000 7.856 17.823
65–74 | 27.517 5.270 5.22 0.000 17.188 37.846
75–84 | 39.451 7.776 5.07 0.000 24.211 54.691
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
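In the Poisson regression, the number of deaths is explained by smoking status and age category. The proportional change in the predicted number of deaths is computed by comparing each age group with the base category. margins operates on predictions of the outcome; after an OLS regression, the prediction is the linear prediction $x\hat\beta$. margins computes the prediction for each observation and reports the average as the predictive margin.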
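In the Poisson case the long expression simplifies: the exposure and the smokes term cancel in the ratio, leaving exp(b) - 1 for the coefficient b of each age category. A minimal sketch, assuming the model above; the equation labels (a45_54 and so on) are illustrative.

```stata
webuse dollhill3, clear
quietly poisson deaths smokes i.agecat, exposure(pyears)

* exp(b) - 1 is the proportional change relative to the base age category
nlcom (a45_54: exp(_b[2.agecat]) - 1) ///
      (a55_64: exp(_b[3.agecat]) - 1) ///
      (a65_74: exp(_b[4.agecat]) - 1) ///
      (a75_84: exp(_b[5.agecat]) - 1)
```

The point estimates should match the margins table above up to rounding. The expression() route is still the more general one, because this shortcut relies on the exponential mean function.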
The functional form inside margins, expression() can be specified as flexibly as needed. Let us return to the probit regression.
. webuse lbw, clear
. probit low c.age##i.race
Probit regression Number of obs = 189
LR chi2(5) = 8.16
Prob > chi2 = 0.1475
Log likelihood = -113.25455 Pseudo R2 = 0.0348
------------------------------------------------------------------------------
low | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
age | -0.033 0.027 -1.24 0.215 -0.086 0.019
race |
Black | -0.971 1.282 -0.76 0.449 -3.483 1.541
Other | 0.433 1.044 0.41 0.678 -1.613 2.478
race#c.age |
Black | 0.065 0.056 1.15 0.249 -0.046 0.176
Other | -0.005 0.045 -0.10 0.917 -0.093 0.083
_cons | 0.090 0.652 0.14 0.891 -1.189 1.368
------------------------------------------------------------------------------
. margins, eydx(age)
Average marginal effects Number of obs = 189
Model VCE: OIM
Expression: Pr(low), predict()
ey/dx wrt: age
------------------------------------------------------------------------------
| Delta-method
| ey/dx std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
age | -0.032 0.023 -1.36 0.174 -0.078 0.014
------------------------------------------------------------------------------
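The functional form behind margins, eydx(age) is:

$$
\frac{\partial \ln \Phi(x\hat\beta)}{\partial\,\text{age}} = \left(\hat\beta_{\text{age}} + \hat\beta_{2.\text{race}\#\text{age}}\,1[\text{race}=2] + \hat\beta_{3.\text{race}\#\text{age}}\,1[\text{race}=3]\right)\frac{\phi(x\hat\beta)}{\Phi(x\hat\beta)}
$$

where $\phi$ and $\Phi$ are the standard normal density and cumulative distribution function. Inside expression(), quantities such as the linear prediction do not need to be written out term by term. The proportional change above can therefore be computed as follows: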
. margins, expression((_b[age] + _b[2.race#c.age]*2.race + _b[3.race#c.age]* ///
> 3.race)*normalden(predict(xb)) / normal(predict(xb)))
Predictive margins Number of obs = 189
Model VCE: OIM
Expression: (_b[age] + _b[2.race#c.age]*2.race + _b[3.race#c.age]*
3.race)*normalden(predict(xb))/normal(predict(xb))
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
_cons | -0.032 0.023 -1.36 0.174 -0.078 0.014
------------------------------------------------------------------------------
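Here predict(xb) is shorthand for the linear prediction. In addition, normal(predict(xb)) can be replaced by predict(pr), where pr denotes the probability of a positive outcome. The expression() option of margins lets us build any function of the estimated parameters and hand it back to margins. We can then use all of its computational tools and graph the results with marginsplot.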
. quietly margins r.race, expression((_b[age] + _b[2.race#c.age]*2.race ///
> + _b[3.race#c.age]*3.race)*normalden(predict(xb))/normal(predict(xb))) ///
> at(age=(14(5)50))
. quietly marginsplot, noci ytitle("expression in -margins-")
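As noted above, normal(predict(xb)) can be swapped for predict(pr). This is a minimal sketch of that substitution, assuming the interaction probit model from this section; the matrix names b_eydx and b_pr are illustrative.

```stata
webuse lbw, clear
quietly probit low c.age##i.race

* Reference value from the built-in semielasticity
quietly margins, eydx(age)
matrix b_eydx = r(b)

* Same expression as before, with predict(pr) in place of normal(predict(xb))
margins, expression((_b[age] + _b[2.race#c.age]*2.race + _b[3.race#c.age]*3.race) ///
    *normalden(predict(xb))/predict(pr))
matrix b_pr = r(b)

display "eydx(age) = " b_eydx[1,1] "   with predict(pr) = " b_pr[1,1]
```

Both values should agree with the -0.032 reported earlier.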