Stata: Building Table 1 for a Paper with table1_mc

Published: 2022-08-27


Author: Hao Jiang (East China Normal University)
Email: HaoJiang0204@outlook.com



1. Command overview

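table1_mc is a user-written command by Mark Chatfield that extends Phil Clayton's table1. It builds the "Table 1" of baseline characteristics that typically opens the results section of a paper.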

* Install the command
ssc install table1_mc, replace
* Syntax
table1_mc [if] [in] [weight], vars(var_spec) [options]
var_spec = varname vartype [%fmt1 [%fmt2]] [ \ varname vartype [%fmt1 [%fmt2]] \ ...]

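By default, table1_mc reports the baseline characteristics of the variables you list. var_spec defines that list, where:

  • varname: a single variable; separate multiple variables with a backslash \;
  • vartype: the type of the variable. It is required, and omitting it raises an error. There are 7 types:
    • contn: continuous, normally distributed; reports the mean and standard deviation;
    • contln: continuous, log-normally distributed; reports the geometric mean and geometric standard deviation;
    • conts: continuous, neither normal nor log-normal; reports the median and the lower and upper quartiles;
    • cat: categorical; group differences tested with Pearson's chi-squared test;
    • cate: categorical; group differences tested with Fisher's exact test;
    • bin: binary; group differences tested with Pearson's chi-squared test;
    • bine: binary; group differences tested with Fisher's exact test;
  • %fmt1: display format for the first reported statistic (see help format);
  • %fmt2: display format for the second reported statistic (see help format). In the case study below, for example, price contln %5.0f %4.2f formats the geometric mean as 5534 and the geometric SD as 1.50.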
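Choosing between contn, contln, and conts requires a judgment about each variable's distribution. The following is a minimal sketch of one informal way to make that call with built-in Stata commands. It is not part of table1_mc, and the variable name ln_price is introduced here purely for illustration.

```stata
* Sketch: informal distribution checks to help pick a vartype
* (an assumed workflow, not something table1_mc does for you)
sysuse auto, clear

* Shapiro-Wilk test on the raw scale: contn is a candidate if normality is not rejected
swilk weight

* Same test on the log scale: contln is a candidate if ln(price) looks normal
generate double ln_price = ln(price)   // ln_price is a hypothetical helper variable
swilk ln_price

* Skewness, kurtosis, and percentiles: consider conts if neither scale looks normal
summarize mpg, detail
```

Formal tests are sensitive to sample size, so a histogram or quantile plot is a reasonable complement before settling on a type.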

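The main options are:

  • by(varname): the grouping variable. varname must be either a string variable or a numeric variable containing only non-negative integers, with or without value labels;
  • missing: for cat/cate variables, treats missing values as an additional category;
  • test: adds a column naming the significance test used;
  • statistic: adds a column with the value of the test statistic;
  • percent: reports categorical and binary variables as the percentage within each group;
  • percent_n: reports categorical and binary variables as % (n), that is, the within-group percentage followed by the count;
  • slashN: reports categorical and binary variables as n/N rather than n (%);
  • catrowperc: reports row percentages across groups for categorical variables;
  • pdp(#): sets the number of decimal places for p-values;
  • saving(filename [, export_excel_options]): sets the Excel file name and any export options;
  • clear: replaces the dataset in memory with the table produced by table1_mc.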
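To see how the display options combine, here is a short sketch restricted to categorical variables. It assumes table1_mc is already installed from SSC and uses only options from the list above; the value pdp(2) is an arbitrary choice for illustration.

```stata
* Sketch: categorical variables only, using options described above
* (requires: ssc install table1_mc)
sysuse auto, clear
generate byte much_headroom = (headroom >= 3)

* percent_n reports % (n) instead of n (%); pdp(2) sets the p-value decimal places
table1_mc, by(foreign) vars(rep78 cat \ much_headroom bin) percent_n pdp(2)
```

Because clear is not specified, the auto data stay in memory after the table is displayed.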

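2. Case study

To show what each option does, this section works through an example with the auto data (auto.dta). Cars are grouped by origin, domestic versus foreign (the foreign variable), and the table describes weight (treated as normally distributed), price (treated as log-normal), mpg (treated as neither normal nor log-normal), the categorical variable rep78, and the binary variable much_headroom.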

. sysuse auto, clear
. generate much_headroom = (headroom>=3)
. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f /// 
>     \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    48    21     4     1 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
   N_ ... #records used below,   m_ ... #records not used
 
  +--------------------------------------------------------------+
  |                      Domestic        Foreign         p-value |
  |--------------------------------------------------------------|
  |                      N=52            N=22                    |
  |--------------------------------------------------------------|
  | Weight (lbs.)        3317 (695)      2316 (433)      <0.001  |
  |--------------------------------------------------------------|
  | Price                5534 (×/1.50)   5959 (×/1.44)    0.46   |
  |--------------------------------------------------------------|
  | Mileage (mpg)        19 (17-22)      25 (21-28)       0.002  |
  |--------------------------------------------------------------|
  | Repair record 1978                                   <0.001  |
  |    1                 2 (4%)          0 (0%)                  |
  |    2                 8 (17%)         0 (0%)                  |
  |    3                 27 (56%)        3 (14%)                 |
  |    4                 9 (19%)         9 (43%)                 |
  |    5                 2 (4%)          9 (43%)                 |
  |--------------------------------------------------------------|
  | much_headroom        35 (67%)        8 (36%)          0.014  |
  +--------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD) or 
median (IQR) for continuous measures, and n (%) for categorical measures.
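The "×/" notation for price may be unfamiliar. The sketch below reproduces the Domestic entry by hand, assuming the geometric mean is exp(mean of ln price) and the geometric SD is exp(SD of ln price); ln_price is a helper variable introduced here for illustration.

```stata
* Sketch: geometric mean and geometric SD of price for domestic cars
sysuse auto, clear
generate double ln_price = ln(price)   // hypothetical helper variable

* Mean and SD on the log scale, domestic cars only
quietly summarize ln_price if foreign == 0

* Back-transform to the price scale; compare with the 5534 (×/1.50) reported above
display "geometric mean = " %5.0f exp(r(mean))
display "geometric SD   = " %4.2f exp(r(sd))
```

Read the pair as "5534, multiplied or divided by 1.50": roughly two thirds of domestic prices would fall between 5534/1.50 and 5534×1.50 if price were exactly log-normal.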

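With the missing option added, missing values of rep78 are treated as a separate category. As a result, the percentages for rep78 are now computed over all 52 and 22 cars rather than over the non-missing ones.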

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
>     \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace missing

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    52    22     0     0 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
   N_ ... #records used below,   m_ ... #records not used
 
  +--------------------------------------------------------------+
  |                      Domestic        Foreign         p-value |
  |--------------------------------------------------------------|
  |                      N=52            N=22                    |
  |--------------------------------------------------------------|
  | Weight (lbs.)        3317 (695)      2316 (433)      <0.001  |
  |--------------------------------------------------------------|
  | Price                5534 (×/1.50)   5959 (×/1.44)    0.46   |
  |--------------------------------------------------------------|
  | Mileage (mpg)        19 (17-22)      25 (21-28)       0.002  |
  |--------------------------------------------------------------|
  | Repair record 1978                                   <0.001  |
  |    1                 2 (4%)          0 (0%)                  |
  |    2                 8 (15%)         0 (0%)                  |
  |    3                 27 (52%)        3 (14%)                 |
  |    4                 9 (17%)         9 (41%)                 |
  |    5                 2 (4%)          9 (41%)                 |
  |    Missing           4 (8%)          1 (5%)                  |
  |--------------------------------------------------------------|
  | much_headroom        35 (67%)        8 (36%)          0.014  |
  +--------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD) 
or median (IQR) for continuous measures, and n (%) for categorical measures.
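The Missing row can be checked directly with built-in commands. This is a small sketch; it does not depend on table1_mc.

```stata
* Sketch: confirm how many rep78 values are missing in each group
sysuse auto, clear

* Cross-tabulation that keeps missing values as their own row
tabulate rep78 foreign, missing

* Explicit counts by group; compare with the Missing row above
count if missing(rep78) & foreign == 0
count if missing(rep78) & foreign == 1
```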

With the test option added, each row also names the significance test used.

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
>     \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace missing test

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    52    22     0     0 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
   N_ ... #records used below,   m_ ... #records not used
 
  +-----------------------------------------------------------------------------------+
  |                  Domestic        Foreign         Test                     p-value |
  |-----------------------------------------------------------------------------------|
  |                  N=52            N=22                                             |
  |-----------------------------------------------------------------------------------|
  | Weight (lbs.)    3317 (695)      2316 (433)      Ind. t test              <0.001  |
  |-----------------------------------------------------------------------------------|
  | Price            5534 (×/1.50)   5959 (×/1.44)   Ind. t test, logged data  0.46   |
  |-----------------------------------------------------------------------------------|
  | Mileage (mpg)    19 (17-22)      25 (21-28)      Wilcoxon rank-sum         0.002  |
  |-----------------------------------------------------------------------------------|
  | Repair record 1978                               Fisher's exact           <0.001  |
  |    1             2 (4%)          0 (0%)                                           |
  |    2             8 (15%)         0 (0%)                                           |
  |    3             27 (52%)        3 (14%)                                          |
  |    4             9 (17%)         9 (41%)                                          |
  |    5             2 (4%)          9 (41%)                                          |
  |    Missing       4 (8%)          1 (5%)                                           |
  |-----------------------------------------------------------------------------------|
  | much_headroom    35 (67%)        8 (36%)         Chi-square                0.014  |
  +-----------------------------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD) or median (IQR) for 
continuous measures, and n (%) for categorical measures.

With the statistic option added, each row also reports the value of the test statistic.

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
>     \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace missing test statistic

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    52    22     0     0 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
   N_ ... #records used below,   m_ ... #records not used
 
  +-----------------------------------------------------------------------------------------------+
  |                 Domestic       Foreign        Test                     Statistic      p-value |
  |-----------------------------------------------------------------------------------------------|
  |                 N=52           N=22                                                           |
  |-----------------------------------------------------------------------------------------------|
  | Weight (lbs.)   3317 (695)     2316 (433)     Ind. t test              t(72)=  6.25   <0.001  |
  |-----------------------------------------------------------------------------------------------|
  | Price           5534 (×/1.50)  5959 (×/1.44)  Ind. t test, logged data t(72)= -0.74    0.46   |
  |-----------------------------------------------------------------------------------------------|
  | Mileage (mpg)   19 (17-22)     25 (21-28)     Wilcoxon rank-sum        Z= -3.10        0.002  |
  |-----------------------------------------------------------------------------------------------|
  | Repair record 1978                            Fisher's exact           N/A            <0.001  |
  |    1            2 (4%)         0 (0%)                                                         |
  |    2            8 (15%)        0 (0%)                                                         |
  |    3            27 (52%)       3 (14%)                                                        |
  |    4            9 (17%)        9 (41%)                                                        |
  |    5            2 (4%)         9 (41%)                                                        |
  |    Missing      4 (8%)         1 (5%)                                                         |
  |-----------------------------------------------------------------------------------------------|
  | much_headroom   35 (67%)       8 (36%)        Chi-square               Chi2(1)=  6.08  0.014  |
  +-----------------------------------------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD) or median (IQR) for continuous
measures, and n (%) for categorical measures.
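The Test and Statistic columns can be cross-checked against Stata's built-in tests. The sketch below assumes the t tests are the equal-variance version (consistent with the 72 degrees of freedom shown) and that Fisher's exact test counts missing rep78 as a category because the missing option is on; ln_price is a helper variable introduced for illustration.

```stata
* Sketch: reproduce the tests behind the table with built-in commands
sysuse auto, clear
generate byte much_headroom = (headroom >= 3)
generate double ln_price = ln(price)   // hypothetical helper variable

* Two-sample t tests with equal variances, raw and log scale
ttest weight, by(foreign)
ttest ln_price, by(foreign)

* Wilcoxon rank-sum test for mpg
ranksum mpg, by(foreign)

* Fisher's exact test, keeping missing rep78 as its own category
tabulate rep78 foreign, exact missing

* Pearson chi-squared test for the binary variable
tabulate much_headroom foreign, chi2
display "chi2(1) = " %5.2f r(chi2) "   p = " %5.3f r(p)
```

If any of these differ from the table, the most likely cause is a different treatment of missing values or of the variance assumption.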

The saving option writes the table to the specified Excel file, and the clear option replaces the data in memory with the table itself.

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
>     \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace    ///
>     missing test statistic saving("Table 1.xlsx", replace) clear
file Table 1.xlsx saved
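After a run with saving() and clear, the table can be inspected either in memory or by reading the workbook back in. A minimal sketch, assuming table1_mc is installed and the working directory is writable:

```stata
* Sketch: save the table, then inspect it two ways
sysuse auto, clear
generate byte much_headroom = (headroom >= 3)
table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
    \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace    ///
    missing test statistic saving("Table 1.xlsx", replace) clear

* The table is now the dataset in memory
describe
list, noobs clean

* Alternatively, read the saved workbook back in
import excel using "Table 1.xlsx", clear
list, noobs clean
```

Since clear discards the original data, reload them with sysuse or use before running further analysis.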
