Stata连享会 主页 || 视频 || 推文 || 知乎 || Bilibili 站
温馨提示: 定期 清理浏览器缓存,可以获得最佳浏览体验。
New!
lianxh
命令发布了:
随时搜索推文、Stata 资源。安装:
. ssc install lianxh
详情参见帮助文件 (有惊喜):
. help lianxh
连享会新命令:cnssc
,ihelp
,rdbalance
,gitee
,installpkg
⛳ Stata 系列推文:
作者:何庆红 (北京大学)
目录
相关阅读:
在「长宽数据」转换时,我们经常受困于以下问题:
long
+------------+ wide
| i j stub | +----------------+
|------------| | i stub1 stub2 |
| 1 1 4.1 | reshape |----------------|
| 1 2 4.5 | <---------> | 1 4.1 4.5 |
| 2 1 3.3 | | 2 3.3 3.0 |
| 2 2 3.0 | +----------------+
+------------+
为此,本文将通过对 reshape
命令介绍,来一一解答上述问题,以帮助大家更好的进行长宽数据转换。
reshape wide stub, i(i) j(j)
stub
为要转换的变量,可以单个也可以多个;i(i)
为 ID 变量,必须唯一;j(j)
为 stub
子集变量名。reshape long
reshape wide ...
后,实现快速转换。reshape long stub, i(i) j(j)
reshape wide
reshape long ...
后,实现快速转换。reshape error
值得注意的是:
@
的使用,来限定数值在变量名中位置,默认数值处于变量名结尾,例如: long
+------------+ wide
| i j stub | +----------------+
|------------| | i stu1b stu1b |
| 1 1 4.1 | reshape |----------------|
| 1 2 4.5 | <---------> | 1 4.1 4.5 |
| 2 1 3.3 | | 2 3.3 3.0 |
| 2 2 3.0 | +----------------+
+------------+
string
选项,以实现变量值为字符串、或变量名全为字符串的长变宽、或宽变长转换,例如: long
+------------+ wide
| i j stub | +----------------+
|------------| | i stubf stubm |
| 1 f 4.1 | reshape |----------------|
| 1 m 4.5 | <---------> | 1 4.1 4.5 |
| 2 f 3.3 | | 2 3.3 3.0 |
| 2 m 3.0 | +----------------+
+------------+
*-避免stata16不能正常调用数据
globa url "https://www.stata-press.com/data/r16/"
*-stata16以下版本
webuse reshape1
*-案例一:宽变长
use $url/reshape1, clear
list
reshape long inc ue, i(id) j(year)
list, sepby(id)
reshape wide //快速从长数据变宽数据
*reshape wide inc ue, i(id) j(year) //标准的从长数据变宽数据方法
+-------------------------------------------------------+
| id sex inc80 inc81 inc82 ue80 ue81 ue82 |
|-------------------------------------------------------|
1. | 1 0 5000 5500 6000 0 1 0 |
2. | 2 1 2000 2200 3300 1 0 0 |
3. | 3 0 3000 2000 1000 0 0 1 |
+-------------------------------------------------------+
+-----------------------------+
| id year sex inc ue |
|-----------------------------|
1. | 1 80 0 5000 0 |
2. | 1 81 0 5500 1 |
3. | 1 82 0 6000 0 |
|-----------------------------|
4. | 2 80 1 2000 1 |
5. | 2 81 1 2200 0 |
6. | 2 82 1 3300 0 |
|-----------------------------|
7. | 3 80 0 3000 0 |
8. | 3 81 0 2000 0 |
9. | 3 82 0 1000 1 |
+-----------------------------+
*-案例二:报告存在问题观测值
use $url/reshape2, clear
list
reshape long inc, i(id) j(year)
reshape error //列出存在问题的观察值
+----------------------------------+
| id sex inc80 inc81 inc82 |
|----------------------------------|
1. | 1 0 5000 5500 6000 |
2. | 2 1 2000 2200 3300 |
3. | 3 0 3000 2000 1000 |
4. | 2 0 2400 2500 2400 |
+----------------------------------+
variable id does not uniquely identify the observations
Your data are currently wide. You are performing a reshape long. You specified
i(id) and j(year). In the current wide form, variable id should uniquely identify
the observations. Remember this picture:
long wide
+---------------+ +------------------+
| i j a b | | i a1 a2 b1 b2 |
|---------------| <--- reshape ---> |------------------|
| 1 1 1 2 | | 1 1 3 2 4 |
| 1 2 3 4 | | 2 5 7 6 8 |
| 2 1 5 6 | +------------------+
| 2 2 7 8 |
+---------------+
Type reshape error for a list of the problem observations.
i (id) indicates the top-level grouping such as subject id.
The data are currently in the wide form; there should be a single
observation per i.
2 of 4 observations have duplicate i values:
+----+
| id |
|----|
2. | 2 |
3. | 2 |
+----+
(data now sorted by id)
*-案例三:占位符使用
use $url/reshape3, clear
list
*reshape long inc ue, i(id) j(year) //错误的方法
*list, sepby(id)
reshape long inc@r ue, i(id) j(year)
list, sepby(id)
reshape wide
*reshape wide inc@r ue, i(id) j(year)
+----------------------------------------------------------+
| id sex inc80r inc81r inc82r ue80 ue81 ue82 |
|----------------------------------------------------------|
1. | 1 0 5000 5500 6000 0 1 0 |
2. | 2 1 2000 2200 3300 1 0 0 |
3. | 3 0 3000 2000 1000 0 0 1 |
+----------------------------------------------------------+
+-----------------------------+
| id year sex incr ue |
|-----------------------------|
1. | 1 80 0 5000 0 |
2. | 1 81 0 5500 1 |
3. | 1 82 0 6000 0 |
|-----------------------------|
4. | 2 80 1 2000 1 |
5. | 2 81 1 2200 0 |
6. | 2 82 1 3300 0 |
|-----------------------------|
7. | 3 80 0 3000 0 |
8. | 3 81 0 2000 0 |
9. | 3 82 0 1000 1 |
+-----------------------------+
*-案例四:允许使用字符
use $url/reshape4, clear
list
reshape long inc, i(id) j(sex) string
list, sepby(id)
reshape wide
*reshape wide inc, i(id) j(sex) string
+-------------------------+
| id kids incm incf |
|-------------------------|
1. | 1 0 5000 5500 |
2. | 2 1 2000 2200 |
3. | 3 2 3000 2000 |
+-------------------------+
+------------------------+
| id sex kids inc |
|------------------------|
1. | 1 f 0 5500 |
2. | 1 m 0 5000 |
|------------------------|
3. | 2 f 1 2200 |
4. | 2 m 1 2000 |
|------------------------|
5. | 3 f 2 2000 |
6. | 3 m 2 3000 |
+------------------------+
*-案例五:long-long 与 wide-wide 转变 (两个 j 变量)
use $url/reshape5, clear
list
reshape wide @inc, i(hid year) j(sex) string
reshape wide minc finc, i(hid) j(year)
list
*reshape long minc finc, i(hid) j(year)
*reshape long @inc, i(hid year) j(sex) string
*list
+-------------------------+
| hid sex year inc |
|-------------------------|
1. | 1 f 90 3200 |
2. | 1 f 91 4700 |
3. | 1 m 90 4500 |
4. | 1 m 91 4600 |
+-------------------------+
+-----------------------------------------+
| hid finc90 minc90 finc91 minc91 |
|-----------------------------------------|
1. | 1 3200 4500 4700 4600 |
+-----------------------------------------+
*-gather 和 spread 的使用
sysuse educ99gdp.dta, clear
list
gather public private //快速宽变长
list
spread variable value //快速长变宽
+----------------------------------+
| country public private |
|----------------------------------|
1. | Australia .7 .7 |
2. | Britain .7 .4 |
3. | Canada 1.5 .9 |
4. | Denmark 1.5 .1 |
5. | France .9 .4 |
|----------------------------------|
6. | Germany .9 .2 |
7. | Ireland 1.1 .3 |
8. | Netherlands 1 .4 |
9. | Sweden 1.5 .2 |
10. | United States 1.1 1.2 |
+----------------------------------+
+----------------------------------+
| country variable value |
|----------------------------------|
1. | Australia public .7 |
2. | Australia private .7 |
3. | Britain public .7 |
4. | Britain private .4 |
5. | Canada public 1.5 |
|----------------------------------|
6. | Canada private .9 |
7. | Denmark public 1.5 |
8. | Denmark private .1 |
9. | France public .9 |
10. | France private .4 |
|----------------------------------|
11. | Germany public .9 |
12. | Germany private .2 |
13. | Ireland public 1.1 |
14. | Ireland private .3 |
15. | Netherlands public 1 |
|----------------------------------|
16. | Netherlands private .4 |
17. | Sweden public 1.5 |
18. | Sweden private .2 |
19. | United States public 1.1 |
20. | United States private 1.2 |
+----------------------------------+
Note:产生如下推文列表的 Stata 命令为:
lianxh
安装最新版lianxh
命令:
ssc install lianxh, replace
免费公开课
最新课程-直播课
专题 | 嘉宾 | 直播/回看视频 |
---|---|---|
⭐ 最新专题 | 文本分析、机器学习、效率专题、生存分析等 | |
研究设计 | 连玉君 | 我的特斯拉-实证研究设计,-幻灯片- |
面板模型 | 连玉君 | 动态面板模型,-幻灯片- |
面板模型 | 连玉君 | 直击面板数据模型 [免费公开课,2小时] |
⛳ 课程主页
⛳ 课程主页
关于我们
课程, 直播, 视频, 客服, 模型设定, 研究设计, stata, plus, 绘图, 编程, 面板, 论文重现, 可视化, RDD, DID, PSM, 合成控制法
等
连享会小程序:扫一扫,看推文,看视频……
扫码加入连享会微信群,提问交流更方便
✏ 连享会-常见问题解答:
✨ https://gitee.com/lianxh/Course/wikis
New!
lianxh
命令发布了:
随时搜索连享会推文、Stata 资源,安装命令如下:
. ssc install lianxh
使用详情参见帮助文件 (有惊喜):
. help lianxh