Concatenate and sort at the same time

Question

I have a rather simple task which I am struggling with in Stata.

I have three variables, SicTwo1, SicTwo2, and SicTwo3, which are numeric (e.g. 12, 25, and 16).

I now want to concatenate them into a new variable SicAndSicAndSic, but the values should be ordered from lowest to highest (e.g. "121625"), ideally with a separator (e.g. "12&16&25").

I tried this code:

gen NumSics = 0
replace NumSics = NumSics + 1 if !missing(SicTwo1)
replace NumSics = NumSics + 1 if !missing(SicTwo2) & SicTwo1 != SicTwo2
replace NumSics = NumSics + 1 if !missing(SicTwo3) & SicTwo1 != SicTwo2 & SicTwo1 != SicTwo3 & SicTwo2 != SicTwo3

sort SicTwo1 SicTwo2 SicTwo3

gen SicAndSic1 = string(SicTwo1)
gen SicAndSic2 = string(SicTwo1) + "&" + string(SicTwo2)
gen SicAndSic3 = string(SicTwo1) + "&" + string(SicTwo2) + "&" + string(SicTwo3)

gen SicAndSic = ""
replace SicAndSic = SicAndSic1 if NumSics == 1
replace SicAndSic = SicAndSic2 if NumSics == 2
replace SicAndSic = SicAndSic3 if NumSics == 3

But it does not sort the variables, and just puts them next to each other.

Answer 1

Score: 0

See https://www.stata-journal.com/article.html?article=pr0046 for one way to sort variables within observations (rows).

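* Example data
clear
input SicTwo1 SicTwo2 SicTwo3
12 16 25
23 12 11
99 88 11
end

* rowsort is community-contributed (from the article linked above) and must be installed first.
* It sorts values within each observation and stores them in new variables.
rowsort SicTwo?, gen(S1 S2 S3)

* Join the sorted values, separated by &
egen wanted = concat(S?), punct(&)

list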

     +-------------------------------------------------------+
     | SicTwo1   SicTwo2   SicTwo3   S1   S2   S3     wanted |
     |-------------------------------------------------------|
  1. |      12        16        25   12   16   25   12&16&25 |
  2. |      23        12        11   11   12   23   11&12&23 |
  3. |      99        88        11   11   88   99   11&88&99 |
     +-------------------------------------------------------+

Your code shows a misunderstanding of sort, which sorts observations by the values of variables but emphatically does not sort within observations. That is precisely what rowsort does, with the proviso that the original variables are unchanged and the results go in new variables.
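If installing rowsort is not an option, the same within-row ordering can be sketched with built-in functions for the special case of exactly three variables. This is only an illustration, not the method from the linked article: the names Smin, Smid, Smax, and wanted_builtin are made up here, and the sketch assumes all three values are nonmissing (the subtraction trick returns missing otherwise).

```stata
* Sketch: sort three numeric values within each row using built-ins only.
* Assumes SicTwo1-SicTwo3 are all nonmissing; the new variable names are hypothetical.
clear
input SicTwo1 SicTwo2 SicTwo3
12 16 25
23 12 11
99 88 11
end

* Smallest and largest value in each row
gen Smin = min(SicTwo1, SicTwo2, SicTwo3)
gen Smax = max(SicTwo1, SicTwo2, SicTwo3)

* The middle value is whatever remains after removing the smallest and largest
gen Smid = SicTwo1 + SicTwo2 + SicTwo3 - Smin - Smax

* Concatenate in ascending order with & as separator
gen wanted_builtin = string(Smin) + "&" + string(Smid) + "&" + string(Smax)

list
```

For the example data this gives the same strings as the rowsort approach. With more than three variables the trick no longer works, and a general row-sorting tool such as rowsort is the simpler route.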

Your variables are stated to be numeric, so any missing values will by default be sorted high, that is, last. If you want something else, you need to spell out what that is.
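As one possible answer to the missing-value question, here is a sketch that leaves missing values out of the concatenated string. It assumes the values are already sorted within rows into S1-S3 with missings last, as described above; the example data and the name wanted_nomiss are made up for illustration.

```stata
* Sketch: build the joined string while skipping missing values.
* Assumes S1-S3 are already row-sorted with missing values last (hypothetical data).
clear
input S1 S2 S3
12 16 25
11 12 .
11 . .
end

gen wanted_nomiss = ""
forvalues j = 1/3 {
    * Add the separator only if something has already been appended
    replace wanted_nomiss = wanted_nomiss + cond(wanted_nomiss == "", "", "&") + string(S`j') if !missing(S`j')
}

list
```

Rows with fewer than three nonmissing values then get shorter strings, such as "11&12" or "11", with no trailing separator.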

Posted by huangapple on July 10, 2023, 23:27:12
Source: https://go.coder-hub.com/76655216.html