英文:
Self join - update a variable - proc sql
问题
proc sql;
create table MyTable2 as
select t1.*, t2.age as age_partner
from MyTable as t1
left join MyTable as t2
on t1.household = t2.household
and t1.flag = t2.flag
and t1.gender ne t2.gender;
quit;
input Household age gender $ flag $ age_partner;
datalines;
1 45 male married
1 35 female married 45
2 50 female married 52
2 52 male married 50
3 32 male child
3 50 male married 60
3 60 female married 50
;
You can achieve the desired output by using a single LEFT JOIN
within the PROC SQL
query. This will match each person with their spouse based on the same household and flag (married), and ensure that the age is associated with the correct gender.
英文:
I have this table:
data MyTable;
input Household age gender $ flag;
datalines;
1 45 male married
1 35 female married
2 50 female married
2 52 male married
3 32 male child
3 50 male married
3 60 female married
;
I have married couples and other persons in households; no intergenerational or same sex couples. I need to join a husband value to the wife and a wife value to the husband to get this output:
data MyTable2;
input Household age gender $ flag $ age_partner;
datalines;
1 45 male married 35
1 35 female married 45
2 50 female married 52
2 52 male married 50
3 32 male child
3 50 male married 60
3 60 female married 50
;
That does not work; it executes only the t2 part.
proc sql;
create table MyTable2 as
select t1.*, t2.age as age_partner, t3.age as age_partner
from MyTable as t1
left join (select household, age from MyTable where gender = 'male' & flag = 'married') as t2
on t1.household=t2.household & t1.gender = 'female' & t1.flag = 'married'
left join (select household, age from MyTable where gender = 'female' & flag = 'married') as t3
on t1.household=t3.household & t1.gender = 'male' & t1.flag = 'married';
quit;
input Household age gender $ flag $ age_partner;
datalines;
1 45 male married
1 35 female married 45
2 50 female married 52
2 52 male married
3 32 male child
3 50 male married
3 60 female married 50
;
This is the output. It joins the values only to the females.
If I separate these two joins, the last self join rewrites the previous join and in the output only the husbands will have the wife ages and the wives won't have the husband ages.
I can do it in separate columns and after that in the third column I make the needed output.
How do I do it in one proc?
答案1
得分: 2
假设你的逻辑是成立的(我怀疑这在各种数据质量原因下可能不成立)。
proc sql;
create table want as
select t1.*, case when t1.flag='married' then t2.age else . end as age_partner
from mytable as t1
left join (select * from mytable where flag='married') as t2
on t1.household=t2.household and t1.gender ne t2.gender;
quit;
英文:
Assuming your logic will hold (I suspect this will not hold for various data quality reasons).
proc sql;
create table want as
select t1.*, case when t1.flag='married' then t2.age else . end as age_partner
from mytable as t1
left join (select * from mytable where flag='married') as t2
on t1.household=t2.household and t1.gender ne t2.gender;
quit;
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论