如何在SAS Enterprise Guide的一个PROC SQL查询中按其他列中的值计算均值值?

huangapple go评论59阅读模式
英文:

How to calculate mean values per values in other column in one PROC SQL query in SAS Enterprise Guide?

问题

我需要修改我的PROC SQL查询,以便在结果表中添加一个名为"MEAN_1"的列,该列将根据"VALUE"列对每个国家("COUNTRY"列)的值求平均。

以下是修改后的查询:

proc sql;
create table work.my_table as 
select 
    a.ID,
    a.COUNTRY,
    a.VALUE,
    b.MEAN_1
from library1.table1 a
left join (
    select COUNTRY, mean(VALUE) as MEAN_1
    from library1.table1
    group by COUNTRY
) b
on a.COUNTRY = b.COUNTRY;
quit;

这个查询首先创建一个子查询,该子查询计算每个国家的平均值,并将结果存储在"MEAN_1"列中。然后,原始表格与这个子查询进行左连接,以将平均值添加到每行记录中,从而得到所需的输出。

英文:

I have query in PROC SQL with result as table in SAS Enterprise Guide like below:

My query:

proc sql;
create table work.my_table as 
select 
ID
, COUNTRY
, VALUE
from library1.table1
;quit;

Result of query in PROC SQL:

ID  | COUNTRY   | VALUE
----|-----------|---------
111 | FRANCE    | 1
222 | GERMANY   | 3
333 | FRAMNCE   | 2
444 | FRAMCE    | 3
555 | GERMANY   | 5

My task:

I need to modify my query, so as to have one more column: "MEAN_1" where will be mean of value (columns: "VALUE") per country (column: "COUNTRY")

Desire output:

ID  | COUNTRY   | VALUE   | MEAN_1
----|-----------|---------|-------
111 | FRANCE    | 1       | 2
222 | GERMANY   | 4       | 4.5
333 | FRAMNCE   | 2       | 2
444 | FRAMCE    | 3       | 2
555 | GERMANY   | 5       | 4.5

Because:

  • values for FRANCE are: 1, 2, 3 --> so mean will be (1+2+3) / 3 = 2
  • values for GERMANY are: 4, 5 --> so mean will be (4+5) / 2 = 4.5

How can I modify my query in PROC SQL in SAS Enterprise Guide from the begining of my question, so as to have in result table like above in "Desire output" ?

答案1

得分: 4

你查询的结果(输入)与期望的输出不匹配。

  • id=222 的值不匹配。
  • "France" 拼写错误两次。

在其他 SQL 变种中,你可以使用子查询和连接来实现这一点。然而,在使用 SAS SQL 时,你不需要这样做,因为 SAS 允许你包含其他变量并自动重新合并数据。可以在这个答案中找到更详细的解释。

proc sql;
   create table want as
   select *,
          mean(value) as mean_value
   from have 
   group by country
   ;
quit;

使用双 DoW 循环

proc sort data=have; by country; run;

data want;
	do _n_ = 1 by 1 until (last.country);
		set have;
		by country;
		sum=sum(sum, value);
	end;

	mean_value=divide(sum, _n_);

	do until (last.country);
		set have;
		by country;
		output;
	end;

	drop sum;
run;

两种方法都产生相同的结果

id  country value mean_value
111 FRANCE    1       2
222 GERMANY   4      4.5
333 FRANCE    2       2
444 FRANCE    3       2
555 GERMANY   5      4.5
英文:

The result of your query (input) does not match the expected output.

  • Value for id=222 does not match
  • France is misspelled twice

In other flavors of SQL, you would achieve this with a subquery and a join.
However using SAS SQL you don't have to as SAS it allows you to include other variables and remerges the data together automatically. More in-depth explanation can be found in this answer.

proc sql;
   create table want as
   select *,
          mean(value) as mean_value
   from have 
   group by country
   ;
quit;

Using a double DoW Loop

proc sort data=have; by country; run;

data want;
	do _n_ = 1 by 1 until (last.country);
		set have;
		by country;
		sum=sum(sum, value);
	end;

	mean_value=divide(sum, _n_);

	do until (last.country);
		set have;
		by country;
		output;
	end;

	drop sum;
run;

Both hold the same result

id  country value mean_value
111 FRANCE    1       2
222 GERMANY   4      4.5
333 FRANCE    2       2
444 FRANCE    3       2
555 GERMANY   5      4.5

huangapple
  • 本文由 发表于 2023年2月24日 16:55:28
  • 转载请务必保留本文链接:https://go.coder-hub.com/75554454.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定