SAS: Generate separate plots for a list of variables in a single SGPLOT statement (such as can be done with PROC UNIVARIATE or PROC CHART)


Question


A dataset has 10 variables ('a' through 'j'), half quantitative and half categorical, and you want to generate a separate plot for each variable (histograms for the quantitative ones and bar charts for the categorical ones) with as few statements as possible.

PROC UNIVARIATE can be used with quantitative variables and accepts multiple variables in the HISTOGRAM statement:

WORKING EXAMPLE: PROC UNIVARIATE takes multiple variables in the HISTOGRAM statement

proc univariate data=DATA noprint;
histogram a b c d e;
run;

which outputs five separate histograms, one for each variable. But PROC UNIVARIATE cannot output bar charts for the categorical variables, and PROC FREQ only has statements which output tables.

PROC SGPLOT has both the HISTOGRAM and HBAR/VBAR statements, but does not accept multiple variable arguments in the manner that PROC UNIVARIATE does.

INVALID EXAMPLE: SGPLOT does not accept multiple variables in the HISTOGRAM or HBAR/VBAR statements.

proc sgplot data=DATA;
vbar f g h i j;
run;

throws the following errors at the character between 'f' and 'g':
ERROR 22-322: Syntax error, expecting one of the following: ;, /.
ERROR 202-322: The option or parameter is not recognized and will be ignored.

Is the only solution to write a separate SGPLOT statement for each categorical variable as below?

WORKING EXAMPLES: Single variable per statement

proc sgplot data=DATA;
vbar f;
run;

successfully generates a VBAR for 'f', and

proc sgplot data=DATA;
vbar g;
run;

successfully generates a VBAR for 'g'.

For a dataset with few variables this may not be difficult, but what about for large datasets?

Answer 1

Score: 2


One option is to use the macro language to generate a bunch of SGPLOT steps. So you write a macro that you call like

%barchart(data=mydata,var=a b c)
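
The answer does not show the macro itself, so the following is only a minimal sketch of what such a macro might look like. The name `barchart` and the `data=`/`var=` parameters come from the call above; the body is an assumption. It loops over a space-separated list of variable names and generates one PROC SGPLOT step per name.

```sas
/* Hypothetical sketch of %barchart: one PROC SGPLOT step per variable. */
/* Assumes &var is a plain space-separated list of variable names;      */
/* list shortcuts such as f-j or _character_ are not expanded here.     */
%macro barchart(data=, var=);
  %local i v;
  %do i = 1 %to %sysfunc(countw(&var, %str( )));
    %let v = %scan(&var, &i, %str( ));
    title "Bar chart of &v";
    proc sgplot data=&data;
      vbar &v;
    run;
  %end;
  title;  /* clear the title so it does not leak into later steps */
%mend barchart;

/* Example call on a data set shipped with SAS */
%barchart(data=sashelp.class, var=sex age)
```

Swapping `vbar` for `histogram` inside the loop would give the same one-plot-per-variable behavior for quantitative variables.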

Another solution is to transpose your data into a vertical format. So instead of data like:

ID  A  B  C
1   10 11 12
2   20 21 22

transpose it to:

ID Var Value
1  A   10
1  B   11
1  C   12
2  A   20
2  B   21
2  C   22

This would allow you to use a BY statement in SGPLOT to make multiple plots, e.g.

proc sort data=have;
  by var;
run;

proc sgplot data=have;
  vbar value;
  by var;
run; 
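
The vertical layout that the sort and SGPLOT steps expect could be built with PROC TRANSPOSE. This is a sketch under stated assumptions: `have_wide` is a hypothetical name for the original wide data set, it has exactly one row per `ID`, and the variables to plot are `A`, `B` and `C`. It would be run before the steps above, since its output data set is named `have` to match them.

```sas
/* Sketch: reshape wide data into one row per (ID, variable) pair.     */
/* HAVE_WIDE is a hypothetical name; it is assumed to hold one row per */
/* ID, so PROC TRANSPOSE creates a single value column named COL1.     */
proc sort data=have_wide;
  by id;
run;

proc transpose data=have_wide
               out=have(rename=(_name_=var col1=value));
  by id;
  var a b c;   /* the variables to stack */
run;
```

If the VAR list mixes numeric and character variables, PROC TRANSPOSE stores all transposed values as character, which is usually acceptable for bar charts but not for histograms.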

Answer 2

Score: 0


A simple macro that produces just the graphs. If other output is desired, the ODS SELECT statements can be modified.


%macro summary_data(dsn=,
                    cat= /*categorical variables separated by spaces, variable shortcuts are fine*/,
                    cont= /*continuous variables separated by spaces, variable shortcuts are fine*/);

/* Bar charts: keep only the frequency plots from PROC FREQ */
ods select freqplot;
proc freq data=&dsn.;
tables &cat. / plots=freqplot;
run;

/* Histograms: keep only the histograms from PROC UNIVARIATE */
ods select histogram;
proc univariate data=&dsn.;
var &cont.;
histogram &cont.;
run;

%mend summary_data;

Then call the macro, for example:

%summary_data(dsn=sashelp.class, cat = age sex, cont=age weight height);

huangapple
  • Posted on May 6, 2023, 18:12:11
  • Please keep this link when reposting: https://go.coder-hub.com/76188327.html