英文:
Choosing the row with the maximum character length in sas
问题
我有以下数据集:
dataseta:
No. Name1 Name2 Sales Inv Comp
1 TC Tribal Council Inc 100 100 0
2. TC Tribal Council Limited INC 20 25 65
期望的输出是:
datasetb:
No. Name1 Name2 Sales Inv Comp
1 TC Tribal Council Limited Inc 120 125 0
基本上,我需要选择具有列"Name2"中字符长度最长的行。我尝试了以下代码,但它没有起作用:
proc sql;
create table datasetb as select no,name1,name2,sum(sales),sum(inv),min(comp) from dataseta group by 1,2,3 having length(name2)=max(length(name2));quit;
如果我使用以下代码,它只部分解决了问题,并且我会得到重复的行:
proc sql;
create table datasetb as select no,name1,max(length(name2)),sum(sales),sum(inv),min(comp) from dataseta group by 1,2 having length(name2)=max(length(name2));quit;
英文:
I have the following dataset:
dataseta:
No. Name1 Name2 Sales Inv Comp
1 TC Tribal Council Inc 100 100 0
2. TC Tribal Council Limited INC 20 25 65
desired output:
datasetb:
No. Name1 Name2 Sales Inv Comp
1 TC Tribal Council Limited Inc 120 125 0
Basically, I need to choose the row with the maximum length of characters for the column name2.
I tried the following, but it didn't work
proc sql;
create table datasetb as select no,name1,name2,sum(sales),sum(inv),min(comp) from dataseta group by 1,2,3 having length(name2)=max(length(name2));quit;
If I do the following code, it only partially resolves it, and I get duplicate rows
proc sql;
create table datasetb as select no,name1,max(length(name2)),sum(sales),sum(inv),min(comp) from dataseta group by 1,2 having length(name2)=max(length(name2));quit;
答案1
得分: 0
你似乎正在将两个独立的聚合计算结果合并。
假设:
no
是唯一的,以允许决定胜负的标准,而第一个(按no
排序)最长的name2
将与name1
上的cost
、inv
、comp
总额合并。
这个查询将涉及很多操作...
-
在
name1
内第一个最长的name2
,需要嵌套子查询来实现:- 确定最长的
name2
,然后 - 根据
no
选择第一个,如果有多个。
- 确定最长的
-
在
name1
上的总计- 这些总计将是一个子查询,用于生成所需的结果集。
示例(SQL)
data have;
length no 8 name1 $6 name2 $35 sales inv comp 8;
input
no name1& name2& sales inv comp; datalines;
1 TC Tribal Council Inc 100 100 0 * name1=TC group
2 TC Tribal Council Limited INC 20 25 65
3 TC Tribal council co 0 0 0
4 TC The Tribal council Assoctn 10 10 10
7 LS Longshore association 10 10 0 * name=LS group
8 LS The Longshore Group, LLC 2 4 8
9 LS The Longshore Group, llc 15 15 6
run;
proc sql;
create table want as
select
first_longest_name2.no,
first_longest_name2.name1,
first_longest_name2.name2,
name1_totals.sales,
name1_totals.inv,
name1_totals.comp
FROM
(
select
no, name1, name2
from
( select
no, name1, name2
from have
group by name1
having length(name2) = max(length(name2))
) longest_name2s
group by name1
having no = min(no)
) as
first_longest_name2
LEFT JOIN
(
select
name1,
sum(sales) as sales,
sum(inv) as inv,
sum(comp) as comp
from
have
group by name1
) as
name1_totals
ON
first_longest_name2.name1 = name1_totals.name1
;
quit;
示例(DATA Step)
按连续的 name1
分组行的方式处理数据,可以使用 DOW 循环技术来实现 - 即在其中使用 SET
语句的循环。
data want2;
do until (last.name1);
set have;
by name1 notsorted;
if length(name2) > longest then do;
longest = length(name2);
no_at_longest = no;
name2_at_longest = name2;
end;
sales_sum = sum(sales_sum,sales);
inv_sum = sum(inv_sum,inv);
comp_sum = sum(comp_sum,comp);
end;
drop name2 no sales inv comp longest;
rename
no_at_longest = no
name2_at_longest = name2
sales_sum = sales
inv_sum = inv
comp_sum = comp
;
run;
英文:
You appear to be joining the results of two separate aggregate computations.
Presuming:
no
is unique so as to allow a tie breaker criteria and the first (per no
) longest name2
is to be joined with the cost
, inv
, comp
totals over name1
.
The query will have lots going on...
-
1st longest
name2
withinname1
, nested subqueries are needed to:- Determine the longest
name2
, then - Select first one, according to
no
, if more than one.
- Determine the longest
-
totals over
name1
- The totals will be a sub-query that is joined to, for delivering the desired result set.
Example (SQL)
data have;
length no 8 name1 $6 name2 $35 sales inv comp 8;
input
no name1& name2& sales inv comp; datalines;
1 TC Tribal Council Inc 100 100 0 * name1=TC group
2 TC Tribal Council Limited INC 20 25 65
3 TC Tribal council co 0 0 0
4 TC The Tribal council Assoctn 10 10 10
7 LS Longshore association 10 10 0 * name=LS group
8 LS The Longshore Group, LLC 2 4 8
9 LS The Longshore Group, llc 15 15 6
run;
proc sql;
create table want as
select
first_longest_name2.no,
first_longest_name2.name1,
first_longest_name2.name2,
name1_totals.sales,
name1_totals.inv,
name1_totals.comp
FROM
(
select
no, name1, name2
from
( select
no, name1, name2
from have
group by name1
having length(name2) = max(length(name2))
) longest_name2s
group by name1
having no = min(no)
) as
first_longest_name2
LEFT JOIN
(
select
name1,
sum(sales) as sales,
sum(inv) as inv,
sum(comp) as comp
from
have
group by name1
) as
name1_totals
ON
first_longest_name2.name1 = name1_totals.name1
;
quit;
Example (DATA Step)
Processing the data in a serial manner, when name1
groups are contiguous rows, can be accomplished using a DOW loop technique -- that is a loop with a SET
statement within it.
data want2;
do until (last.name1);
set have;
by name1 notsorted;
if length(name2) > longest then do;
longest = length(name2);
no_at_longest = no;
name2_at_longest = name2;
end;
sales_sum = sum(sales_sum,sales);
inv_sum = sum(inv_sum,inv);
comp_sum = sum(comp_sum,comp);
end;
drop name2 no sales inv comp longest;
rename
no_at_longest = no
name2_at_longest = name2
sales_sum = sales
inv_sum = inv
comp_sum = comp
;
run;
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论