英文:
Snowflake Snowpark Python: Group By and Concat
问题
我在Snowflake中有这个:
id_request | alert_code |
---|---|
100 | R70 |
100 | R69 |
100 | R54 |
101 | R24 |
101 | R93 |
我想把它变成这样:
id_request | alert_all |
---|---|
100 | R70, R69, R54 |
101 | R24, R93 |
我尝试写了下面的代码,但似乎有问题:
df_alerts_3 = df_alerts_2.groupBy('id_request')\
.agg(concat_ws(',', collect_list('alert_code')).alias('alert_all'))
非常感谢您的任何协助。
英文:
I have this in Snowflake
id_request | alert_code |
---|---|
100 | R70 |
100 | R69 |
100 | R54 |
101 | R24 |
101 | R93 |
I want to turn it into this
id_request | alert_all |
---|---|
100 | R70,R69,R54 |
101 | R24,R93 |
I tried writing this but it seems to be wrong
df_alerts_3 = df_alerts_2.groupBy('id_request')\
.agg(concat_ws(lit(','), array_agg('alert_code')).alias('alert_all'))
Thank you very much for any assistance
答案1
得分: 2
CONCAT_WS 在单个行的上下文中连接字符串。要在多个行之间连接字符串,您需要使用聚合函数 LISTAGG。
Snowpark 的等效函数是 snowflake.snowpark.functions.listagg:
返回用分隔符字符串分隔的连接输入值
df.group_by(df.col1).agg(listagg(df.col2, ",").within_group(df.col2.asc())) df.select(listagg(df["col2"], ",", False)
英文:
CONCAT_WS concatenates strings in context of a single row. To concatenate strings across multiple rows you need to use aggregate function LISTAGG.
Snowpark equivalent is snowflake.snowpark.functions.listagg:
> Returns the concatenated input values, separated by delimiter string
>
> df.group_by(df.col1).agg(listagg(df.col2. ",")).within_group(df.col2.asc())
> df.select(listagg(df["col2"], ",", False)
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论