2023年6月8日 17:59:37go评论87阅读模式

英文:

Make a new column with list of unique values grouped by, or over, another column in Polars

问题

在 Polars 0.18.0 之前，我能够通过类型创建一个包含所有独特宝可梦的列。我看到 Expr.list() 已经重构为 implode()，但是我在使用新语法时遇到了困难，无法复制以下操作：

df.with_columns(lst_of_pokemon = pl.col('name').unique().list().over('Type 1'))

英文:

Before Polars 0.18.0, I was able to create a column with a list of all unique pokemon by type. I see Expr.list() has been refactored to implode(), but I'm having trouble replicating the following using the new syntax:

df.with_columns(lst_of_pokemon = pl.col('name').unique().list().over('Type 1'))

答案1

得分: 2

mapping_strategy= 参数已添加到 .over 中。

df = pl.from_repr(&quot;&quot;&quot;
┌──────┬──────┐
│ name ┆ type │
│ ---  ┆ ---  │
│ str  ┆ i64  │
╞══════╪══════╡
│ a    ┆ 1    │
│ a    ┆ 1    │
│ a    ┆ 2    │
│ b    ┆ 2    │
│ c    ┆ 3    │
│ c    ┆ 3    │
└──────┴──────┘
&quot;&quot;&quot;)
df.with_columns(lst_of_pokemon = 
   pl.col(&#39;name&#39;).unique().over(&#39;type&#39;, mapping_strategy=&#39;join&#39;)
)

shape: (6, 3)
┌──────┬──────┬────────────────┐
│ name ┆ type ┆ lst_of_pokemon │
│ ---  ┆ ---  ┆ ---            │
│ str  ┆ i64  ┆ list[str]      │
╞══════╪══════╪════════════════╡
│ a    ┆ 1    ┆ [&quot;a&quot;]          │
│ a    ┆ 1    ┆ [&quot;a&quot;]          │
│ a    ┆ 2    ┆ [&quot;a&quot;, &quot;b&quot;]     │
│ b    ┆ 2    ┆ [&quot;a&quot;, &quot;b&quot;]     │
│ c    ┆ 3    ┆ [&quot;c&quot;]          │
│ c    ┆ 3    ┆ [&quot;c&quot;]          │
└──────┴──────┴────────────────┘

英文:

The mapping_strategy= argument for .over was added.

df = pl.from_repr(&quot;&quot;&quot;
┌──────┬──────┐
│ name ┆ type │
│ ---  ┆ ---  │
│ str  ┆ i64  │
╞══════╪══════╡
│ a    ┆ 1    │
│ a    ┆ 1    │
│ a    ┆ 2    │
│ b    ┆ 2    │
│ c    ┆ 3    │
│ c    ┆ 3    │
└──────┴──────┘
&quot;&quot;&quot;)
df.with_columns(lst_of_pokemon = 
   pl.col(&#39;name&#39;).unique().over(&#39;type&#39;, mapping_strategy=&#39;join&#39;)
)

shape: (6, 3)
┌──────┬──────┬────────────────┐
│ name ┆ type ┆ lst_of_pokemon │
│ ---  ┆ ---  ┆ ---            │
│ str  ┆ i64  ┆ list[str]      │
╞══════╪══════╪════════════════╡
│ a    ┆ 1    ┆ [&quot;a&quot;]          │
│ a    ┆ 1    ┆ [&quot;a&quot;]          │
│ a    ┆ 2    ┆ [&quot;a&quot;, &quot;b&quot;]     │
│ b    ┆ 2    ┆ [&quot;a&quot;, &quot;b&quot;]     │
│ c    ┆ 3    ┆ [&quot;c&quot;]          │
│ c    ┆ 3    ┆ [&quot;c&quot;]          │
└──────┴──────┴────────────────┘

答案2

得分: 1

编辑：jqurios发布的答案更加优雅

我认为以下代码可以生成所需的结果：

df.join(
    df.groupby("Type 1", maintain_order=True).agg(pl.col("Name").alias("combined_names")),
    on="Type 1"
)

使用groupby和agg，您可以创建按'Type 1'分组的所有宝可梦名称的列表，然后将其与原始数据框连接起来。

英文:

Edit: the answer posted by jqurios is way more elegant

In think the following produces the required result:

df.join(
    df.groupby(&quot;Type 1&quot;, maintain_order=True).agg(pl.col(&quot;Name&quot;).alias(&quot;combined_names&quot;)),
    on=&quot;Type 1&quot;
)

With the groupby and agg you can create the lists with all the pokemon names per 'Type 1', which is joined to the original dataframe.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

创建一个新列，其中包含按另一列分组或汇总的唯一值列表。

问题

答案1

答案2

Python-polars：从系数、数值和（嵌套的）列表转换为加权数值

使用 SQL LAG() 函数

如何在 polars 表达式中访问列名？

如何在时间序列分析中计算每个时间点的唯一基因数？

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。