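python polars: partition df using pivot and concat

Question

My goal is to group/partition by one column (a below), build a string identifier from two others (the b and c columns), and then use this b_c identifier as a column name in a pivoted data frame.
The code below works as far as I can tell, but the path to the result is a bit convoluted. So my question is: can this be done in a simpler way?
BTW, at this tiny scale (at most 1k rows so far) I am not concerned with making it faster.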

import polars as pl

data = {
    "a": [1, 1, 1, 2, 2, 3],
    "b": [11, 12, 13, 11, 12, 11],
    "c": ["x1", "x2", "x3", "x1", "x2", "x1"],
    "val": [101, 102, 102, 201, 202, 301],
}
df = pl.DataFrame(data)

print(df)

counter = 0
for tmp_df in df.partition_by("a"):
    grp_df = (
        tmp_df.with_columns((pl.col("b") + "_" + pl.col("c")).alias("col_id"))
        .drop(["b", "c"])
        .pivot(values="val", index="a", columns="col_id")
    )

    if counter == 0:
        result_df = grp_df.select(pl.all())
    else:
        result_df = pl.concat([result_df, grp_df], how="diagonal")
    counter += 1

print(result_df)


Answer 1

Score: 2

You can do this in two steps: first a select step that creates the new id column, then the pivot.

Example 1:

(
    df.select(
        'a', 'val',
        id=pl.col('b').cast(pl.Utf8) + '_' + pl.col('c'))
    .pivot(values='val', index='a', columns='id')
)

Result:
shape: (3, 4)
┌─────┬───────┬───────┬───────┐
│ a   ┆ 11_x1 ┆ 12_x2 ┆ 13_x3 │
│ --- ┆ ---   ┆ ---   ┆ ---   │
│ i64 ┆ i64   ┆ i64   ┆ i64   │
╞═════╪═══════╪═══════╪═══════╡
│ 1   ┆ 101   ┆ 102   ┆ 102   │
│ 2   ┆ 201   ┆ 202   ┆ null  │
│ 3   ┆ 301   ┆ null  ┆ null  │
└─────┴───────┴───────┴───────┘

Example 2:
(suggested by @jqurious), using pl.format

(
    df.select(
        'a', 'val',
        id=pl.format("{}_{}", "b", "c"))
    .pivot(values='val', index='a', columns='id')
)
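
For readers who want to see what the select-then-pivot reshaping computes, here is a minimal plain-Python sketch of the same logic. It does not use polars at all; the helper name pivot_by_id is hypothetical and exists only for this illustration. It also assumes each (a, id) pair occurs at most once, as in the sample data, so it simply overwrites on a repeat rather than aggregating.

```python
# Plain-Python illustration (no polars) of the two steps:
# 1) build the "b_c" identifier, 2) spread val into one column per identifier.
data = {
    "a": [1, 1, 1, 2, 2, 3],
    "b": [11, 12, 13, 11, 12, 11],
    "c": ["x1", "x2", "x3", "x1", "x2", "x1"],
    "val": [101, 102, 102, 201, 202, 301],
}


def pivot_by_id(data):
    """Hypothetical helper: return (col_ids, table) with one row per distinct a."""
    col_ids = []  # distinct identifiers, in first-seen order
    rows = {}     # maps a -> {identifier: val}
    for a, b, c, val in zip(data["a"], data["b"], data["c"], data["val"]):
        col_id = f"{b}_{c}"  # step 1: the string identifier
        if col_id not in col_ids:
            col_ids.append(col_id)
        rows.setdefault(a, {})[col_id] = val  # step 2: place val under its id
    # Missing (a, identifier) combinations become None, like null in the table.
    table = [[a] + [cells.get(i) for i in col_ids] for a, cells in rows.items()]
    return col_ids, table


col_ids, table = pivot_by_id(data)
print(["a"] + col_ids)
for row in table:
    print(row)
```

Each printed row corresponds to one row of the pivoted result shown under Example 1, with None where the table shows null.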

