April 11, 2023 00:33:22

Replace a row in Python Polars

Question

I want to replace a row in a polars DataFrame with a single value:

import numpy as np
import polars as pl

df = np.zeros(shape=(4, 4))
df = pl.DataFrame(df)

For example, I want to replace all values in the row at index 1 with 1.0.

I was looking for a straightforward solution in the documentation, but I couldn't find one.

Answer 1

Score: 4

Explicit indexing is an anti-pattern in Polars. That said, with a with_row_count column you can build a new DataFrame in which the row is replaced, by using that extra column in a when/then expression (and not selecting it in the final result):

df.with_row_count().select(
    pl.when(pl.col("row_nr") == 1)
      .then(1)
      .otherwise(pl.col(c))
    .alias(c) for c in df.columns
)

shape: (4, 4)
┌──────────┬──────────┬──────────┬──────────┐
│ column_0 ┆ column_1 ┆ column_2 ┆ column_3 │
│ ---      ┆ ---      ┆ ---      ┆ ---      │
│ i32      ┆ i32      ┆ i32      ┆ i32      │
╞══════════╪══════════╪══════════╪══════════╡
│ 0        ┆ 0        ┆ 0        ┆ 0        │
│ 1        ┆ 1        ┆ 1        ┆ 1        │
│ 0        ┆ 0        ┆ 0        ┆ 0        │
│ 0        ┆ 0        ┆ 0        ┆ 0        │
└──────────┴──────────┴──────────┴──────────┘

EDIT: Two improvements:

- There's also a fairly recent cumcount that effectively acts as a row count expression in the base case. This keeps the whole query lazy.
- pl.all can be used to get rid of the generator comprehension above, combined with keep_name to avoid duplicate column errors.

df.select(
    pl.when(pl.all().cumcount() == 1)
      .then(1)
      .otherwise(pl.all())
    .keep_name()
)

shape: (4, 4)
┌──────────┬──────────┬──────────┬──────────┐
│ column_0 ┆ column_1 ┆ column_2 ┆ column_3 │
│ ---      ┆ ---      ┆ ---      ┆ ---      │
│ f64      ┆ f64      ┆ f64      ┆ f64      │
╞══════════╪══════════╪══════════╪══════════╡
│ 0.0      ┆ 0.0      ┆ 0.0      ┆ 0.0      │
│ 1.0      ┆ 1.0      ┆ 1.0      ┆ 1.0      │
│ 0.0      ┆ 0.0      ┆ 0.0      ┆ 0.0      │
│ 0.0      ┆ 0.0      ┆ 0.0      ┆ 0.0      │
└──────────┴──────────┴──────────┴──────────┘

(The result can be cast to whatever dtype you need from here.)
