How to apply fill_null to a given set of columns on a LazyFrame
Question
How do I fill nulls only in specific columns? Is there a way to pass a subset of columns?
Right now I am doing this:
df.with_columns(
    pl.col("col1").fill_null(strategy="zero"),
    pl.col("col2").fill_null(strategy="zero"),
)
I think there should be a way to do something like:
df.fill_null(strategy='zero', subset=['col1', 'col2'])
Answer 1
Score: 0
pl.col is more versatile than that: it accepts not just a single column name but also multiple column names, regular expressions, and wildcards. See the pl.col documentation for details.
So your problem becomes straightforward:
import polars as pl
df = pl.DataFrame(
{
"a": [8, 9, 10, 11, None, None],
"b": [None, 4, 4, 4, None, 4],
}
)
df.with_columns(pl.col("a", "b").fill_null(strategy="zero"))
shape: (6, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 8   ┆ 0   │
│ 9   ┆ 4   │
│ 10  ┆ 4   │
│ 11  ┆ 4   │
│ 0   ┆ 0   │
│ 0   ┆ 4   │
└─────┴─────┘