How to find 'not null' data in Polars

Question

import polars as pl

data = {
    'd1': [20, 31, 56, 44, 10, None],
    'd2': [37, 15, 27, 36, None, None],
    'd3': [48, 4, None, 88, None, None],
    'd4': [50, None, None, 9, None, None],
}
data_df = pl.DataFrame(data)
print(data_df)

result = {
    'd1': [20, 31, 56, 44, 10, None],
    'd2': [37, 15, 27, 36, None, None],
    'd3': [48, 4, None, 88, None, None],
    'd4': [50, None, None, 9, None, None],
    'result': [50, 4, 27, 9, 10, None]
}
result_df = pl.DataFrame(result)
print(result_df)

How can I find the rightmost non-null value in each row?

That is, as in the example above, how do I turn data_df into result_df?

Polars version = 0.17.12

Answer 1

Score: 1

pl.coalesce returns the first non-null value across the expressions it is given.

Passing the columns in reverse order therefore returns the last (rightmost) non-null value in each row.

df.with_columns(result = pl.coalesce(reversed(df.columns)))
shape: (6, 5)
┌──────┬──────┬──────┬──────┬────────┐
│ d1   ┆ d2   ┆ d3   ┆ d4   ┆ result │
│ ---  ┆ ---  ┆ ---  ┆ ---  ┆ ---    │
│ i64  ┆ i64  ┆ i64  ┆ i64  ┆ i64    │
╞══════╪══════╪══════╪══════╪════════╡
│ 20   ┆ 37   ┆ 48   ┆ 50   ┆ 50     │
│ 31   ┆ 15   ┆ 4    ┆ null ┆ 4      │
│ 56   ┆ 27   ┆ null ┆ null ┆ 27     │
│ 44   ┆ 36   ┆ 88   ┆ 9    ┆ 9      │
│ 10   ┆ null ┆ null ┆ null ┆ 10     │
│ null ┆ null ┆ null ┆ null ┆ null   │
└──────┴──────┴──────┴──────┴────────┘

You can also do it "manually" by collecting each row into a list, dropping the nulls from that list, and taking its last element.

df.with_columns(result = 
   pl.concat_list(pl.all())
     .arr.eval(pl.element().drop_nulls())
     .arr.last()
)

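Another option is pl.fold, which accumulates across the columns from left to right. Each non-null value replaces the accumulator, so the rightmost non-null value is what remains at the end.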

df.with_columns(result = 
   pl.fold(
      acc=None, 
      exprs=pl.all(), 
      function=lambda left, right: right.fill_null(left)
   )
)
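
To see why the fold ends up with the rightmost value, here is a plain-Python model of what it computes for each row. This is not Polars code: the names rows and keep_rightmost are illustrative, and functools.reduce stands in for pl.fold.

```python
from functools import reduce

# The rows of data_df from the question, as (d1, d2, d3, d4) tuples.
rows = [
    (20, 37, 48, 50),
    (31, 15, 4, None),
    (56, 27, None, None),
    (44, 36, 88, 9),
    (10, None, None, None),
    (None, None, None, None),
]

def keep_rightmost(left, right):
    # Mirrors right.fill_null(left): keep `right` unless it is null,
    # otherwise fall back to what has been accumulated so far.
    return right if right is not None else left

# Start from None (acc=None) and fold across each row, left to right.
result = [reduce(keep_rightmost, row, None) for row in rows]
print(result)  # [50, 4, 27, 9, 10, None]
```

In the Polars version the function receives whole columns rather than single values, but the per-row outcome follows the same rule.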

huangapple
  • Posted on May 7, 2023, 02:54:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76190590.html