What is a good way to divide all columns element-wise by a column-specific scalar in a polars dataframe?

Question

I have a Polars DataFrame containing some columns of integer values, e.g.:

>>> import polars as pl
>>> df = pl.DataFrame({'a':[3,3,3],'b':[6,6,9]})
>>> df
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 3   ┆ 6   │
│ 3   ┆ 6   │
│ 3   ┆ 9   │
└─────┴─────┘

Now I would like to make the data relative to the values in the first row, i.e. divide every value in column a by 3 and every value in column b by 6, to get:

>>> df
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f32 ┆ f32 │
╞═════╪═════╡
│ 1.0 ┆ 1.0 │
│ 1.0 ┆ 1.0 │
│ 1.0 ┆ 1.5 │
└─────┴─────┘

What would be a good way to implement this using Polars expressions?

I cannot find a divide method on a Polars Series, and it is unclear to me how to get the first value of a column inside a with_columns context.

Answer 1

Score: 1

Found a solution:

>>> import polars as pl
>>> df = pl.DataFrame({'a':[3,3,3],'b':[6,6,9]})
>>> df
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 3   ┆ 6   │
│ 3   ┆ 6   │
│ 3   ┆ 9   │
└─────┴─────┘
>>> df.with_columns(pl.all()/pl.all().head(1))
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f64 ┆ f64 │
╞═════╪═════╡
│ 1.0 ┆ 1.0 │
│ 1.0 ┆ 1.0 │
│ 1.0 ┆ 1.5 │
└─────┴─────┘

If someone has a better suggestion, it would be very welcome.

huangapple
  • Posted on May 22, 2023, 22:26:00
  • When reposting, please retain the link to this article: https://go.coder-hub.com/76307206.html