How to rename a column based on a condition in Polars (Python)?

Question

I am trying to rename a column based on a condition in Polars (Python), but I am getting errors.

Data:

```python
import polars as pl

test_df = pl.DataFrame({'Id': [100118647578,
  100023274028,100023274028,100023274028,100118647578,
  100118647578,100118647578,100023274028,100023274028,
  100023274028,100118647578,100118647578,100023274028,
  100118647578,100118647578,100118647578,100118647578,
  100118647578,100118647578,100023274028,100118647578,
  100118647578,100118647578,100118647578,100023274028,
  100118647578,100118647578,100118647578,100023274028,
  100118647578,100118647578,100023274028],

 'Age': [49,22,25,18,41,45,42,30,28,
  20,44,56,26,53,40,35,29,
  8,55,23,54,36,52,33,29,
  10,34,39,27,51,19,31],

 'Status': [2,1,1,1,1,1,1,3,2,1,1,
  1,2,1,1,1,1,1,1,2,1,1,1,1,2,1,1,
  1,1,1,1,4]})
```

The code below filters the data based on the value of the argument and renames the column on the same basis:

```python
def Age_filter(status_filter_value = 1):
    return (
        test_df
        .filter(pl.col('Status') == status_filter_value)
        .sort(['Id','Age'])
        .groupby('Id')
        .agg(pl.col('Age').first())
        .sort('Id')

        # the part below raises the error
        .rename({'Age' : pl.when(status_filter_value == 1)
                            .then('30_DPD_MOB')
                            .otherwise(pl.when(status_filter_value == 2)
                                       .then('60_DPD_MOB')
                                       .otherwise(pl.when(status_filter_value == 3)
                                                  .then('90_DPD_MOB')
                                                  .otherwise('120_DPD_MOB')
                                                  )
                                        )
                })
    )

Age_filter()
```

This raises an error: `TypeError: argument 'new': 'Expr' object cannot be converted to 'PyString'`

I have also tried the code below, but it does not work either:

```python
def Age_filter1(status_filter_value = 1):
    {
    renamed_value = pl.when(status_filter_value == 1)
                            .then('30')
                            .otherwise(pl.when(status_filter_value == 2)
                                       .then('60')
                                       .otherwise(pl.when(status_filter_value == 3)
                                                  .then('90')
                                                  .otherwise('120')
                                                  )
                                        )


    return (
        test_df
        .filter(pl.col('Status') == status_filter_value)
        .sort(['Id','Age'])
        .groupby('Id')
        .agg(pl.col('Age').first())
        .sort('Id')
        .rename({'Age' : renamed_value
                })
    )
    }

Age_filter1()
```

Answer 1

Score: 1

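As the error states, the `rename` method accepts only a dict that maps strings to strings. No complicated expression is needed here. In fact, `pl.when` and the related functions are meant to operate on expressions (conditions evaluated against columns), not on a static Python int, so they always produce an `Expr` and never a plain string.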

For your case, you can build the new name programmatically in plain Python, for example:

```python
.rename({'Age': f'{status_filter_value*30}_DPD_MOB'})
```

EDIT: Alternatively, as suggested in the comments, set the name directly in the `agg`:

```python
.agg(pl.col('Age').first().alias(f'{status_filter_value*30}_DPD_MOB'))
```

huangapple
  • Posted on July 14, 2023 at 04:19:39
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76682983.html