How to rename a column based on a condition in Polars (Python)?
Question
I am trying to rename a column based on a condition in Polars (Python), but I am getting errors.
Data:
```python
import polars as pl

test_df = pl.DataFrame({
    'Id': [100118647578,
           100023274028,100023274028,100023274028,100118647578,
           100118647578,100118647578,100023274028,100023274028,
           100023274028,100118647578,100118647578,100023274028,
           100118647578,100118647578,100118647578,100118647578,
           100118647578,100118647578,100023274028,100118647578,
           100118647578,100118647578,100118647578,100023274028,
           100118647578,100118647578,100118647578,100023274028,
           100118647578,100118647578,100023274028],
    'Age': [49,22,25,18,41,45,42,30,28,
            20,44,56,26,53,40,35,29,
            8,55,23,54,36,52,33,29,
            10,34,39,27,51,19,31],
    'Status': [2,1,1,1,1,1,1,3,2,1,1,
               1,2,1,1,1,1,1,1,2,1,1,1,1,2,1,1,
               1,1,1,1,4]})
```
The code below filters the data by the value passed as an argument and renames the column on the same basis:
```python
def Age_filter(status_filter_value = 1):
    return (
        test_df
        .filter(pl.col('Status') == status_filter_value)
        .sort(['Id','Age'])
        .groupby('Id')
        .agg( pl.col('Age').first())
        .sort('Id')
        # the part below raises the error
        .rename({'Age' : pl.when(status_filter_value == 1)
                           .then('30_DPD_MOB')
                           .otherwise(pl.when(status_filter_value == 2)
                                        .then('60_DPD_MOB')
                                        .otherwise(pl.when(status_filter_value == 3)
                                                     .then('90_DPD_MOB')
                                                     .otherwise('120_DPD_MOB')
                                                  )
                                     )
                })
    )

Age_filter()
```
This gives an error: `TypeError: argument 'new': 'Expr' object cannot be converted to 'PyString'`
I have also tried the code below, but it does not work either:
```python
def Age_filter1(status_filter_value = 1):
    {
    renamed_value = pl.when(status_filter_value == 1)
                      .then('30')
                      .otherwise(pl.when(status_filter_value == 2)
                                   .then('60')
                                   .otherwise(pl.when(status_filter_value == 3)
                                                .then('90')
                                                .otherwise('120')
                                             )
                                )
    return (
        test_df
        .filter(pl.col('Status') == status_filter_value)
        .sort(['Id','Age'])
        .groupby('Id')
        .agg( pl.col('Age').first())
        .sort('Id')
        .rename({'Age' : renamed_value
                })
    )
    }

Age_filter1()
```
Answer 1
Score: 1
As the error states, the `rename` method takes only a dict of string to string. No complicated expressions are needed - in fact, `pl.when`, etc. are meant to take expressions, not a static int value.
For your case, you can build the new name programmatically, for example:
```python
.rename({'Age': f'{status_filter_value*30}_DPD_MOB'})
```
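Note that the f-string yields the same names as the question's `when`/`then` chain only for status values 1 to 4; for any other value the original chain falls back to `120_DPD_MOB`, while the multiplication keeps growing. If that fallback matters, a plain dictionary lookup is one possible sketch. The names `DPD_NAMES` and `dpd_column_name` are hypothetical helpers for illustration, not part of Polars.

```python
# Hypothetical lookup table: status value -> new column name.
DPD_NAMES = {1: "30_DPD_MOB", 2: "60_DPD_MOB", 3: "90_DPD_MOB"}


def dpd_column_name(status_filter_value: int) -> str:
    """Return an ordinary str, which is what rename() and alias() expect."""
    # Any status other than 1, 2 or 3 falls back to the 120-day label,
    # mirroring the final .otherwise(...) in the question.
    return DPD_NAMES.get(status_filter_value, "120_DPD_MOB")
```

The result is a regular Python string, so it can be passed as `.rename({'Age': dpd_column_name(status_filter_value)})`.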
EDIT: Or, per the comments below, directly in the `agg`:
```python
.agg(pl.col('Age').first().alias(f'{status_filter_value*30}_DPD_MOB'))
```