Add a column of string using list's indices from another column
# Question
I have this list of names:
name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']
I also have the following DataFrame `df`:
df = pd.DataFrame(
{
'idx': ['(3,)','(3, 5)','(1, 3, 5)',
'(1, 3, 5, 10)','(1, 3, 5, 8, 10)'],
'score': [0.773,0.841,0.862,0.874,0.883]
}
)
df.head(2)
idx score
0 (3,) 0.773
1 (3, 5) 0.841
The `idx` column holds indices into `name_list`, stored as strings that look like tuples. I want to add a new column `name` to `df` containing the corresponding names from the list.
Expected results:
idx score name
0 (3,) 0.773 (D,)
1 (3, 5) 0.841 (D, F)
2 (1, 3, 5) 0.862 (B, D, F)
3 (1, 3, 5, 10) 0.874 (B, D, F, K)
4 (1, 3, 5, 8, 10) 0.883 (B, D, F, I, K)
# Answer 1
**Score**: 1
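You need a few steps:
- create a dictionary that maps each index to its value in the list,
- convert the string representations of tuples into actual tuples with [`ast.literal_eval`](https://docs.python.org/3/library/ast.html#ast.literal_eval),
- map the values with a generator expression wrapped in `tuple()`.
```python
from ast import literal_eval

# Map each position to its name: {0: 'A', 1: 'B', ...}
d = dict(enumerate(name_list))

# Parse each string into a tuple of ints, then look up every index;
# indices missing from the mapping become '?'
df['name'] = [tuple(d.get(x, '?') for x in literal_eval(t))
              for t in df['idx']]
```
If you are sure that the indices are valid, there is no need for the dictionary:
```python
df['name'] = [tuple(name_list[x] for x in literal_eval(t))
              for t in df['idx']]
```
For a string as output (note that a single-element row then renders as `(D)`, without the trailing comma):
```python
df['name'] = [f"({', '.join(name_list[x] for x in literal_eval(t))})"
              for t in df['idx']]
```
Output (for the tuple versions):
```
                idx  score             name
0              (3,)  0.773             (D,)
1            (3, 5)  0.841           (D, F)
2         (1, 3, 5)  0.862        (B, D, F)
3     (1, 3, 5, 10)  0.874     (B, D, F, K)
4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)
```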
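As a self-contained illustration of the steps in this answer, the sketch below shows what the `'?'` fallback does when an index is out of range, and one possible way to get a string that keeps the trailing comma of a one-element tuple. The row containing `99`, the variable `lookup`, and the helpers `names_for` and `format_like_tuple` are made up for this example and are not part of the original answer.

```python
from ast import literal_eval

import pandas as pd

name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

# Hypothetical data: the index 99 in the last row is deliberately out of range.
df = pd.DataFrame({
    'idx': ['(3,)', '(3, 5)', '(1, 99)'],
    'score': [0.773, 0.841, 0.5],
})

# Position -> name, e.g. {0: 'A', 1: 'B', ...}
lookup = dict(enumerate(name_list))


def names_for(idx_string, default='?'):
    """Parse a string such as '(3, 5)' and return the matching names as a tuple."""
    return tuple(lookup.get(i, default) for i in literal_eval(idx_string))


def format_like_tuple(names):
    """Render names without quotes, keeping the trailing comma of a 1-tuple."""
    if len(names) == 1:
        return f"({names[0]},)"
    return f"({', '.join(names)})"


df['name'] = [names_for(t) for t in df['idx']]
df['name_str'] = [format_like_tuple(n) for n in df['name']]

print(df['name'].tolist())      # [('D',), ('D', 'F'), ('B', '?')]
print(df['name_str'].tolist())  # ['(D,)', '(D, F)', '(B, ?)']
```

Whether a placeholder such as `'?'` or an `IndexError` is preferable depends on how much you trust the `idx` column; indexing `name_list` directly would raise on `99` instead of silently filling in a placeholder.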
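# Answer 2
**Score**: 1
Here is a way using `str.findall()` and `explode()`:
```python
# assign() returns a new DataFrame; df itself is not modified
df.assign(
    name=(df['idx'].str.findall(r'\d+')              # digit runs in each string, e.g. ['3', '5']
                   .explode()                        # one row per extracted index
                   .astype(int)
                   .map(dict(enumerate(name_list)))  # index -> name
                   .groupby(level=0).agg(tuple)))    # regroup by original row label
```
Output:
```
                idx  score             name
0              (3,)  0.773             (D,)
1            (3, 5)  0.841           (D, F)
2         (1, 3, 5)  0.862        (B, D, F)
3     (1, 3, 5, 10)  0.874     (B, D, F, K)
4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)
```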
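To check that the `explode()` pipeline gives the same tuples as the `literal_eval` approach, here is a minimal runnable sketch using the data from the question. The names `lookup`, `via_explode`, `via_literal_eval`, and `result` are chosen for this example only.

```python
from ast import literal_eval

import pandas as pd

name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

df = pd.DataFrame({
    'idx': ['(3,)', '(3, 5)', '(1, 3, 5)',
            '(1, 3, 5, 10)', '(1, 3, 5, 8, 10)'],
    'score': [0.773, 0.841, 0.862, 0.874, 0.883],
})

lookup = dict(enumerate(name_list))

# Regex route: pull out the digit runs, flatten, map, then regroup per row.
via_explode = (
    df['idx'].str.findall(r'\d+')
    .explode()                     # repeats the original row label for each index
    .astype(int)
    .map(lookup)
    .groupby(level=0).agg(tuple)   # level=0 is the original row label
)

# Parsing route: evaluate each string as a Python literal and index the list.
via_literal_eval = [tuple(name_list[i] for i in literal_eval(t))
                    for t in df['idx']]

# assign() returns a new DataFrame and leaves df untouched.
result = df.assign(name=via_explode)

print(result)
print(via_explode.tolist() == via_literal_eval)  # True
```

Two caveats with the regex route, as far as this sketch goes: `\d+` ignores a minus sign, so a negative index would be read as positive, and a row with no digits at all (for example `'()'`) explodes to a missing value, which `astype(int)` cannot convert.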