Add a column of strings using a list's indices from another column

Question

Given this list of names:

    name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

And the following `df`:

    import pandas as pd

    df = pd.DataFrame(
        {
            'idx': ['(3,)', '(3, 5)', '(1, 3, 5)',
                    '(1, 3, 5, 10)', '(1, 3, 5, 8, 10)'],
            'score': [0.773, 0.841, 0.862, 0.874, 0.883]
        }
    )
    df.head(2)

which displays:

          idx  score
    0    (3,)  0.773
    1  (3, 5)  0.841

The `idx` column holds, as strings, tuples of indices into `name_list`. I want to add a new column `name` to `df` containing the corresponding names from the list.

Expected results:

                    idx  score             name
    0              (3,)  0.773             (D,)
    1            (3, 5)  0.841           (D, F)
    2         (1, 3, 5)  0.862        (B, D, F)
    3     (1, 3, 5, 10)  0.874     (B, D, F, K)
    4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)

Answer 1

Score: 1

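You need a few steps:

- create a dictionary mapping each index to its value in the list,
- convert the string representations of tuples into actual tuples with [`ast.literal_eval`](https://docs.python.org/3/library/ast.html#ast.literal_eval),
- map the values with a generator expression passed to `tuple()`.

```python
from ast import literal_eval

# Map each position to its name: {0: 'A', 1: 'B', ...}
d = dict(enumerate(name_list))

# Parse each string into a tuple of ints, then look up every index;
# indices missing from the mapping become '?'
df['name'] = [tuple(d.get(x, '?') for x in literal_eval(t))
              for t in df['idx']]
```

If you are sure that the indices are valid, there is no need for the dictionary:

```python
df['name'] = [tuple(name_list[x] for x in literal_eval(t))
              for t in df['idx']]
```

For a string as output:

```python
df['name'] = [f"({', '.join(name_list[x] for x in literal_eval(t))})"
              for t in df['idx']]
```

Output (tuple versions; the string version renders the first row as `(D)`):

```
                idx  score             name
0              (3,)  0.773             (D,)
1            (3, 5)  0.841           (D, F)
2         (1, 3, 5)  0.862        (B, D, F)
3     (1, 3, 5, 10)  0.874     (B, D, F, K)
4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)
```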
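
The per-row conversion in this answer can also be checked in isolation, without pandas. The sketch below is only an illustration: the helper name `indices_to_names` and its `missing` parameter are made up for this example and do not come from the answer.

```python
from ast import literal_eval

name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

# Position -> name lookup, built once and reused for every row.
lookup = dict(enumerate(name_list))


def indices_to_names(idx_string, lookup, missing='?'):
    """Hypothetical helper: parse a string such as '(3, 5)' and return
    the matching names as a tuple, using `missing` for unknown indices."""
    return tuple(lookup.get(i, missing) for i in literal_eval(idx_string))


idx_strings = ['(3,)', '(3, 5)', '(1, 3, 5)']
names = [indices_to_names(s, lookup) for s in idx_strings]
print(names)

# An index past the end of the list falls back to the placeholder
# instead of raising IndexError.
print(indices_to_names('(3, 42)', lookup))
```

With a DataFrame, the same helper could be applied as `df['name'] = [indices_to_names(t, lookup) for t in df['idx']]`.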

Answer 2

Score: 1

Here is a way using `str.findall()` and `explode()`:

```python
# assign() returns a new DataFrame and leaves df itself unchanged
df.assign(
    name=(df['idx'].str.findall(r'\d+')      # digit runs per row, as lists of strings
          .explode()                         # one row per index, original row label kept
          .astype(int)                       # '3' -> 3
          .map(dict(enumerate(name_list)))   # position -> name
          .groupby(level=0).agg(tuple)))     # regroup by original row into tuples
```

Output:

```
                idx  score             name
0              (3,)  0.773             (D,)
1            (3, 5)  0.841           (D, F)
2         (1, 3, 5)  0.862        (B, D, F)
3     (1, 3, 5, 10)  0.874     (B, D, F, K)
4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)
```
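
To see that the two answers agree, both can be run side by side on the question's data. This is a minimal self-contained sketch; the variable names `lookup`, `via_literal_eval` and `via_findall` are chosen here for illustration and are not part of either answer.

```python
from ast import literal_eval

import pandas as pd

name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

df = pd.DataFrame({
    'idx': ['(3,)', '(3, 5)', '(1, 3, 5)',
            '(1, 3, 5, 10)', '(1, 3, 5, 8, 10)'],
    'score': [0.773, 0.841, 0.862, 0.874, 0.883],
})

lookup = dict(enumerate(name_list))

# Answer 1: parse each string into a tuple, then look up every index.
via_literal_eval = [tuple(lookup[i] for i in literal_eval(t))
                    for t in df['idx']]

# Answer 2: extract digit runs, explode, map, regroup by original row.
via_findall = (
    df['idx'].str.findall(r'\d+')
    .explode()
    .astype(int)
    .map(lookup)
    .groupby(level=0).agg(tuple)
)

# Compare as plain Python lists of tuples.
print(via_findall.tolist() == via_literal_eval)
```

One difference worth keeping in mind: `\d+` only picks up runs of digits, so a minus sign in `idx` would be ignored, whereas `literal_eval` parses the tuple exactly as written.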

huangapple
  • Posted on June 19, 2023, 22:59:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76507853.html