June 19, 2023 22:59:09

Add a column of strings using list indices from another column

Question

Given this list of names:

name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

and the following DataFrame:

import pandas as pd

df = pd.DataFrame(
    {
        'idx': ['(3,)', '(3, 5)', '(1, 3, 5)',
                '(1, 3, 5, 10)', '(1, 3, 5, 8, 10)'],
        'score': [0.773, 0.841, 0.862, 0.874, 0.883]
    }
)
df.head(2)

      idx  score
0    (3,)  0.773
1  (3, 5)  0.841

The idx column holds indices into name_list, stored as string representations of tuples. I want to add a new column, name, to df containing the corresponding names from the list.

Expected result:

                idx  score             name
0              (3,)  0.773             (D,)
1            (3, 5)  0.841           (D, F)
2         (1, 3, 5)  0.862        (B, D, F)
3     (1, 3, 5, 10)  0.874     (B, D, F, K)
4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)

Answer 1

Score: 1

You need a few steps:
- create a dictionary that maps each index to the corresponding value of the list,
- convert the string representations of tuples to tuples with [`ast.literal_eval`](https://docs.python.org/3/library/ast.html#ast.literal_eval),
- map the values with a generator expression wrapped in `tuple()`.

```python
from ast import literal_eval

# Map each position in name_list to its name: {0: 'A', 1: 'B', ...}
d = dict(enumerate(name_list))

# Parse each string into a tuple of ints, then look up every index;
# indices missing from the dictionary become '?'
df['name'] = [tuple(d.get(x, '?') for x in literal_eval(t))
              for t in df['idx']]
```

If you are sure that the indices are valid, there is no need for the dictionary:

```python
df['name'] = [tuple(name_list[x] for x in literal_eval(t))
              for t in df['idx']]
```

For a string as output:

```python
df['name'] = [f"({', '.join(tuple(name_list[x] for x in literal_eval(t)))})"
              for t in df['idx']]
```

Output:

```
                idx  score             name
0              (3,)  0.773             (D,)
1            (3, 5)  0.841           (D, F)
2         (1, 3, 5)  0.862        (B, D, F)
3     (1, 3, 5, 10)  0.874     (B, D, F, K)
4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)
```
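
As a rough, self-contained sketch of the `literal_eval` approach, the lookup and the string formatting can be wrapped in two small helpers. The names `indices_to_names`, `format_names`, `idx_strings` and the `missing` parameter are hypothetical and only used for illustration; the sample data is the one from the question. One detail worth noting: a plain `', '.join(...)` wrapped in parentheses renders a single name as `(D)`, so the sketch adds the trailing comma explicitly to get `(D,)`.

```python
from ast import literal_eval

name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']
# Stand-in for df['idx'] so the sketch runs without pandas
idx_strings = ['(3,)', '(3, 5)', '(1, 3, 5)',
               '(1, 3, 5, 10)', '(1, 3, 5, 8, 10)']


def indices_to_names(idx_string, names, missing='?'):
    """Hypothetical helper: parse a tuple string and look up each index."""
    lookup = dict(enumerate(names))  # index -> name
    return tuple(lookup.get(i, missing) for i in literal_eval(idx_string))


def format_names(names):
    """Hypothetical helper: render names like a tuple, without quotes."""
    body = ', '.join(names)
    # Keep the trailing comma for one-element tuples, e.g. "(D,)"
    return f"({body},)" if len(names) == 1 else f"({body})"


name_tuples = [indices_to_names(s, name_list) for s in idx_strings]
name_strings = [format_names(t) for t in name_tuples]

print(name_tuples[0])   # ('D',)
print(name_strings[0])  # (D,)

# An index outside the list falls back to the placeholder
print(indices_to_names('(3, 42)', name_list))  # ('D', '?')
```

With a DataFrame, the same helper could be applied as `df['name'] = [indices_to_names(s, name_list) for s in df['idx']]`.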

Answer 2

Score: 1

Here is a way using `str.findall()` and `explode()`:

```python
df.assign(
    name=(df['idx'].str.findall(r'\d+')     # digit runs in each string, e.g. ['3', '5']
          .explode()                        # one row per index, row labels repeated
          .astype(int)
          .map(dict(enumerate(name_list)))  # index -> name
          .groupby(level=0).agg(tuple)))    # regroup by original row into tuples
```

Output:

```
                idx  score             name
0              (3,)  0.773             (D,)
1            (3, 5)  0.841           (D, F)
2         (1, 3, 5)  0.862        (B, D, F)
3     (1, 3, 5, 10)  0.874     (B, D, F, K)
4  (1, 3, 5, 8, 10)  0.883  (B, D, F, I, K)
```
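
The `findall`/`explode` chain can be tried end to end with the data from the question. This is only a sketch: the intermediate names `lookup`, `names` and `result` are assumptions added for readability. It also makes explicit that `assign` returns a new DataFrame and leaves `df` unchanged, so the result has to be kept in a variable.

```python
import pandas as pd

name_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']
df = pd.DataFrame(
    {
        'idx': ['(3,)', '(3, 5)', '(1, 3, 5)',
                '(1, 3, 5, 10)', '(1, 3, 5, 8, 10)'],
        'score': [0.773, 0.841, 0.862, 0.874, 0.883],
    }
)

lookup = dict(enumerate(name_list))  # index -> name

names = (
    df['idx']
    .str.findall(r'\d+')   # '(3, 5)' -> ['3', '5']
    .explode()             # one row per index, original row label repeated
    .astype(int)           # '3' -> 3
    .map(lookup)           # 3 -> 'D'
    .groupby(level=0)      # group by the original row label
    .agg(tuple)            # collect each group's names into a tuple
)

# assign() returns a new DataFrame; df itself is not modified
result = df.assign(name=names)
print(result)
```

Because this variant extracts digits with a regular expression instead of parsing the string as Python syntax, it does not depend on `ast.literal_eval`.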
