Question

I have code that iterates through a large dataframe using df.itertuples().

I understand this is slow, and that it can be much faster if I convert the dataframe to a list of dicts with df.to_dict('records').

The problem I am having is simply how to access the values.

I need to use enumerate (because I need the index value).

With itertuples, it is simple (to me) to access values with i.Index or i.column_1.

How do I do this with the dicts after enumerate?

import pandas as pd

df = pd.DataFrame(data={'column_1': [1,2,3,4,5], 'column_2':[5,10,15,20,25], 'column_3':[3,6,9,12,15], 'column_4':[8,7,6,5,4]})
display(df)  # display() is available in Jupyter/IPython; use print(df) elsewhere

### original code using itertuples
for i in df.itertuples():
  if i.Index % 2 == 0:
    print('column_1 value using itertuples',i.column_1)
  else:
    print('skip this row using itertuples')

### trying as dict
df_dict = df.to_dict('records')
for row in df_dict:
  for index, (key,value) in enumerate(row.items()):
    if index % 2 == 0:
      print('column_1 value',row['column_1']) ### this does not work
    else:
      print('skip this row')




# Answer 1
**Score**: 1

Your inner loop is not necessary. It enumerates the columns of each row rather than the rows themselves. Enumerate the list of records directly:

```python
for i, row in enumerate(df.to_dict('records')):
    if i % 2 == 0:
        print('column_1 value', row['column_1'])

# Output for the sample data frame:
# column_1 value 1
# column_1 value 3
# column_1 value 5
```

Note, however, that this won't give you the actual index of your data frame: enumerate just gives you an increasing counter (it enumerates the items in the iterable). The counter matches the index here only because the sample data frame uses the default 0, 1, 2, ... index.
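If the real index labels matter, they can be carried along with the records. The snippet below is only a sketch and not part of the original answer: the non-default index `[11, 12, 13, 14, 15]` and the variable names are assumptions, chosen so that labels and positions differ. It compares the `enumerate` counter with two ways of getting the labels: zipping `df.index` with the records, and `df.to_dict('index')`, which assumes the index labels are unique.

```python
import pandas as pd

# Sample data from the question, with a hypothetical non-default index
# so that index labels differ from row positions.
df = pd.DataFrame(
    data={'column_1': [1, 2, 3, 4, 5], 'column_2': [5, 10, 15, 20, 25]},
    index=[11, 12, 13, 14, 15],
)

records = df.to_dict('records')  # list of {column: value} dicts, one per row

# enumerate yields positions 0, 1, 2, ... regardless of the index labels.
picked_enumerate = [row['column_1'] for i, row in enumerate(records) if i % 2 == 0]

# Option 1: pair each record with its index label.
picked_zip = [
    row['column_1']
    for label, row in zip(df.index, records)
    if label % 2 == 0
]

# Option 2: to_dict('index') maps each label to its row dict
# (this assumes the index labels are unique).
picked_by_label = [
    row['column_1']
    for label, row in df.to_dict('index').items()
    if label % 2 == 0
]

print(picked_enumerate)  # rows at even positions
print(picked_zip)        # rows with even index labels
print(picked_by_label)   # same rows as picked_zip
```

With the default index used in the question, all three lists would be identical, which is why the counter from enumerate happens to work there.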
