List index out of range error when iterating Pandas DataFrame
Question
I am using a for loop to iterate through a pandas dataframe and append values to separate lists to then convert back to another pandas df. However, I am receiving an index out of range error when trying to call a value that has been previously appended to a list in the same iteration. NOTE: all lists (date, dow, rt, etc.) have been initialized prior to this code. Didn't want to make the post too long.
Here's my code:
# Iterate through df to append to initialized lists
# list.append: O(log^2(n)) vs df.append: O(n^2)
# Append to lists, then create df2 with completed lists
for i, row in df.iterrows():
    date.append(str(df.at[i, 'Route Date'])[0:10])
    dow.append(str(df.at[i, 'Route Number'])[0])
    rt.append(str(df.at[i, 'Route Number'])[1:4])
    # Append route type and municipality using lookup table
    # Lookup table will be packaged with PyInstaller
    for j, row in lookup.iterrows():
        if df.at[i, 'Route Number'] == lookup.at[j, 'Route']:
            rt_type.append(str(lookup.at[j, 'Route Type']))
            muni.append(str(lookup.at[j, 'Municipality']))
    miles.append(df.at[i, 'Miles'])
    disp_tons.append(df.at[i, 'Disposal Tons'])
    disp_loads.append(df.at[i, 'Disposal Loads'])
    stops.append(df.at[i, 'Stops'])
    clk_hrs.append(df.at[i, 'Clock Hours'])
    travel.append((miles[i]) / 22)
    # Service time varies by truck type and municipality
    if rt_type[i] == 'AFEL' and muni[i] == 'Vestavia Hills':
        service.append((stops[i]*17)/3600)
    disp.append((disp_loads[i])*(22/60))
    pre_post.append("0.78")
    target_clk_hrs.append(travel[i] + service[i] + disp[i] + 1.28)
    variance.append(target_clk_hrs[i] - clk_hrs[i])
    continue
The index error occurs at target_clk_hrs.append(travel[i] + service[i] + disp[i] + 1.28) when calling the value of service[i]. When running this without the if statement (if rt_type[i] == 'AFEL' and muni[i] == 'Vestavia Hills':) and instead using only service.append((stops[i]*17)/3600), I run into no indexing errors. I am confused as to (A) why this works without the if statement, and (B) why travel doesn't run into an index error if service does.
I'm assuming the issue lies with the iterator for j, row in lookup.iterrows(). Note that lookup is a separate lookup table used to return values to some of the lists conditionally. I've tried using a break statement on this loop and am still getting the same error.
My other thought was that rt_type[i] and muni[i] in the if block are not being indexed correctly and therefore service is not being appended, but I haven't been able to come up with a fix for this.
I've also consulted this post to no avail.
Answer 1
Score: 1
At the i-th step of the iteration, travel[i] is the last element of travel, because your code appends to travel exactly once per iteration. Assuming df has the default 0-based index, that makes i + 1 insertions by that point, so index i is valid.
However, insertions into service are conditional:
if rt_type[i] == 'AFEL' and muni[i] == 'Vestavia Hills':
    service.append((stops[i]*17)/3600)
When the condition is not satisfied, the append doesn't happen and service ends up one element shorter than the other lists.
Right after the first time the append is skipped, service has length i (valid indices 0 through i - 1), so service[i] raises an IndexError.
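A minimal sketch of the effect described above, using made-up values rather than the asker's data. The lists miles, stops and rt_type are hypothetical stand-ins, and filling skipped rows with NaN is only one possible fix, since the question does not say what the service time should be for other route types:

```python
import math

# Hypothetical stand-ins for the asker's columns (values are made up).
miles = [44.0, 66.0, 22.0]
stops = [360, 720, 180]
rt_type = ["AFEL", "ASL", "AFEL"]

travel, service = [], []
error_at = None
for i in range(len(miles)):
    travel.append(miles[i] / 22)  # unconditional: len(travel) == i + 1
    if rt_type[i] == "AFEL":
        service.append(stops[i] * 17 / 3600)  # conditional: may be skipped
    try:
        service[i]
    except IndexError:
        error_at = i  # first iteration where service fell behind
        break

# One possible fix: append on every iteration so all lists stay aligned.
service_fixed = []
for i in range(len(miles)):
    if rt_type[i] == "AFEL":
        service_fixed.append(stops[i] * 17 / 3600)
    else:
        service_fixed.append(math.nan)  # placeholder; the real rule is unknown
```

With this data the lookup fails on the second row, the first one whose route type is not AFEL, while travel[i] stays valid throughout.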
P.S. There are no examples of input and output, so it's hard to say, but your task could probably be solved in a more pandas-idiomatic way, avoiding all the lists.
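As a rough illustration of what a list-free version might look like, the sketch below joins the lookup table with merge and computes the derived columns on whole columns at once. The column names come from the question; the sample rows (including the 'ASL' type and 'Hoover' municipality), the output column names, and the choice of NaN for rows that fail the condition are assumptions made for the example:

```python
import numpy as np
import pandas as pd

# Made-up sample rows; column names follow the question.
df = pd.DataFrame({
    "Route Number": ["1101", "2202"],
    "Miles": [44.0, 66.0],
    "Disposal Loads": [2, 1],
    "Stops": [360, 720],
    "Clock Hours": [8.0, 9.0],
})
lookup = pd.DataFrame({
    "Route": ["1101", "2202"],
    "Route Type": ["AFEL", "ASL"],
    "Municipality": ["Vestavia Hills", "Hoover"],
})

# A left join keeps every row of df, even when no lookup row matches.
out = df.merge(lookup, how="left", left_on="Route Number", right_on="Route")

out["travel"] = out["Miles"] / 22
is_afel_vh = (out["Route Type"] == "AFEL") & (out["Municipality"] == "Vestavia Hills")
# Rows that fail the condition get NaN instead of being skipped,
# so every column keeps the same length.
out["service"] = np.where(is_afel_vh, out["Stops"] * 17 / 3600, np.nan)
out["disp"] = out["Disposal Loads"] * (22 / 60)
out["target_clk_hrs"] = out["travel"] + out["service"] + out["disp"] + 1.28
out["variance"] = out["target_clk_hrs"] - out["Clock Hours"]
```

Because every derived column is computed for every row, the columns cannot drift out of alignment the way conditionally appended lists can. Rows without a defined service time simply carry NaN through to target_clk_hrs and variance.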