List index out of range error when iterating Pandas DataFrame

Question

I am using a for loop to iterate through a pandas dataframe and append values to separate lists to then convert back to another pandas df. However, I am receiving an index out of range error when trying to call a value that has been previously appended to a list in the same iteration. NOTE: all lists (date, dow, rt, etc.) have been initialized prior to this code. Didn't want to make the post too long.

Here's my code:

# Iterate through df to append to initialized lists
# list.append is amortized O(1); df.append copies the whole frame on every call
# Append to lists, then create df2 from the completed lists
for i, row in df.iterrows():
    date.append(str(df.at[i, 'Route Date'])[0:10])
    dow.append(str(df.at[i, 'Route Number'])[0])
    rt.append(str(df.at[i, 'Route Number'])[1:4])
    # Append route type and municipality using lookup table
    # Lookup table will be packaged with PyInstaller
    for j, row in lookup.iterrows():
        if df.at[i, 'Route Number'] == lookup.at[j, 'Route']:
            rt_type.append(str(lookup.at[j, 'Route Type']))
            muni.append(str(lookup.at[j, 'Municipality']))
    miles.append(df.at[i, 'Miles'])
    disp_tons.append(df.at[i, 'Disposal Tons'])
    disp_loads.append(df.at[i, 'Disposal Loads'])
    stops.append(df.at[i, 'Stops'])
    clk_hrs.append(df.at[i, 'Clock Hours'])
    travel.append((miles[i]) / 22)
    # Service time varies by truck type and municipality
    if rt_type[i] == 'AFEL' and muni[i] == 'Vestavia Hills':
        service.append((stops[i]*17)/3600)
    disp.append((disp_loads[i])*(22/60))
    pre_post.append("0.78")
    target_clk_hrs.append(travel[i] + service[i] + disp[i] + 1.28)
    variance.append(target_clk_hrs[i] - clk_hrs[i])
    continue

The index error is occurring at target_clk_hrs.append(travel[i] + service[i] + disp[i] + 1.28) when calling the value of service[i]. When running this without the if statement: if rt_type[i] == 'AFEL' and muni[i] == 'Vestavia Hills': and instead using only service.append((stops[i]*17)/3600), I run into no indexing errors. I am confused as to why A. this works without the if statement, and B. why travel doesn't run into an index error if service does.

I'm assuming the issue lies with the iterator for j, row in lookup.iterrows(). Note that lookup is a separate lookup table used to return values to some of the lists conditionally. I've tried using a break statement on this loop and am still getting the same error.

My other thought was that rt_type[i] and muni[i] in the if block are not being indexed correctly and therefore service is not being appended, but I haven't been able to come up with a fix for this.

I've also consulted this post to no avail.

Answer 1

Score: 1

At the i-th step of the iteration (counting from 0, and assuming df has the default 0..n-1 index), travel[i] is the last element of travel, because your code appends to travel unconditionally: by that point exactly i + 1 values have been appended.

However, insertions into service are conditional:

if rt_type[i] == 'AFEL' and muni[i] == 'Vestavia Hills':
    service.append((stops[i]*17)/3600)

When the condition is not satisfied, the append doesn't happen, so service ends up one element shorter than the lists that are appended to on every iteration.

The first time the append is skipped, at step i, service has only i elements (indices 0 to i - 1), so service[i] raises IndexError: list index out of range.
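The effect is easy to reproduce on a toy example. The sketch below uses made-up values for stops and rt_type (not the asker's data) and checks only the route type to keep it short. The 0.0 placeholder in the else branch is an assumption, since the question does not say what the service time should be for other route types.

```python
# Hypothetical toy data standing in for the asker's lists.
stops = [100, 200, 300]
rt_type = ["AFEL", "REL", "AFEL"]

# Conditional append: service falls behind as soon as one row is skipped.
service = []
failed_at = None
for i in range(len(stops)):
    if rt_type[i] == "AFEL":
        service.append(stops[i] * 17 / 3600)
    try:
        service[i]  # same access pattern as service[i] in the question
    except IndexError:
        failed_at = i  # first step at which service has no element i
        break

# One possible fix: append exactly one value on every iteration.
service_fixed = []
for i in range(len(stops)):
    if rt_type[i] == "AFEL":
        service_fixed.append(stops[i] * 17 / 3600)
    else:
        service_fixed.append(0.0)  # assumed placeholder, not from the question

print(failed_at, len(service), len(service_fixed))
```

With an else branch (or a default value computed before the if), every list grows by one element per row, so indexing with i stays valid.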

P.S. There are no examples of input and output, so it's hard to say for sure, but your task could probably be solved in a more idiomatic pandas way that avoids all the lists.
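As a rough illustration of what that might look like, here is a minimal sketch that replaces the nested loops with a merge against the lookup table and computes the derived columns on whole columns at once. The sample rows (including the "Hoover" municipality and the route numbers) are invented for the example. Leaving service as NaN for routes that don't match the condition is an assumption, because the question only gives the rule for AFEL routes in Vestavia Hills.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "Route Date": pd.to_datetime(["2023-07-03", "2023-07-03"]),
    "Route Number": ["M101", "T202"],
    "Miles": [44.0, 66.0],
    "Disposal Loads": [2, 1],
    "Stops": [720, 360],
    "Clock Hours": [9.0, 8.0],
})
lookup = pd.DataFrame({
    "Route": ["M101", "T202"],
    "Route Type": ["AFEL", "REL"],
    "Municipality": ["Vestavia Hills", "Hoover"],
})

# A left merge replaces the inner loop over lookup; routes missing
# from lookup keep their row and get NaN in the lookup columns.
df2 = df.merge(lookup, how="left", left_on="Route Number", right_on="Route")

df2["travel"] = df2["Miles"] / 22
is_afel_vh = (df2["Route Type"] == "AFEL") & (df2["Municipality"] == "Vestavia Hills")
# Assumption: rows that don't match the rule get NaN rather than a guessed rate.
df2["service"] = np.where(is_afel_vh, df2["Stops"] * 17 / 3600, np.nan)
df2["disp"] = df2["Disposal Loads"] * (22 / 60)
df2["target_clk_hrs"] = df2["travel"] + df2["service"] + df2["disp"] + 1.28
df2["variance"] = df2["target_clk_hrs"] - df2["Clock Hours"]

print(df2[["Route Number", "service", "target_clk_hrs", "variance"]])
```

Because every column is computed for all rows together, the columns cannot drift out of step the way the separate lists do. Note that if lookup contained the same route more than once, the merge would produce duplicate rows, so the lookup table should have one row per route.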

huangapple
  • Posted on July 7, 2023, 01:25:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76631223.html