July 13, 2023 22:49:51

IndexError: list index out of range at table_data[headers[len(table_data)]] = values

Question

I have this Python script that reads an HTML file and then creates a database table from it, but I'm getting an IndexError: list index out of range error on the line table_data[headers[len(table_data)]] = values. I haven't worked with Python much, but I've tried a few things with the for loop and with len(table_data).

from bs4 import BeautifulSoup
# Specify the path to your HTML file
html_file_path = 'C:/Users/tom/service/soc/Soc.html'
# Read the contents of the HTML file
with open(html_file_path, 'r') as file:
    html = file.read()
# Parse the HTML
soup = BeautifulSoup(html, 'html.parser')
# Find all the tables in the HTML
tables = soup.find_all('table')
# Iterate over the tables
for table in tables:
    # Find the table's ID attribute
    table_id = table.get('id')
    # Extract the table headers
    headers = [th.get_text() for th in table.find('thead').find_all('th')]
    # Create a dictionary to store the table data
    table_data = {}
    # Iterate over the table rows
    for row in table.find('tbody').find_all('tr'):
        # Extract the row cells
        cells = row.find_all('td')
        # Extract the cell values
        values = [cell.get_text().strip() for cell in cells]
        # Store the values with their corresponding headers in the dictionary
        table_data[headers[len(table_data)]] = values
    # Generate the PostgreSQL table script
    create_table_script = f"CREATE TABLE {table_id} (\n"
    for header, values in table_data.items():
        create_table_script += f"    {header} {', '.join(values)},\n"
    create_table_script = create_table_script.rstrip(',\n') + "\n);\n"
    # Print the table script
    print(create_table_script)

Answer 1

Score: 1

You didn't provide the full error message, but I think I see the problem. I'll assume that table_data and headers are two related lists of the same length. Remember that list indexes start at zero, so len(lst) is one greater than the last valid index.

To illustrate:

lst = [1, 2, 3]
print(lst[len(lst)])  # IndexError: list index out of range

Assuming headers and table_data are the same length, len(table_data) is out of range for both lists. If you were trying to get the last item in a list, just use lst[-1].
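
Note that in the question's code table_data is a dict and headers is a list, so another possible cause is that the dict gains one key per row and len(table_data) eventually exceeds the last header index. The sketch below reproduces that under the assumption that the table has more body rows than header cells, then shows one way around it by pairing each header with the cell in the same position. The sample headers and rows are made up and stand in for what BeautifulSoup would return, and the column-wise layout is a guess at the intended result.

```python
# Hypothetical stand-ins for the values BeautifulSoup would extract from one table.
headers = ["id", "name"]                                  # two <th> cells
rows = [["1", "alice"], ["2", "bob"], ["3", "carol"]]     # three <tr> rows


def build_failing(headers, rows):
    """Mirror the question's loop: one new dict key per row."""
    table_data = {}
    for values in rows:
        # len(table_data) grows by one per row, so the third row asks for headers[2].
        table_data[headers[len(table_data)]] = values
    return table_data


def build_by_column(headers, rows):
    """Pair each header with the cell in the same position of every row."""
    table_data = {header: [] for header in headers}
    for values in rows:
        # zip stops at the shorter sequence, so it never indexes past the headers.
        for header, value in zip(headers, values):
            table_data[header].append(value)
    return table_data


try:
    build_failing(headers, rows)
except IndexError as exc:
    print(f"IndexError: {exc}")

columns = build_by_column(headers, rows)
print(columns)
```

With real HTML, headers and each row's values would come from the find_all calls in the question. Whether a column-wise dict is the right shape depends on what the CREATE TABLE script is meant to contain.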
