IndexError: list index out of range at table_data[headers[len(table_data)]] = values
Question
I have this Python script that reads an HTML file and then creates a database table from it, but I'm getting an `IndexError: list index out of range` error on the line `table_data[headers[len(table_data)]] = values`.
I haven't worked with Python much, but I've tried changing the for loop and the `len(table_data)` expression.
from bs4 import BeautifulSoup

# Specify the path to your HTML file
html_file_path = 'C:/Users/tom/service/soc/Soc.html'

# Read the contents of the HTML file
with open(html_file_path, 'r') as file:
    html = file.read()

# Parse the HTML
soup = BeautifulSoup(html, 'html.parser')

# Find all the tables in the HTML
tables = soup.find_all('table')

# Iterate over the tables
for table in tables:
    # Find the table's ID attribute
    table_id = table.get('id')

    # Extract the table headers
    headers = [th.get_text() for th in table.find('thead').find_all('th')]

    # Create a dictionary to store the table data
    table_data = {}

    # Iterate over the table rows
    for row in table.find('tbody').find_all('tr'):
        # Extract the row cells
        cells = row.find_all('td')
        # Extract the cell values
        values = [cell.get_text().strip() for cell in cells]
        # Store the values with their corresponding headers in the dictionary
        table_data[headers[len(table_data)]] = values

    # Generate the PostgreSQL table script
    create_table_script = f"CREATE TABLE {table_id} (\n"
    for header, values in table_data.items():
        create_table_script += f" {header} {', '.join(values)},\n"
    create_table_script = create_table_script.rstrip(',\n') + "\n);\n"

    # Print the table script
    print(create_table_script)
Answer 1
Score: 1
You didn't provide the full error message, but I think I see the problem. I'll assume that `table_data` and `headers` are two related lists of the same length. Remember that list indexes start at zero, so `len(lst)` returns a number one greater than the last valid index.
To illustrate:
lst = [1, 2, 3]
print(lst[len(lst)])  # IndexError: list index out of range
Assuming `headers` and `table_data` are the same length, `len(table_data)` returns a number that is out of range for both lists. If you were trying to get the last item in the list, just use `lst[-1]`.
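One detail worth checking against the script in the question: there, `table_data` is created with `table_data = {}`, so it is a dict that gains one key per row rather than a list of fixed length. `len(table_data)` therefore counts up 0, 1, 2, ... as rows are processed, and the lookup fails as soon as the table has more body rows than header cells. The sketch below reproduces that with plain lists standing in for the parsed HTML. The sample data and the helper names `pair_by_length`, `pair_by_zip`, and `pair_by_column` are hypothetical, and which pairing is wanted depends on what the table is supposed to represent.

```python
# Hypothetical stand-ins for the parsed HTML: 2 header cells, 3 body rows.
headers = ["name", "type"]
rows = [["id", "integer"], ["title", "text"], ["created", "date"]]


def pair_by_length(headers, rows):
    """Mimic the question's loop: index headers by the dict's current size."""
    table_data = {}
    for values in rows:
        # len(table_data) grows by one per row, so the third row
        # asks for headers[2], which does not exist.
        table_data[headers[len(table_data)]] = values
    return table_data


def pair_by_zip(headers, rows):
    """Pair each header with one row; zip stops at the shorter input."""
    return dict(zip(headers, rows))


def pair_by_column(headers, rows):
    """Pair each header with the column of cells beneath it."""
    return dict(zip(headers, zip(*rows)))


try:
    pair_by_length(headers, rows)
except IndexError as exc:
    print(f"IndexError: {exc}")

print(pair_by_zip(headers, rows))
print(pair_by_column(headers, rows))
```

Note that `pair_by_zip` silently drops the rows that have no matching header, which may hide a problem in the input. If each header is meant to label a column of the table, `pair_by_column` is probably closer to the intent.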