英文:
Pandas read excel and flag rows with striketrough
问题
我需要使用Python在Excel表格上工作,以便稍后将其放入数据库。我的问题是,我有相当多的行/单元格有删除线。我希望正常读取Excel,并标记具有一个或多个删除线单元格的每一行。我找到了一种跳过这些行的解决方案,但我想要标记它。
我确实找到了使用openpyxl打印有删除线的行的方法,但我想使用pandas,因为我以后需要添加和删除许多列,而pandas在这方面表现最佳。
英文:
I have to work on an excel sheet with python so I can later put it in a database.
My problem is, that I have quite a lot of rows/cells with striketrough. I want to read the excel normally and flag every row with one or more striketrough cells. I just found a solution for skipping these rows but I want to flag it.
I did find a way to print the strikethrough rows using openpyxl but I wanted to use pandas because I have to add and cut a lot of columns later on and pandas is best for that.
答案1
得分: 0
这是一个示例,用于说明一般的逻辑。基本上,我们使用Font.strikethrough
来检查单元格的字体是否被划掉。
from openpyxl import load_workbook
ws = load_workbook("file.xlsx")["Sheet1"]
data = [[cell for cell in row] for row in ws.iter_rows(values_only=True)]
df = pd.DataFrame(data[1:], columns=data[0])
df["Flag"] = [
any(cell.font.strikethrough for cell in row) for row in ws.iter_rows(min_row=2)
]
输出:
print(df)
col1 col2 col3 Flag
0 foo 1 test1 True
1 bar 2 test2 False
2 qux 3 test3 True
3 baz 4 test4 False
*使用的输入:*
[![进入图片描述][2]][2]
[1]: https://openpyxl.readthedocs.io/en/stable/api/openpyxl.styles.fonts.html?highlight=strike#openpyxl.styles.fonts.Font.strikethrough
[2]: https://i.stack.imgur.com/PMNgP.png
英文:
It's hard to imagine how looks like your spreadsheet but here is an example to give you the general logic. Basically, we use Font.strikethrough
from [tag:openpyxl] to check if a the font of a cell is striked or not.
from openpyxl import load_workbook
ws = load_workbook("file.xlsx")["Sheet1"]
data = [[cell for cell in row] for row in ws.iter_rows(values_only=True)]
df = pd.DataFrame(data[1:], columns=data[0])
df["Flag"] = [
any(cell.font.strikethrough for cell in row) for row in ws.iter_rows(min_row=2)
]
Output :
print(df)
col1 col2 col3 Flag
0 foo 1 test1 True
1 bar 2 test2 False
2 qux 3 test3 True
3 baz 4 test4 False
Input used :
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论