2023年6月2日 02:28:37go评论112阅读模式

英文:

When reading an excel file in Python can we know which column/field is filtered

问题

我想在通过Python读取Excel文件时捕获被筛选的字段或列名。我看到我们也可以使用openpyxl仅捕获未隐藏的行，并使用hidden == False（https://stackoverflow.com/questions/46002159/how-to-import-filtered-excel-table-into-python）。在我的项目中，识别Excel文件中被筛选的字段/列非常重要。这是否可能？如何实现？

添加一个示例。

pip install openpyxl
from openpyxl import load_workbook

wb = load_workbook('test_filter_column.xlsx')
ws = wb['data']

这是未隐藏的数据

而如果性别列如下所示被筛选

所以我期望的输出应该是性别，因为它被筛选了。如果有多个字段被筛选，则期望提供所有被筛选的列名。

英文:

I want to capture the field or column name that is filtered in the excel file when reading through python. I saw that we can also capture only the filtered rows by using openpyxl and using hidden == False (https://stackoverflow.com/questions/46002159/how-to-import-filtered-excel-table-into-python). In my project it is important to identify which field/column is filtered in the excel file. Is it possible? and how to achieve?
Adding an example.

pip install openpyxl
from openpyxl import load_workbook

 wb = load_workbook(&#39;test_filter_column.xlsx&#39;)
 ws = wb[&#39;data&#39;]

This is non hidden data
while if the gender column is filtered below

So what I am expecting is my output should be giving gender as that is filtered. If more than one field is filtered then expecting to provide all the filtered column names.

答案1

得分: 5

受此帖子启发，但适应您的情况，并且考虑到存在多个筛选条件时：


from openpyxl import load_workbook
from openpyxl.utils import get_column_letter

# 加载工作簿
wb = load_workbook('/path/to/xlsx')
# 提取工作表
ws = wb['data']

# 创建一个字典以存储相关信息
filters = {}

# 获取被筛选的列的ID
filters['col_id'] = [col.col_id for col in ws.auto_filter.filterColumn]

# 获取被筛选的列的字母
# 这似乎是从1开始索引的，但get_column_letter
# 是从0开始索引的
filters['col_letter'] = [get_column_letter(col + 1) for col in filters['col_id']]

# 提取列名 - 假设它在列的第1行中给出
filters['col_name'] = [ws[f'{col}1'].value for col in filters['col_letter']]

# 获取正在使用的筛选器的值
filters['filter_values'] = [col.filters.filter for col in ws.auto_filter.filterColumn]

print(filters)

输出：

{'col_id': [3], 'col_letter': ['D'], 'col_name': ['gender'], 'filter_values': [['F']]}

我认为这涵盖了您提供的示例，并希望还显示了您可能想要的其他信息。

英文:

Inspired by this post, but adapted to your case, and also accounting for when there are multiple filters:


from openpyxl import load_workbook
from openpyxl.utils import get_column_letter

# Load workbook
wb = load_workbook(&#39;/path/to/xlsx&#39;)
# Extract sheet
ws = wb[&#39;data&#39;]

# Create a dict to store relevant info
filters = {}

# Get the ID of the columns that are filtered
filters[&#39;col_id&#39;] = [col.col_id for col in ws.auto_filter.filterColumn]

# Get the letter of the columns that are filtered
# This appears to be one-indexed, but get_column_letter
# is zero indexed
filters[&#39;col_letter&#39;] = [get_column_letter(col + 1) for col in filters[&#39;col_id&#39;]]

# Extract the column name - assuming that it is
# given in row 1 of the column
filters[&#39;col_name&#39;] = [ws[f&#39;{col}1&#39;].value for col in filters[&#39;col_letter&#39;]]

# Get the values of the filters being used
filters[&#39;filter_values&#39;] = [col.filters.filter for col in ws.auto_filter.filterColumn]

print(filters)

Output:

{&#39;col_id&#39;: [3], &#39;col_letter&#39;: [&#39;D&#39;], &#39;col_name&#39;: [&#39;gender&#39;], &#39;filter_values&#39;: [[&#39;F&#39;]]}

I think this covers the examples you gave, and hopefully shows some other info you might want too.

答案2

得分: 0

from openpyxl import load_workbook
wb = load_workbook('hello.xlsx')
ws = wb['S1']
filter_features = []
for column in ws.iter_cols(min_col=1, max_col=ws.max_column):
    if column[0].hidden == False:
        for col in filter_features:
            print(f"{col}")

filter_features.append(column[0].column_letter)

This will help you. Moreover, if you want to find filter rows then you must see Filtered Rows


<details>
<summary>英文:</summary>



    from openpyxl import load_workbook
    wb = load_workbook(&#39;hello.xlsx&#39;)
    ws = wb[&#39;S1&#39;]
    filter_features = []
    for column in ws.iter_cols(min_col=1, max_col=ws.max_column):
        if column[0].hidden == False:
           for col in filter_features:
              print(f&quot;{col}&quot;)
           
    filter_features.append(column[0].column_letter)





This will help you. Moreover, if you want to find filter rows then you must see 
[Filtered Rows][1]


  [1]: https://stackoverflow.com/questions/63085533/python-openpyxl-how-to-iterate-through-1-rows-columns-to-find-a-value

</details>

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

在Python中读取Excel文件时，我们可以知道哪一列/字段被筛选了吗？

问题

答案1

答案2

重定向URL路径到端口

Python pandas 删除列，如果它们的部分名称在列表或 pandas 列中。

in python, i am using pandas to read a csv file when its working fine tho with other functions but facing an error while using a line

如何将内存中的动画 GIF 发送到 FastAPI 端点？

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论