Power Query – filtering after grouping only in situations where there is more than one same value per one grouped value – is it possible?


Question

I have a table in Power Query:

Insurance Claim No  Subcategory Code  Approval Date  Vendor Name
IE0225873           I_REP_DAM         12.10.2022     XERO
IE0225873           I_OTHER           12.10.2022     NERO
IE0225874           I_REP_DAM         12.10.2022     XERO
IE0225874           I_OTHER           13.10.2022     NERO
IE0225874           I_OTHER           12.10.2022     NERO
IE0225875           I_INS             20.10.2022     XERO
IE0225875           I_REP_DAM         20.10.2022     NERO
IE0225876           I_DAM             20.11.2022     XERO
IE0225876           I_REP             30.12.2022     NERO

The desired table would be:

Insurance Claim No  Subcategory Code  Approval Date  Vendor Name
IE0225873           I_REP_DAM         12.10.2022     XERO
IE0225874           I_REP_DAM         12.10.2022     XERO
IE0225874           I_OTHER           13.10.2022     NERO
IE0225875           I_REP_DAM         20.10.2022     NERO
IE0225876           I_DAM             20.11.2022     XERO
IE0225876           I_REP             30.12.2022     NERO

I first grouped the table because I want one row per "Insurance Claim No" for each "Approval Date". However, a single claim can have several rows with the same "Approval Date", which produces duplicate rows. When that happens, I want to keep only the row where "Subcategory Code" = "I_REP_DAM". This condition must apply only when one "Insurance Claim No" has more than one row with the same "Approval Date". If there is only one row per "Approval Date" for a claim, any "Subcategory Code" is acceptable. In the example, for claim IE0225873 the desired result is the first row, with "Vendor Name" = "XERO".

Once the "Vendor Name" column is also taken into account, this logic of course will not work.

I thought about using Group By to count the rows for each unique "Insurance Claim No" that share the same date, and then adding a conditional column: if a claim has only one row for a date, the column says "Filter"; if it has more than one and "Subcategory Code" = I_REP_DAM, it also says "Filter"; otherwise it says "Not filter". Then I could simply filter on "Filter". But Group By hides "Subcategory Code", so the formula will not work, and if I include "Subcategory Code" in the grouping, the counts are wrong because every row gets a count of 1.

Without "Vendor Name" I could solve this easily: add a conditional column "IF" that is 1 when "Subcategory Code" = "I_REP_DAM" and 0 otherwise, group by all columns except "Subcategory Code", and aggregate with the sum of the "IF" column. That way I always get exactly one row per "Insurance Claim No" and "Approval Date".

Is this possible somehow?

Any hint would be greatly appreciated.

Answer 1

Score: 0


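This is a quick and easy way: sort so that I_REP_DAM comes first within each claim, group by claim and approval date, and keep the first row of each group.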

let
    // Load the source table from the current workbook (the table is named "Table1")
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Insurance Claim No", type text}, {"Subcategory Code", type text}, {"Approval Date", type date}, {"Vendor Name", type text}}),
    // Sort descending on "Subcategory Code" so that, within each claim, I_REP_DAM comes before I_OTHER and I_INS
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Insurance Claim No", Order.Ascending}, {"Subcategory Code", Order.Descending}}),
    // Group by claim and approval date, keeping each group's rows as a nested table in "All"
    #"Grouped Rows" = Table.Group(#"Sorted Rows", {"Insurance Claim No", "Approval Date"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"All", each _, type table [Insurance Claim No=nullable text, Subcategory Code=nullable text, Approval Date=nullable date, Vendor Name=nullable text]}}),
    // Take the first row of each group as a record
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.First([All])),
    // Drop the grouping columns, then expand the record back into the original four columns
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Insurance Claim No", "Approval Date", "Count", "All"}),
    #"Expanded Custom" = Table.ExpandRecordColumn(#"Removed Columns", "Custom", {"Insurance Claim No", "Subcategory Code", "Approval Date", "Vendor Name"}, {"Insurance Claim No", "Subcategory Code", "Approval Date", "Vendor Name"})
in
    #"Expanded Custom"
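
The sort-based query above works on the sample because I_REP_DAM happens to sort after I_OTHER and I_INS alphabetically, so a descending sort puts it first. If other codes could sort after it, a variant that applies the condition explicitly inside each group may be safer. The following is only a sketch under stated assumptions: the step names (Grouped, Kept, Result) and the "Rows" column are made up for illustration, and the source rows are typed inline with #table so the query runs without a workbook table.

```powerquery
let
    // Inline copy of the sample rows from the question (assumption: dates already typed as date)
    Source = #table(
        type table [Insurance Claim No = text, Subcategory Code = text, Approval Date = date, Vendor Name = text],
        {
            {"IE0225873", "I_REP_DAM", #date(2022, 10, 12), "XERO"},
            {"IE0225873", "I_OTHER",   #date(2022, 10, 12), "NERO"},
            {"IE0225874", "I_REP_DAM", #date(2022, 10, 12), "XERO"},
            {"IE0225874", "I_OTHER",   #date(2022, 10, 13), "NERO"},
            {"IE0225874", "I_OTHER",   #date(2022, 10, 12), "NERO"},
            {"IE0225875", "I_INS",     #date(2022, 10, 20), "XERO"},
            {"IE0225875", "I_REP_DAM", #date(2022, 10, 20), "NERO"},
            {"IE0225876", "I_DAM",     #date(2022, 11, 20), "XERO"},
            {"IE0225876", "I_REP",     #date(2022, 12, 30), "NERO"}
        }
    ),
    // Group by claim and approval date, keeping each group's rows as a nested table
    Grouped = Table.Group(Source, {"Insurance Claim No", "Approval Date"}, {{"Rows", each _, type table}}),
    // Filter on I_REP_DAM only in groups with more than one row; single-row groups pass through unchanged
    Kept = Table.TransformColumns(Grouped, {{"Rows", (t) =>
        if Table.RowCount(t) > 1
        then Table.SelectRows(t, each [Subcategory Code] = "I_REP_DAM")
        else t}}),
    // Stack the surviving rows back into one flat table with the original columns
    Result = Table.Combine(Kept[Rows])
in
    Result
```

On the sample data this should return the six rows of the desired table. Two behaviours are worth noting, since the question does not specify them: a group with several rows but no I_REP_DAM row disappears entirely, and a group with several I_REP_DAM rows keeps all of them.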

huangapple
  • Posted on 27 February 2023 at 16:40:30
  • When republishing, please keep the link to this article: https://go.coder-hub.com/75578316.html