我需要在Power Query中进行动态汇总,通过对表中重复的值进行求和或连接。

huangapple go评论80阅读模式
英文:

I need to make a dynamic aggregation in Power Query, by summing or concatenating the duplicated values in my tables

问题

这是你的数据示例:

样本 方法A 方法B 方法C 方法D 方法E 批次编号 实验室数据
样本 1 1 2 8 TX_0001 LAB1
样本 1 5 9 TX_0002 LAB2
样本 2 7 8 8 23 TX_0001 LAB1
样本 2 41 TX_0001 LAB2
样本 3 11 55 TX_0394 LAB2
样本 4 2 9 5 9 TX_0394 LAB1

我需要创建一个M语言代码,根据重复的样本将它们合并在一起。请注意,它们可能在同一批次和/或同一个实验室中,但它们不会用相同的方法制备两次。

因此,我不能传递列名,因为它们不断变化,我希望能够动态传递列名。

注意:我有可能将源表格建立为Microsoft Access的链接表,然后使用SQL来处理,但我找不到MS Access库中的文本聚合函数。在那里,每个列名都可以很容易地处理。(只是我的公司没有其他人懂M语言,我不能让这个过程不自动化)

这是我一直在尝试改进的部分,但我一直遇到一些错误:

  1. 两个分组列中的所有单元格都有“Errors”。
  2. 评估耗尽内存。

我无法找出我在这里做错了什么。

let
    Source = ALS,
    schema = Table.Schema(Source),
    columns = schema[Name],
    types = schema[Kind],
    Table = Table.FromColumns({columns,types}),
    Number_Columns = Table.SelectRows(Table, each ([Column2] = "number")),
    Other_Columns = Table.SelectRows(Table, each ([Column2] <> "number")),
    numCols = Table.Column(Number_Columns, "Column1"),
    textColsSID = List.Select(Table.ColumnNames(Source), each Table.Column(Source, _) <> type number),
    textCols = List.RemoveItems(textColsSID, {"Sample ID"}),
    groupedNum = Table.Group(Source, {"Sample ID"},List.Transform(numCols, each {_, (nmr) => List.Sum(nmr),type nullable number})),
    groupedText = Table.Group(Source,{"Sample ID"},List.Transform(textCols, each {_, (tbl) => Text.Combine(tbl, "_")}),
    merged = Table.NestedJoin(groupedNum, {"Sample ID"}, groupedText, {"Sample ID"}, "merged"),
    expanded = Table.ExpandTableColumn(merged, "merged", Table.ColumnNames(merged{1}[merged]))
in
    expanded

这是我期望的结果:

样本 方法A 方法B 方法C 方法D 方法E 批次编号 实验室数据
样本 1 1 2 5 9 8 TX_0001_TX_0002 LAB1_LAB2
样本 2 7 8 8 23 41 TX_0001_TX_0001 LAB1_LAB1
样本 3 11 55 TX_0394 LAB2
样本 4 2 9 5 9 TX_0394 LAB1
英文:

Here's an example of my data:

Sample Method A Method B Method C Method D Method E BATCH Nu Lab Data
Sample 1 1 2 8 TX_0001 LAB1
Sample 1 5 9 TX_0002 LAB2
Sample 2 7 8 8 23 TX_0001 LAB1
Sample 2 41 TX_0001 LAB2
Sample 3 11 55 TX_0394 LAB2
Sample 4 2 9 5 9 TX_0394 LAB1

I need to make a M Language code that unites them, based on duplicated samples. Note that they might be in the same batch and/or in the same lab, but they won't ever be made the same method twice.

So I can't pass the column names, because they keep changing, and I wanted to do it passaing the column names dynamically.

**OBS: I have the possibility to make a linked table of the source to a Microsoft Access and make this with SQL, but I couldn't find a text aggregation function in MS Access library. There it's possible to each column name with no problem. (Just a matter that no one else knows M Language in my company and I can't let this be non-automated)
**

This is the what I have been trying to improve, but I keep have some errors:
1.Both goruped columns have "Errors" in all of the cells
2.Evaluation running out of memory

I can't discover what I'm doing wrong here.

let
    Source = ALS,
    schema = Table.Schema(Source),
    columns = schema[Name],
    types = schema[Kind],
    Table = Table.FromColumns({columns,types}),
    Number_Columns = Table.SelectRows(Table, each ([Column2] = &quot;number&quot;)),
    Other_Columns = Table.SelectRows(Table, each ([Column2] &lt;&gt; &quot;number&quot;)),
    numCols = Table.Column(Number_Columns, &quot;Column1&quot;),
    textColsSID = List.Select(Table.ColumnNames(Source), each Table.Column(Source, _) &lt;&gt; type number),
    textCols = List.RemoveItems(textColsSID, {&quot;Sample ID&quot;}),
    groupedNum = Table.Group(Source, {&quot;Sample ID&quot;},List.Transform(numCols, each {_, (nmr) =&gt; List.Sum(nmr),type nullable number})),
    groupedText = Table.Group(Source,{&quot;Sample ID&quot;},List.Transform(textCols, each {_, (tbl) =&gt; Text.Combine(tbl, &quot;_&quot;)})),
    merged = Table.NestedJoin(groupedNum, {&quot;Sample ID&quot;}, groupedText, {&quot;Sample ID&quot;}, &quot;merged&quot;),
    expanded = Table.ExpandTableColumn(merged, &quot;merged&quot;, Table.ColumnNames(merged{1}[merged]))
in
    expanded

This is what I expected to have:

Sample Method A Method B Method C Method D Method E BATCH Nu Lab Data
Sample 1 1 2 5 9 8 TX_0001_TX_0002 LAB1_LAB2
Sample 2 7 8 8 23 41 TX_0001_TX_0001 LAB1_LAB1
Sample 3 11 55 TX_0394 LAB2
Sample 4 2 9 5 9 TX_0394 LAB1

答案1

得分: 2

这是一个方法,它仅假定第一列是用于分组不同样本的列。

它不对任何列名或列数做出假设。

它测试每列的前10行(去除任何空值后)以确定列的类型是否可以是“数值类型”,否则它将假定为“文本类型”。

如果存在其他可能的数据类型,类型检测代码可以进行扩展。

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

//从前十行动态检测数据类型
//仅检测“文本”和“数值”
    colNames = Table.ColumnNames(Source),
    checkRows = 10,
    colTestTypes = List.Generate(
        ()=>[t=
            let 
                Values = List.FirstN(Table.Column(Source,colNames{0}),10),
                tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From(_))[HasError])
            in 
                tryNumber,  idx=0],
        each [idx] < List.Count(colNames),
        each [t=
            let 
                Values = List.FirstN(Table.Column(Source,colNames{[idx]+1}),10),
                tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From(_))[HasError])
            in 
                tryNumber,  idx=[idx]+1],
        each [t]),
    colTypes = List.Transform(colTestTypes, each if List.AllTrue(_) then type text else type number),

//基于第一列分组并求和或连接列
    group = Table.Group(Source,{colNames{0}},
        {"rw", (t)=>
        Record.FromList(
            List.Generate(
                ()=>[rw=if colTypes{1} = type number  
                            then List.Sum(Table.Column(t,colNames{1}))
                            else Text.Combine(Table.Column(t,colNames{1}),"_"), 
                        idx=1],
                    each [idx] < List.Count(colNames),
                    each [rw=if colTypes{[idx]+1} = type number  
                                then List.Sum(Table.Column(t,colNames{[idx]+1}))
                                else Text.Combine(Table.Column(t,colNames{[idx]+1}),"_"), 
                            idx=[idx]+1],
                    each [rw]), List.RemoveFirstN(colNames,1)), type record}
        ),

//展开记录列并设置数据类型
    #"Expanded rw" = Table.ExpandRecordColumn(group, "rw", List.RemoveFirstN(colNames,1)),
    #"Set Data Type" = Table.TransformColumnTypes(#"Expanded rw", List.Zip({colNames, colTypes}))

in
    #"Set Data Type"

原始数据
我需要在Power Query中进行动态汇总,通过对表中重复的值进行求和或连接。

结果
我需要在Power Query中进行动态汇总,通过对表中重复的值进行求和或连接。


<details>
<summary>英文:</summary>
Here is a method which assumes only that the first column is a column which will be used to group the different samples.
It makes no assumptions about any column names, or the numbers of columns.
It tests the first 10 rows in each column (after removing any nulls) to determine if the column type can be `type number`, otherwise it will assume `type text`.
If there are other possible data types, the type detection code can be expanded.

let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

//dynamically detect data types from first ten rows
//only detecting "text" and "number"
colNames = Table.ColumnNames(Source),
checkRows = 10,
colTestTypes = List.Generate(
()=>[t=
let
Values = List.FirstN(Table.Column(Source,colNames{0}),10),
tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From())[HasError])
in
tryNumber, idx=0],
each [idx] < List.Count(colNames),
each [t=
let
Values = List.FirstN(Table.Column(Source,colNames{[idx]+1}),10),
tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From(
))[HasError])
in
tryNumber, idx=[idx]+1],
each [t]),
colTypes = List.Transform(colTestTypes, each if List.AllTrue(_) then type text else type number),

//Group and Sum or Concatenate columns, keying on the first column
group = Table.Group(Source,{colNames{0}},
{"rw", (t)=>
Record.FromList(
List.Generate(
()=>[rw=if colTypes{1} = type number
then List.Sum(Table.Column(t,colNames{1}))
else Text.Combine(Table.Column(t,colNames{1}),""),
idx=1],
each [idx] < List.Count(colNames),
each [rw=if colTypes{[idx]+1} = type number
then List.Sum(Table.Column(t,colNames{[idx]+1}))
else Text.Combine(Table.Column(t,colNames{[idx]+1}),"
"),
idx=[idx]+1],
each [rw]), List.RemoveFirstN(colNames,1)), type record}
),

//expand the record column and set the data types
#"Expanded rw" = Table.ExpandRecordColumn(group, "rw", List.RemoveFirstN(colNames,1)),
#"Set Data Type" = Table.TransformColumnTypes(#"Expanded rw", List.Zip({colNames, colTypes}))

in
#"Set Data Type"

***Original Data***    
[![enter image description here][1]][1]
***Results***    
[![enter image description here][2]][2]
[1]: https://i.stack.imgur.com/WEVpC.png
[2]: https://i.stack.imgur.com/jrPZx.png
</details>
# 答案2
**得分**: 1
以下是代码的翻译部分:
```javascript
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
names = List.Distinct(List.Select(Table.ColumnNames(Source), each Text.Contains(_,"Method")),
"Grouped Rows" = Table.Group(Source, {"Sample"}, {{"data", each _, type table }}),
"Added Custom" = Table.AddColumn("Grouped Rows", "Batch Nu", each Text.Combine(List.Distinct([data][BATCH Nu]),"_")),
"Added Custom1" = Table.AddColumn("Added Custom", "Lab Data", each Text.Combine(List.Distinct([data][Lab Data]),"_")),
"Added Custom2" = Table.AddColumn("Added Custom1", "Custom", each Table.SelectRows(Table.UnpivotOtherColumns([data], {"Sample"}, "Attribute", "Value"), each List.Contains(names,[Attribute])),
"Added Custom3" = Table.AddColumn("Added Custom2", "Custom.1", each Table.Pivot([Custom], List.Distinct([Custom][Attribute]), "Attribute", "Value", List.Sum)),
"Expanded Custom.1" = Table.ExpandTableColumn("Added Custom3" , "Custom.1", names,names),
"Removed Columns" = Table.RemoveColumns("Expanded Custom.1",{"data", "Custom"})
in "Removed Columns"

希望这对您有所帮助。

英文:

One way. You could probably do this all within the group as well

let Source = Excel.CurrentWorkbook(){[Name=&quot;Table1&quot;]}[Content],
names = List.Distinct(List.Select(Table.ColumnNames(Source), each Text.Contains(_,&quot;Method&quot;))),
#&quot;Grouped Rows&quot; = Table.Group(Source, {&quot;Sample&quot;}, {{&quot;data&quot;, each _, type table }}),
#&quot;Added Custom&quot; = Table.AddColumn(#&quot;Grouped Rows&quot;, &quot;Batch Nu&quot;, each Text.Combine(List.Distinct([data][BATCH Nu]),&quot;_&quot;)),
#&quot;Added Custom1&quot; = Table.AddColumn(#&quot;Added Custom&quot;, &quot;Lab Data&quot;, each Text.Combine(List.Distinct([data][Lab Data]),&quot;_&quot;)),
#&quot;Added Custom2&quot; = Table.AddColumn(#&quot;Added Custom1&quot;, &quot;Custom&quot;, each Table.SelectRows(Table.UnpivotOtherColumns([data], {&quot;Sample&quot;}, &quot;Attribute&quot;, &quot;Value&quot;), each List.Contains(names,[Attribute]))),
#&quot;Added Custom3&quot; = Table.AddColumn(#&quot;Added Custom2&quot;, &quot;Custom.1&quot;, each Table.Pivot([Custom], List.Distinct([Custom][Attribute]), &quot;Attribute&quot;, &quot;Value&quot;, List.Sum)),
#&quot;Expanded Custom.1&quot; = Table.ExpandTableColumn(#&quot;Added Custom3&quot; , &quot;Custom.1&quot;, names,names),
#&quot;Removed Columns&quot; = Table.RemoveColumns(#&quot;Expanded Custom.1&quot;,{&quot;data&quot;, &quot;Custom&quot;})
in #&quot;Removed Columns&quot;

我需要在Power Query中进行动态汇总,通过对表中重复的值进行求和或连接。

huangapple
  • 本文由 发表于 2023年2月19日 02:42:31
  • 转载请务必保留本文链接:https://go.coder-hub.com/75495569.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定