2023年2月19日 02:42:31go评论80阅读模式

英文:

I need to make a dynamic aggregation in Power Query, by summing or concatenating the duplicated values in my tables

问题

这是你的数据示例：

样本	方法A	方法B	方法C	方法D	方法E	批次编号	实验室数据
样本 1	1	2			8	TX_0001	LAB1
样本 1			5	9		TX_0002	LAB2
样本 2	7	8	8	23		TX_0001	LAB1
样本 2					41	TX_0001	LAB2
样本 3			11	55		TX_0394	LAB2
样本 4	2	9	5	9		TX_0394	LAB1

我需要创建一个M语言代码，根据重复的样本将它们合并在一起。请注意，它们可能在同一批次和/或同一个实验室中，但它们不会用相同的方法制备两次。

因此，我不能传递列名，因为它们不断变化，我希望能够动态传递列名。

注意：我有可能将源表格建立为Microsoft Access的链接表，然后使用SQL来处理，但我找不到MS Access库中的文本聚合函数。在那里，每个列名都可以很容易地处理。（只是我的公司没有其他人懂M语言，我不能让这个过程不自动化）

这是我一直在尝试改进的部分，但我一直遇到一些错误：

两个分组列中的所有单元格都有“Errors”。
评估耗尽内存。

我无法找出我在这里做错了什么。

let
    Source = ALS,
    schema = Table.Schema(Source),
    columns = schema[Name],
    types = schema[Kind],
    Table = Table.FromColumns({columns,types}),
    Number_Columns = Table.SelectRows(Table, each ([Column2] = "number")),
    Other_Columns = Table.SelectRows(Table, each ([Column2] <> "number")),
    numCols = Table.Column(Number_Columns, "Column1"),
    textColsSID = List.Select(Table.ColumnNames(Source), each Table.Column(Source, _) <> type number),
    textCols = List.RemoveItems(textColsSID, {"Sample ID"}),
    groupedNum = Table.Group(Source, {"Sample ID"},List.Transform(numCols, each {_, (nmr) => List.Sum(nmr),type nullable number})),
    groupedText = Table.Group(Source,{"Sample ID"},List.Transform(textCols, each {_, (tbl) => Text.Combine(tbl, "_")}),
    merged = Table.NestedJoin(groupedNum, {"Sample ID"}, groupedText, {"Sample ID"}, "merged"),
    expanded = Table.ExpandTableColumn(merged, "merged", Table.ColumnNames(merged{1}[merged]))
in
    expanded

这是我期望的结果：

样本	方法A	方法B	方法C	方法D	方法E	批次编号	实验室数据
样本 1	1	2	5	9	8	TX_0001_TX_0002	LAB1_LAB2
样本 2	7	8	8	23	41	TX_0001_TX_0001	LAB1_LAB1
样本 3			11	55		TX_0394	LAB2
样本 4	2	9	5	9		TX_0394	LAB1

英文:

Here's an example of my data:

Sample	Method A	Method B	Method C	Method D	Method E	BATCH Nu	Lab Data
Sample 1	1	2			8	TX_0001	LAB1
Sample 1			5	9		TX_0002	LAB2
Sample 2	7	8	8	23		TX_0001	LAB1
Sample 2					41	TX_0001	LAB2
Sample 3			11	55		TX_0394	LAB2
Sample 4	2	9	5	9		TX_0394	LAB1

I need to make a M Language code that unites them, based on duplicated samples. Note that they might be in the same batch and/or in the same lab, but they won't ever be made the same method twice.

So I can't pass the column names, because they keep changing, and I wanted to do it passaing the column names dynamically.

**OBS: I have the possibility to make a linked table of the source to a Microsoft Access and make this with SQL, but I couldn't find a text aggregation function in MS Access library. There it's possible to each column name with no problem. (Just a matter that no one else knows M Language in my company and I can't let this be non-automated)
**

This is the what I have been trying to improve, but I keep have some errors:
1.Both goruped columns have "Errors" in all of the cells
2.Evaluation running out of memory

I can't discover what I'm doing wrong here.

let
    Source = ALS,
    schema = Table.Schema(Source),
    columns = schema[Name],
    types = schema[Kind],
    Table = Table.FromColumns({columns,types}),
    Number_Columns = Table.SelectRows(Table, each ([Column2] = &quot;number&quot;)),
    Other_Columns = Table.SelectRows(Table, each ([Column2] &lt;&gt; &quot;number&quot;)),
    numCols = Table.Column(Number_Columns, &quot;Column1&quot;),
    textColsSID = List.Select(Table.ColumnNames(Source), each Table.Column(Source, _) &lt;&gt; type number),
    textCols = List.RemoveItems(textColsSID, {&quot;Sample ID&quot;}),
    groupedNum = Table.Group(Source, {&quot;Sample ID&quot;},List.Transform(numCols, each {_, (nmr) =&gt; List.Sum(nmr),type nullable number})),
    groupedText = Table.Group(Source,{&quot;Sample ID&quot;},List.Transform(textCols, each {_, (tbl) =&gt; Text.Combine(tbl, &quot;_&quot;)})),
    merged = Table.NestedJoin(groupedNum, {&quot;Sample ID&quot;}, groupedText, {&quot;Sample ID&quot;}, &quot;merged&quot;),
    expanded = Table.ExpandTableColumn(merged, &quot;merged&quot;, Table.ColumnNames(merged{1}[merged]))
in
    expanded

This is what I expected to have:

Sample	Method A	Method B	Method C	Method D	Method E	BATCH Nu	Lab Data
Sample 1	1	2	5	9	8	TX_0001_TX_0002	LAB1_LAB2
Sample 2	7	8	8	23	41	TX_0001_TX_0001	LAB1_LAB1
Sample 3			11	55		TX_0394	LAB2
Sample 4	2	9	5	9		TX_0394	LAB1

答案1

得分: 2

这是一个方法，它仅假定第一列是用于分组不同样本的列。

它不对任何列名或列数做出假设。

它测试每列的前10行（去除任何空值后）以确定列的类型是否可以是“数值类型”，否则它将假定为“文本类型”。

如果存在其他可能的数据类型，类型检测代码可以进行扩展。

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

//从前十行动态检测数据类型
//仅检测“文本”和“数值”
    colNames = Table.ColumnNames(Source),
    checkRows = 10,
    colTestTypes = List.Generate(
        ()=>[t=
            let 
                Values = List.FirstN(Table.Column(Source,colNames{0}),10),
                tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From(_))[HasError])
            in 
                tryNumber,  idx=0],
        each [idx] < List.Count(colNames),
        each [t=
            let 
                Values = List.FirstN(Table.Column(Source,colNames{[idx]+1}),10),
                tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From(_))[HasError])
            in 
                tryNumber,  idx=[idx]+1],
        each [t]),
    colTypes = List.Transform(colTestTypes, each if List.AllTrue(_) then type text else type number),

//基于第一列分组并求和或连接列
    group = Table.Group(Source,{colNames{0}},
        {"rw", (t)=>
        Record.FromList(
            List.Generate(
                ()=>[rw=if colTypes{1} = type number  
                            then List.Sum(Table.Column(t,colNames{1}))
                            else Text.Combine(Table.Column(t,colNames{1}),"_"), 
                        idx=1],
                    each [idx] < List.Count(colNames),
                    each [rw=if colTypes{[idx]+1} = type number  
                                then List.Sum(Table.Column(t,colNames{[idx]+1}))
                                else Text.Combine(Table.Column(t,colNames{[idx]+1}),"_"), 
                            idx=[idx]+1],
                    each [rw]), List.RemoveFirstN(colNames,1)), type record}
        ),

//展开记录列并设置数据类型
    #"Expanded rw" = Table.ExpandRecordColumn(group, "rw", List.RemoveFirstN(colNames,1)),
    #"Set Data Type" = Table.TransformColumnTypes(#"Expanded rw", List.Zip({colNames, colTypes}))

in
    #"Set Data Type"

原始数据

结果


<details>
<summary>英文:</summary>
Here is a method which assumes only that the first column is a column which will be used to group the different samples.
It makes no assumptions about any column names, or the numbers of columns.
It tests the first 10 rows in each column (after removing any nulls) to determine if the column type can be `type number`, otherwise it will assume `type text`.
If there are other possible data types, the type detection code can be expanded.

let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

//dynamically detect data types from first ten rows
//only detecting "text" and "number"
colNames = Table.ColumnNames(Source),
checkRows = 10,
colTestTypes = List.Generate(
()=>[t=
let
Values = List.FirstN(Table.Column(Source,colNames{0}),10),
tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From())[HasError])
in
tryNumber, idx=0],
each [idx] < List.Count(colNames),
each [t=
let
Values = List.FirstN(Table.Column(Source,colNames{[idx]+1}),10),
tryNumber = List.Transform(List.RemoveNulls(Values), each (try Number.From())[HasError])
in
tryNumber, idx=[idx]+1],
each [t]),
colTypes = List.Transform(colTestTypes, each if List.AllTrue(_) then type text else type number),

//Group and Sum or Concatenate columns, keying on the first column
group = Table.Group(Source,{colNames{0}},
{"rw", (t)=>
Record.FromList(
List.Generate(
()=>[rw=if colTypes{1} = type number
then List.Sum(Table.Column(t,colNames{1}))
else Text.Combine(Table.Column(t,colNames{1}),""),
idx=1],
each [idx] < List.Count(colNames),
each [rw=if colTypes{[idx]+1} = type number
then List.Sum(Table.Column(t,colNames{[idx]+1}))
else Text.Combine(Table.Column(t,colNames{[idx]+1}),""),
idx=[idx]+1],
each [rw]), List.RemoveFirstN(colNames,1)), type record}
),

//expand the record column and set the data types
#"Expanded rw" = Table.ExpandRecordColumn(group, "rw", List.RemoveFirstN(colNames,1)),
#"Set Data Type" = Table.TransformColumnTypes(#"Expanded rw", List.Zip({colNames, colTypes}))

in
#"Set Data Type"

***Original Data***    
[![enter image description here][1]][1]
***Results***    
[![enter image description here][2]][2]
[1]: https://i.stack.imgur.com/WEVpC.png
[2]: https://i.stack.imgur.com/jrPZx.png
</details>
# 答案2
**得分**: 1
以下是代码的翻译部分：
```javascript
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
names = List.Distinct(List.Select(Table.ColumnNames(Source), each Text.Contains(_,"Method")),
"Grouped Rows" = Table.Group(Source, {"Sample"}, {{"data", each _, type table }}),
"Added Custom" = Table.AddColumn("Grouped Rows", "Batch Nu", each Text.Combine(List.Distinct([data][BATCH Nu]),"_")),
"Added Custom1" = Table.AddColumn("Added Custom", "Lab Data", each Text.Combine(List.Distinct([data][Lab Data]),"_")),
"Added Custom2" = Table.AddColumn("Added Custom1", "Custom", each Table.SelectRows(Table.UnpivotOtherColumns([data], {"Sample"}, "Attribute", "Value"), each List.Contains(names,[Attribute])),
"Added Custom3" = Table.AddColumn("Added Custom2", "Custom.1", each Table.Pivot([Custom], List.Distinct([Custom][Attribute]), "Attribute", "Value", List.Sum)),
"Expanded Custom.1" = Table.ExpandTableColumn("Added Custom3" , "Custom.1", names,names),
"Removed Columns" = Table.RemoveColumns("Expanded Custom.1",{"data", "Custom"})
in "Removed Columns"

希望这对您有所帮助。

英文:

One way. You could probably do this all within the group as well

let Source = Excel.CurrentWorkbook(){[Name=&quot;Table1&quot;]}[Content],
names = List.Distinct(List.Select(Table.ColumnNames(Source), each Text.Contains(_,&quot;Method&quot;))),
#&quot;Grouped Rows&quot; = Table.Group(Source, {&quot;Sample&quot;}, {{&quot;data&quot;, each _, type table }}),
#&quot;Added Custom&quot; = Table.AddColumn(#&quot;Grouped Rows&quot;, &quot;Batch Nu&quot;, each Text.Combine(List.Distinct([data][BATCH Nu]),&quot;_&quot;)),
#&quot;Added Custom1&quot; = Table.AddColumn(#&quot;Added Custom&quot;, &quot;Lab Data&quot;, each Text.Combine(List.Distinct([data][Lab Data]),&quot;_&quot;)),
#&quot;Added Custom2&quot; = Table.AddColumn(#&quot;Added Custom1&quot;, &quot;Custom&quot;, each Table.SelectRows(Table.UnpivotOtherColumns([data], {&quot;Sample&quot;}, &quot;Attribute&quot;, &quot;Value&quot;), each List.Contains(names,[Attribute]))),
#&quot;Added Custom3&quot; = Table.AddColumn(#&quot;Added Custom2&quot;, &quot;Custom.1&quot;, each Table.Pivot([Custom], List.Distinct([Custom][Attribute]), &quot;Attribute&quot;, &quot;Value&quot;, List.Sum)),
#&quot;Expanded Custom.1&quot; = Table.ExpandTableColumn(#&quot;Added Custom3&quot; , &quot;Custom.1&quot;, names,names),
#&quot;Removed Columns&quot; = Table.RemoveColumns(#&quot;Expanded Custom.1&quot;,{&quot;data&quot;, &quot;Custom&quot;})
in #&quot;Removed Columns&quot;

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

我需要在Power Query中进行动态汇总，通过对表中重复的值进行求和或连接。

问题

答案1

有没有办法从 Power Query M 调用 Excel VBA 函数？

使用MongoDB中的聚合来获取特定字段的值

我想在 Power Query 中的筛选行上添加列。

从长字符串中提取文本使用 Power Query

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论