Count and Sum if column name is exact match
Question
I am working with a large dataset (500+ columns x 10,000 rows). I am trying to get a count and a sum for each variable (column). All column names are listed on a separate worksheet/table. What's the best way to go about it? Thanks.
Main Data: Raw Data
Summary: Where I am trying to get the count and sum
Answer 1
Score: 1
Formula:
One way with Microsoft 365 could be:
Formula in G3:
=LET(z,B2:E7,WRAPROWS(TOCOL(VSTACK(TAKE(z,1),BYCOL(DROP(z,1),LAMBDA(x,COUNT(x))),BYCOL(DROP(z,1),LAMBDA(y,SUM(y)))),,1),3))
It can be less verbose, but this way you'd just need to select B2:E7 once, on whatever worksheet it is located.
PowerQuery:
Another way would be PowerQuery, where you'd:
- Select (a single cell or) the whole range B2:E7;
- On the 'Data' tab, click 'From Table/Range' in the 'Get & Transform Data' group;
- Choose to include headers, and hit 'OK'. PQ launches;
- Select all columns and, in the 'Transform' tab, click 'Unpivot Columns' under the 'Any Column' group;
- Select the 1st column, 'Attribute' in this case, and on the same 'Transform' tab choose 'Group By' under the 'Table' group;
- Hit the advanced option and create two aggregations, one for the count and one for the sum, before hitting 'OK';
- Close and load data back to Excel.
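For reference, the steps above correspond roughly to the following M query, which can be pasted into the Advanced Editor. This is only a sketch under some assumptions: the table name "Table1" and the output column names "Count" and "Sum" are placeholders, and every data cell is assumed to be numeric or blank. The 'Changed Type' step the UI normally inserts is left out.

```
let
    // Assumption: the range B2:E7 was loaded as a table named "Table1"
    Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],
    // Unpivot every column into Attribute/Value pairs; blank (null) cells are dropped
    Unpivoted = Table.UnpivotOtherColumns(Source, {}, "Attribute", "Value"),
    // One output row per original column name, with its row count and sum
    Grouped = Table.Group(
        Unpivoted,
        {"Attribute"},
        {
            {"Count", each Table.RowCount(_), Int64.Type},
            {"Sum", each List.Sum([Value]), type nullable number}
        }
    )
in
    Grouped
```

Note that Table.RowCount counts every non-blank cell, text included, whereas COUNT in the formula approach counts numbers only. If a column can contain text, the Value column would need filtering before grouping.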
Answer 2
Score: 0
If the column names in your Summary sheet are not fixed, you could use something like this:
Say Sheet1 holds the data with four column headers, but the "demand" in Sheet2 lists only three of them, and not in the same order. (Later on the demand might change to only two columns, or to all four, and the header names may change as well.)
The expected result in Sheet2:
So a cell in the data (column A, for example) is counted whenever it has a value, even a zero. Only a blank cell is not counted.
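Sub test()
    Dim rgSrc As Range, rgTrg As Range, c As Range, cell As Range

    ' Source data, including the header row (change as needed)
    Set rgSrc = Sheets("Sheet1").UsedRange
    ' Requested header names, from A2 down to the last filled cell (change as needed)
    With Sheets("Sheet2")
        Set rgTrg = .Range("A2", .Range("A2").End(xlDown))
    End With

    For Each cell In rgTrg
        ' Look for an exact match of the requested name in the header row
        Set c = rgSrc.Rows(1).Find(cell.Value, LookAt:=xlWhole)
        If Not c Is Nothing Then
            ' c.Column is a worksheet column, so convert it to an index within rgSrc
            With rgSrc.Columns(c.Column - rgSrc.Column + 1)
                cell.Offset(0, 1).Value = Application.CountA(.Cells) - 1 ' minus the header
                cell.Offset(0, 2).Value = Application.Sum(.Cells)
            End With
        End If
    Next cell
End Sub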
rgSrc is the range of the data in Sheet1.
rgTrg is the range from cell A2 down to the last row in Sheet2. The code assumes there are no blank cells in between the data in column A of Sheet2.
The code loops over each cell in rgTrg (that is, each demanded header name), finds the matching header in rgSrc, then writes the count of that column's data to cell.Offset(0, 1) and its sum to cell.Offset(0, 2).