Removing duplicates, keeping information, and summarizing in VBA
Question
I am trying to process some data with VBA, and I am having a hard time figuring out how to do it correctly and without it taking too long to run.
I have an Excel sheet with data in A:V, including a header row, and a dynamic number of rows. A lot of that data I do not need.
My task is to take columns K, P, Q, T, U, and V and find all the unique combinations of values across them. For each unique combination, I then want to sum up what is in column O. Afterwards I want to write the result to a new sheet that contains all the columns that made up the combination. It could look like this:
HeadK | HeadP | HeadQ | HeadT | HeadU | HeadV | HeadO |
---|---|---|---|---|---|---|
Proj1 | Actual | 12 | 2022 | Constrained | 5 | |
Proj2 | Actual | 12 | 2022 | Constrained | 1 | |
Proj1 | Actual | 12 | 2022 | Constrained | 3 | |
Proj2 | Actual | 5 | 2022 | Constrained | 10 | |
The point is just that, with this many columns, there can be a lot of combinations. In this example, rows 1 and 3 have the same combination, so they could be merged into one row that shows 8 (5 + 3) under HeadO.
Can anybody help me with this?
I have actually tried some code from ChatGPT, but I can't get it to work correctly. I know answering with it is not allowed, but since this is my own post, I can admit my own mistakes:
Sub SummarizeData()
    Dim ws As Worksheet
    Dim dataRange As Range
    Dim uniqueValues As Collection
    Dim cell As Range
    Dim uniqueValue As Variant
    Dim summaryArray() As Variant
    Dim summaryIndex As Long
    ' Define the worksheet
    Set ws = ThisWorkbook.Sheets("TimeRegistrations_Billable")
    ' Define the data range
    Set dataRange = ws.Range("K2:V" & ws.Range("K" & ws.Rows.Count).End(xlUp).Row)
    ' Create a collection to store unique values
    Set uniqueValues = New Collection
    ' Iterate over the data range
    For Each cell In dataRange.Columns(1).Cells
        uniqueValue = cell.Value & cell.Offset(0, 6).Value & cell.Offset(0, 12).Value & cell.Offset(0, 13).Value & cell.Offset(0, 14).Value
        On Error Resume Next
        uniqueValues.Add uniqueValue, uniqueValue
        On Error GoTo 0
        uniqueValues.Add cell.Value, cell.Value & cell.Offset(0, 6).Value & cell.Offset(0, 12).Value & cell.Offset(0, 13).Value & cell.Offset(0, 14).Value & "K"
        uniqueValues.Add cell.Offset(0, 10).Value, cell.Value & cell.Offset(0, 6).Value & cell.Offset(0, 12).Value & cell.Offset(0, 13).Value & cell.Offset(0, 14).Value & "T"
        uniqueValues.Add cell.Offset(0, 11).Value, cell.Value & cell.Offset(0, 6).Value & cell.Offset(0, 12).Value & cell.Offset(0, 13).Value & cell.Offset(0, 14).Value & "U"
    Next cell
    ' Create an array to store the summarized data
    ReDim summaryArray(1 To uniqueValues.Count, 1 To 5)
    summaryIndex = 0
    ' Iterate over the unique values
    For Each uniqueValue In uniqueValues
        summaryIndex = summaryIndex + 1
        summaryArray(summaryIndex, 1) = uniqueValue
        summaryArray(summaryIndex, 2) = WorksheetFunction.SumIf(dataRange.Columns(15), uniqueValue, dataRange.Columns(15))
        summaryArray(summaryIndex, 3) = uniqueValue & "K"
        summaryArray(summaryIndex, 4) = uniqueValue & "T"
        summaryArray(summaryIndex, 5) = uniqueValue & "U"
    Next uniqueValue
    ' Add the summarized data to a new worksheet
    With ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count))
        .Name = "Tester"
        .Range("A1").Resize(uniqueValues.Count, 5).Value = summaryArray
    End With
End Sub
Answer 1
Score: 0
I have actually found one way to do it, although I still need to remove all the duplicates in column 23. It takes ages to run when you have 500k rows, though.
I don't know if this code makes more sense.
Once the duplicates in column 23 of data1 have been removed, I have the data I want as the end goal (plus some extra columns that are not needed).
Sub uniquevalues()
    Dim data1 As Variant, data2 As Variant
    Dim lastRowTRB As Long
    Dim total As Double   ' renamed from "timer" so it does not shadow VBA's Timer function
    Dim i As Long, k As Long

    ' Read the whole data block (A:X) into memory once
    With Worksheets("TimeRegistrations_Billable")
        lastRowTRB = .Cells(.Rows.Count, "A").End(xlUp).Row
        data1 = .Range("A1:X" & lastRowTRB).Value
    End With

    ' Build the combination key from K, P, Q, T, U, V and store it in column 23 (W)
    For i = 2 To lastRowTRB
        data1(i, 23) = data1(i, 11) & data1(i, 16) & data1(i, 17) & data1(i, 20) & data1(i, 21) & data1(i, 22)
    Next i

    data2 = data1

    ' For every row, sum column 15 (O) over all rows that share the same key.
    ' Every row is compared with every other row, so this is slow on large data.
    For i = 2 To lastRowTRB
        total = 0
        For k = 2 To lastRowTRB
            If data2(i, 23) = data2(k, 23) Then
                total = total + data1(k, 15)
            End If
        Next k
        data1(i, 24) = total   ' column 24 (X) holds the sum for this row's key
    Next i

    ' Write the result, duplicates included, to a new sheet
    With ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count))
        .Name = "Tester"
        .Range("A1").Resize(lastRowTRB, 24).Value = data1
    End With
End Sub
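
The nested loops above compare every row with every other row, which is probably why the run takes so long at 500k rows. Below is a minimal sketch of a single-pass alternative that uses a `Scripting.Dictionary` to map each combination to one output row, so the duplicates never need to be removed afterwards. It is not tested against the real workbook. The procedure name, the output sheet name "Summary", and the "|" delimiter are illustrative assumptions; only the source sheet name and the column positions come from the post. It also assumes Windows Excel (for `CreateObject`) and that the key columns contain no error values such as #N/A.

```vba
' Sketch: sum column O per unique K/P/Q/T/U/V combination in one pass.
Sub SummarizeUniqueCombinations()
    Dim src As Worksheet, dst As Worksheet
    Dim data As Variant, result() As Variant
    Dim keyCols As Variant
    Dim dict As Object
    Dim lastRow As Long, i As Long, c As Long, n As Long, r As Long
    Dim key As String

    Set src = ThisWorkbook.Worksheets("TimeRegistrations_Billable")
    lastRow = src.Cells(src.Rows.Count, "K").End(xlUp).Row
    If lastRow < 2 Then Exit Sub                    ' header row only, nothing to do

    data = src.Range("A1:V" & lastRow).Value        ' read everything into memory once
    keyCols = VBA.Array(11, 16, 17, 20, 21, 22)     ' K, P, Q, T, U, V (VBA.Array is always 0-based)
    Set dict = CreateObject("Scripting.Dictionary") ' key -> output row index

    ' Worst case every row is unique, so size the output like the input
    ReDim result(1 To lastRow, 1 To 7)
    For c = 0 To 5
        result(1, c + 1) = data(1, keyCols(c))      ' copy the key headers
    Next c
    result(1, 7) = data(1, 15)                      ' header of column O
    n = 1                                           ' rows used so far in result

    For i = 2 To lastRow
        ' The delimiter keeps "1" & "23" distinct from "12" & "3"
        key = ""
        For c = 0 To 5
            key = key & "|" & CStr(data(i, keyCols(c)))
        Next c

        If dict.Exists(key) Then
            r = dict(key)                           ' combination seen before
        Else
            n = n + 1                               ' first occurrence: new output row
            r = n
            dict.Add key, r
            For c = 0 To 5
                result(r, c + 1) = data(i, keyCols(c))
            Next c
            result(r, 7) = 0
        End If

        ' Skip text and error cells in column O instead of failing on them
        If IsNumeric(data(i, 15)) Then result(r, 7) = result(r, 7) + CDbl(data(i, 15))
    Next i

    ' Renaming fails if a sheet called "Summary" already exists
    Set dst = ThisWorkbook.Worksheets.Add(After:=ThisWorkbook.Worksheets(ThisWorkbook.Worksheets.Count))
    dst.Name = "Summary"
    dst.Range("A1").Resize(n, 7).Value = result     ' only the n used rows are written
End Sub
```

Each input row is read once and the dictionary lookup replaces the inner loop, so the work grows roughly linearly with the number of rows. With the sample data from the question, the two Proj1 rows would end up as a single output row with 8 in the last column.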