How to reduce the time of calculations in VB.net?

Question

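I have to perform some calculations for almost 5000 items. I have used a parallel for loop, and these calculations take almost 5 seconds. I want to get that below 1 second. Any suggestions will be highly appreciated.

The code is this: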

Dim AverageVal As New Double
Dim maxVal, minVal As New Double
Dim error_ As New Double
Dim delta As New List(Of Double)

With gOutputStresses_LC(0)

    Dim UniqueTagsList As List(Of Integer) = gOutputStresses_LC(0).stresses.Select(Function(c) c.Tag0D).Distinct.ToList

    Parallel.ForEach(UniqueTagsList, Sub(Uniq)
                 Dim tstresses As New List(Of clsMSHStress)

                 tstresses = .stresses.FindAll(Function(s) s.Tag0D = Uniq)
                 tstresses.Sort(Function(x, y) x.Sxx.CompareTo(y.Sxx))
                 tstresses.Reverse()
                 'maxVal = tstresses(0).Sxx
                 'minVal = tstresses.Last.Sxx

                 error_ = Math.Abs((tstresses(0).Sxx - tstresses.Last.Sxx) / (tstresses(0).Sxx + tstresses.Last.Sxx))

                 If error_ < 10 Then
                     AverageVal = tstresses.Sum(Function(s) s.Sxx) / tstresses.Count
                     .stresses.FindAll(Function(s) s.Tag0D = Uniq).ForEach(Sub(c) c.Sxx = AverageVal)
                 Else
                     AverageVal = 0
                     delta = New List(Of Double)
                     For k As Integer = 1 To tstresses.Count - 1
                          delta.Add(tstresses(k).Sxx - tstresses(k - 1).Sxx)
                     Next

                     Dim idx As Integer = delta.FindIndex(Function(c) c = delta.Max)
                     For g As Integer = 0 To idx - 1
                         AverageVal += tstresses(g).Sxx

                     Next
                     AverageVal = AverageVal / idx
                     For g As Integer = 0 To idx - 1
                         tstresses(g).Sxx = AverageVal
                     Next

                     AverageVal = 0
                     For g As Integer = idx To tstresses.Count - 1
                         AverageVal += tstresses(g).Sxx
                     Next
                     AverageVal = AverageVal / (tstresses.Count - idx)
                     For g As Integer = idx To tstresses.Count - 1
                         tstresses(g).Sxx = AverageVal
                     Next
                     For p As Integer = 0 To tstresses.Count - 1
                         ''.Stresses.Where(Function(s) s.Tag0D = tstresses(p).Tag0D And s.Tag2D = tstresses(p).Tag2D) = tstresses(p)
                         'Dim idx As Integer =
                         .stresses(.stresses.FindIndex(Function(s) s.Tag0D = tstresses(p).Tag0D And s.Tag2D = tstresses(p).Tag2D)) = tstresses(p)
                     Next
                 End If
              End Sub)
End With

The UniqueTagsList contains almost 5000 items.


Answer 1

Score: 2

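5000 items is nothing if the algorithm is done correctly. We can probably do this in less than a second without going parallel at all. Take a modern commodity CPU clocked at a modest 2 GHz: that's 2 billion clock cycles per second. Better still, you probably get at least 3 instructions per cycle, for about 6 billion instructions per second. Simple division means the work fits in one second on a single non-parallel core as long as we spend fewer than 1.2 million instructions per item. In fact, memory latency is probably the limiting factor here, and parallel execution might not even fix that.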
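The estimate above can be reproduced in a few lines. This is only a sketch of the arithmetic: the clock rate, instructions per cycle, and item count are the assumed round figures from the paragraph, not measurements, and `BudgetPerItem` is a hypothetical helper name.

```vbnet
Imports System

Module BudgetEstimate
    ' Instructions available per item in one second on one core,
    ' given an assumed clock rate and instructions-per-cycle figure.
    Function BudgetPerItem(clockHz As Double, instructionsPerCycle As Double, itemCount As Integer) As Double
        Dim instructionsPerSecond As Double = clockHz * instructionsPerCycle
        Return instructionsPerSecond / itemCount
    End Function

    Sub Main()
        ' 2 GHz, 3 instructions per cycle, 5000 items (assumed figures)
        Dim budget As Double = BudgetPerItem(2000000000.0, 3.0, 5000)
        Console.WriteLine(budget)   ' 1200000
    End Sub
End Module
```

Real throughput depends heavily on cache misses and branch behaviour, so treat the result as an upper bound on what one core could do.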

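A significant savings potential from the original is this loop (and similar):

For p As Integer = 0 To tstresses.Count - 1
    ''.Stresses.Where(Function(s) s.Tag0D = tstresses(p).Tag0D And s.Tag2D = tstresses(p).Tag2D) = tstresses(p)
    'Dim idx As Integer =
    .stresses(.stresses.FindIndex(Function(s) s.Tag0D = tstresses(p).Tag0D And s.Tag2D = tstresses(p).Tag2D)) = tstresses(p)
Next

It seems to be trying to match up each tstresses item with the same item from the original gOutputStresses_LC array. But since it also appears we are working with reference types (the cls prefix is not typically used for a Structure), we know these are already the same objects, and the lookups back into the original array are not needed. All this will ever do is waste a bunch of time setting each array element to the same value it already has.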

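There is similar code elsewhere running FindIndex() or FindAll() to do extra lookups back into an earlier array. This extra work is significant and not needed; it turns something that would be close to O(n) (after the sort) into something that is O(n²) instead.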
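To illustrate the point about reference types, here is a minimal sketch. `StressItem` is a hypothetical stand-in for clsMSHStress (whose definition is not shown in the question) and assumes it is a Class, not a Structure. It shows that the list returned by FindAll() holds the very same objects as the source list, so a write through one is visible through the other and no write-back lookup is needed.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical stand-in for clsMSHStress; only the members used here.
Public Class StressItem
    Public Property Tag0D As Integer
    Public Property Sxx As Double
End Class

Module ReferenceDemo
    Sub Main()
        Dim stresses As New List(Of StressItem) From {
            New StressItem With {.Tag0D = 1, .Sxx = 10.0},
            New StressItem With {.Tag0D = 1, .Sxx = 20.0},
            New StressItem With {.Tag0D = 2, .Sxx = 5.0}
        }

        ' FindAll builds a new list, but its elements are the same objects.
        Dim subset As List(Of StressItem) = stresses.FindAll(Function(s) s.Tag0D = 1)
        Console.WriteLine(Object.ReferenceEquals(subset(0), stresses(0)))   ' True

        ' Writing through the subset is visible through the original list.
        subset(0).Sxx = 99.0
        Console.WriteLine(stresses(0).Sxx)   ' 99
    End Sub
End Module
```

If the real type were a Structure, FindAll() would return copies and this shortcut would not hold, so it is worth confirming the declaration first.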

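Once we understand this, the simplification opens up other avenues for improvement as well. In short, the code below should be MUCH faster because it avoids creating extra needless Lists/arrays and significantly reduces the number of passes through the data.

I need to add a disclaimer that I typed this directly into the reply window, without the benefit of any sample data to test against. It's likely there's a bug or two you'll still need to work through, including a potential off-by-one error around the final IdxAtMaxDelta break, so be sure to test thoroughly.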

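Dim TagGroups = gOutputStresses_LC(0).stresses.GroupBy(Function(c) c.Tag0D)

For Each grp In TagGroups
    ' Largest Sxx first, matching the question's Sort + Reverse
    Dim sorted = grp.OrderByDescending(Function(s) s.Sxx).ToList()
    Dim last = sorted(sorted.Count - 1)
    Dim error_ As Double = Math.Abs((sorted(0).Sxx - last.Sxx) /
                                    (sorted(0).Sxx + last.Sxx))

    If error_ < 10.0 Then
        Dim AverageVal As Double = sorted.Sum(Function(s) s.Sxx) / sorted.Count
        For Each item In sorted
            item.Sxx = AverageVal
        Next
    Else
        ' Only need 2 loops through the data and NO NESTED PASSES
        ' (original code used 4 base passes, of which 2 had nested passes)

        Dim Total As Double = sorted(0).Sxx    ' running sum, seeded with the first item
        Dim idx As Integer = 0
        Dim prior As clsMSHStress = sorted(0)

        Dim MaxDelta As Double = Double.NegativeInfinity
        Dim IdxAtMaxDelta As Integer = 1       ' index of the first item after the split
        Dim TotalAtMaxDelta As Double = Total  ' sum of the items before the split

        ' Loop 1: find the split point and the sums on either side of it
        For Each item In sorted.Skip(1)
            idx += 1
            ' Signed difference, as in the question's delta list. With a descending
            ' sort these are <= 0; use prior.Sxx - item.Sxx if the largest drop is meant.
            Dim delta As Double = item.Sxx - prior.Sxx
            If delta > MaxDelta Then
                MaxDelta = delta
                IdxAtMaxDelta = idx
                TotalAtMaxDelta = Total        ' sum of items 0 .. idx - 1
            End If
            Total += item.Sxx
            prior = item
        Next

        ' The split here falls between item IdxAtMaxDelta - 1 and item IdxAtMaxDelta;
        ' check that this is the boundary you intend.
        Dim firstAverage As Double = TotalAtMaxDelta / IdxAtMaxDelta
        Dim secondAverage As Double = (Total - TotalAtMaxDelta) / (sorted.Count - IdxAtMaxDelta)

        ' Loop 2 (first half)
        For i As Integer = 0 To IdxAtMaxDelta - 1
            sorted(i).Sxx = firstAverage
        Next
        ' Loop 2 (second half)
        For i As Integer = IdxAtMaxDelta To sorted.Count - 1
            sorted(i).Sxx = secondAverage
        Next
    End If
Next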
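Since the advice is to test thoroughly, a small timing harness may help. The sketch below is an assumption-laden example, not the asker's setup: `StressItem` is a hypothetical stand-in for clsMSHStress, the data is synthetic (5000 tags with 8 items each, a guess at the shape), and `AverageByTag` covers only the simple averaging branch. It shows one way to time a single GroupBy pass with Stopwatch.

```vbnet
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Linq

' Hypothetical stand-in for clsMSHStress; only the members used here.
Public Class StressItem
    Public Property Tag0D As Integer
    Public Property Sxx As Double
End Class

Module TimingSketch
    ' Builds tagCount * perTag items with pseudo-random Sxx values (fixed seed).
    Function MakeData(tagCount As Integer, perTag As Integer) As List(Of StressItem)
        Dim rng As New Random(42)
        Dim items As New List(Of StressItem)(tagCount * perTag)
        For tag As Integer = 1 To tagCount
            For n As Integer = 1 To perTag
                items.Add(New StressItem With {.Tag0D = tag, .Sxx = rng.NextDouble() * 100.0})
            Next
        Next
        Return items
    End Function

    ' Replaces every Sxx with the mean of its Tag0D group, using one grouping pass.
    Sub AverageByTag(items As List(Of StressItem))
        For Each grp In items.GroupBy(Function(s) s.Tag0D)
            Dim mean As Double = grp.Average(Function(s) s.Sxx)
            For Each item In grp
                item.Sxx = mean
            Next
        Next
    End Sub

    Sub Main()
        Dim data = MakeData(5000, 8)
        Dim sw = Stopwatch.StartNew()
        AverageByTag(data)
        sw.Stop()
        Console.WriteLine($"{data.Count} items in {sw.ElapsedMilliseconds} ms")
    End Sub
End Module
```

Swapping the body of `AverageByTag` for the full per-group logic, and feeding it a copy of the real data, gives a like-for-like comparison against the Parallel.ForEach version.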

huangapple
  • Posted on June 15, 2023, 21:47:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76483158.html