Question

I have a loop with two counters, i and j. When they hold the same value, iteration is much faster than when their values differ:

Benchmark                     Mode  Cnt       Score      Error  Units
FloatsArrayBenchmark.times   thrpt   20  341805.800 ± 1623.320  ops/s
FloatsArrayBenchmark.times2  thrpt   20  198764.909 ± 1608.387  ops/s

The Java bytecode is identical, so the difference must come from some lower-level optimization. Can someone explain why this happens? Here's the benchmark:

import org.openjdk.jmh.annotations.*;

public class FloatsArrayBenchmark {
    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(new String[]{FloatsArrayBenchmark.class.getSimpleName()});
    }

    @Benchmark @Fork(value = 1, warmups = 0)
    public void times(Data data) {
        float[] result = new float[10000];
        for (int i = 0, j = 0; i < 9_999; i++, j++)
            result[j] = data.floats[i] * 10;
    }

    @Benchmark @Fork(value = 1, warmups = 0)
    public void times2(Data data) {
        float[] result = new float[10000];
        for (int i = 0, j = 1; i < 9_999; i++, j++)
            result[j] = data.floats[i] * 10;
    }

    @State(Scope.Benchmark)
    public static class Data {
        private final float[] floats = new float[10000];
    }
}

Environment:

macOS; tried Java 8, Java 11, and Java 14
2.4 GHz quad-core Intel Core i5

Answer 1

Score: 3

In the first (faster) version, i always has the same value as j, so this:

public void times(Data data) {
    float[] result = new float[10000];
    for (int i = 0, j = 0; i < 9_999; i++, j++)
        result[j] = data.floats[i] * 10;
}

can be rewritten without j, with identical effect:

public void times(Data data) {
    float[] result = new float[10000];
    for (int i = 0; i < 9_999; i++)
        result[i] = data.floats[i] * 10;
}
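
As a quick sanity check of the "identical effect" claim, here is a minimal plain-Java sketch (no JMH) that runs both loop shapes on the same input and compares the results. The class name CounterEquivalenceCheck, the method names, and the test data are made up for illustration; only the loop bodies come from the code above.

```java
import java.util.Arrays;

public class CounterEquivalenceCheck {
    // Same loop shape as times(): two counters that always hold the same value.
    static float[] twoCounters(float[] src) {
        float[] result = new float[src.length];
        for (int i = 0, j = 0; i < src.length - 1; i++, j++)
            result[j] = src[i] * 10;
        return result;
    }

    // The rewrite shown above: j replaced by i.
    static float[] oneCounter(float[] src) {
        float[] result = new float[src.length];
        for (int i = 0; i < src.length - 1; i++)
            result[i] = src[i] * 10;
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical input; the benchmark itself uses an all-zero array.
        float[] src = new float[10000];
        for (int k = 0; k < src.length; k++)
            src[k] = k * 0.5f;

        // Element-wise comparison of the two result arrays.
        System.out.println(Arrays.equals(twoCounters(src), oneCounter(src)));
    }
}
```

This only shows that the two source forms compute the same thing. Whether the JIT actually performs this rewrite is a separate question that the check cannot answer.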

It is likely that the compiler recognised that j is redundant and eliminated it, halving the number of ++ operations per iteration. Counting the two increments and the multiplication, that removes one of three arithmetic operations. This is roughly consistent with the timings: the second version takes about 70% longer per iteration (341805.800 / 198764.909 ≈ 1.72), which is in the same ballpark as the 50% increase a 3:2 operation ratio would predict, though the simple operation count does not account for the whole gap.
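
The arithmetic behind that comparison can be sketched as follows. The two scores are copied from the JMH output in the question; the class name ThroughputRatio and its helper methods are hypothetical, and the 3:2 figure is the answer's simple operation-count model rather than a measured value.

```java
public class ThroughputRatio {
    // Scores copied from the JMH output in the question (ops/s).
    static final double TIMES_OPS_PER_SEC = 341805.800;
    static final double TIMES2_OPS_PER_SEC = 198764.909;

    // Time per benchmark call is the reciprocal of throughput.
    static double nanosPerOp(double opsPerSec) {
        return 1e9 / opsPerSec;
    }

    // How many times longer one call of times2 takes compared with times.
    static double slowdown() {
        return nanosPerOp(TIMES2_OPS_PER_SEC) / nanosPerOp(TIMES_OPS_PER_SEC);
    }

    public static void main(String[] args) {
        System.out.printf("times:    %.1f ns per call%n", nanosPerOp(TIMES_OPS_PER_SEC));
        System.out.printf("times2:   %.1f ns per call%n", nanosPerOp(TIMES2_OPS_PER_SEC));
        // Measured ratio versus the 3:2 (1.5x) ratio of the operation-count model.
        System.out.printf("slowdown: %.2fx measured, %.2fx predicted%n", slowdown(), 3.0 / 2.0);
    }
}
```

The measured ratio comes out at roughly 1.72, so the operation-count model explains the direction of the difference but not its full size.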
