Java vs C++ speed on simple for loop

Question


Why does the same simple for loop run several times faster in Java than in C++? In Java this code completes in 700-800 ms, while in C++ it takes 4-5 seconds, even though C++ is usually considered much faster than Java, especially for CPU-bound workloads. Am I missing something important?

Java:

import java.time.Duration;
import java.time.Instant;

public class Main {

    public static void main(String[] args) {

        long x = 0;

        Instant start = Instant.now();

        for (long i = 0; i < 2147483647; i++)
            x += i;

        Instant end = Instant.now();

        Duration result = Duration.between(start, end);
        System.out.println("TIME: " + result.toMillis());
        System.out.println("X = " + x);
    }
}

Output:

TIME: 799
X = 2305843005992468481

C++:

#include <iostream>
#include <ctime>

int main() 
{ 
    long long x = 0;

    clock_t begin = clock();

    for (long long i = 0; i < 2147483647; i++)
        x += i;

    clock_t end = clock();

    double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    std::cout << "Time elapsed: " << elapsed_secs << std::endl;
    std::cout << "x = " << x << std::endl;
    
    return 0; 
}

Output:

Time elapsed: 4.59629
x = 2305843005992468481

Answer 1

Score: 5


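C++ can be fast when you turn on optimizations. Compilers such as gcc build without optimization by default, because optimizing makes compilation slower. You need to pass an optimization flag when compiling: `-O2` is a reasonable choice. The JVM, by contrast, JIT-compiles and optimizes hot loops at run time without any flag, which likely explains the Java timing.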
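To confirm which kind of build is actually being timed, the program can report it itself. The following is a minimal sketch rather than part of the original answer: `built_with_optimization` and `build_mode` are hypothetical helper names, and the check relies on the `__OPTIMIZE__` macro, which GCC and Clang predefine when optimization is enabled. Other compilers may not define it, so treat a negative answer as "unknown" there.

```cpp
// Typical g++ invocations (the file name main.cpp is only an example):
//   g++ main.cpp -o main        // no optimization, the default
//   g++ -O2 main.cpp -o main    // optimized build

// True when this translation unit was compiled with optimization enabled.
// Assumption: the compiler predefines __OPTIMIZE__ (GCC and Clang do).
constexpr bool built_with_optimization() {
#ifdef __OPTIMIZE__
    return true;
#else
    return false;
#endif
}

// Human-readable label that can be printed next to the timing result.
const char* build_mode() {
    return built_with_optimization() ? "optimized"
                                     : "unoptimized (or unknown compiler)";
}
```

Printing `build_mode()` alongside the elapsed time makes it harder to compare an unoptimized C++ binary against JIT-compiled Java by accident.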

The loop you measure is a common pattern and has a closed-form solution:

int sum = 0;
for (int i = 0; i < n; ++i)
    sum += i;
// this gives the same result:
sum = (n * (n - 1)) / 2;
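The closed form is easy to check against the loop. For the bound `i < n`, the loop adds 0 + 1 + ... + (n-1), which equals n(n-1)/2. The sketch below is only an illustration under that reading: the names `sum_loop`, `sum_closed_form`, and `closed_form_matches_loop` are made up for this example, and halving the even factor first is just one way to avoid an oversized intermediate product, not necessarily what any compiler does.

```cpp
#include <cstdint>

// Sum of 0, 1, ..., n-1, written as the loop from the question.
std::int64_t sum_loop(std::int64_t n) {
    std::int64_t sum = 0;
    for (std::int64_t i = 0; i < n; ++i)
        sum += i;
    return sum;
}

// Closed form n*(n-1)/2. Exactly one of n and n-1 is even, so that factor
// is halved before multiplying; no intermediate value then exceeds the
// final result.
std::int64_t sum_closed_form(std::int64_t n) {
    if (n <= 0)
        return 0;  // the loop body never runs for such bounds
    return (n % 2 == 0) ? (n / 2) * (n - 1) : n * ((n - 1) / 2);
}

// Compares both versions for every bound in [0, max_n].
bool closed_form_matches_loop(std::int64_t max_n) {
    for (std::int64_t n = 0; n <= max_n; ++n)
        if (sum_loop(n) != sum_closed_form(n))
            return false;
    return true;
}
```

With the bound from the question, `sum_closed_form(2147483647)` evaluates to 2305843005992468481, the value both programs print.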

And compilers know about this trick (they probably don't use that exact formula, because the intermediate product can overflow even when the final result does not). With gcc, this:

#include <iostream>

int main(){
    long long x = 0;
    for (long long i = 0; i < 2147483647; ++i)
        x += i;
    std::cout << x;
}

translates to (https://godbolt.org/z/EWE4rx):

main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZSt4cout
        movabs  rsi, 2305843005992468481
        call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<long long>(long long)
        xor     eax, eax
        add     rsp, 8
        ret
_GLOBAL__sub_I_main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZStL8__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:_ZStL8__ioinit
        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
        add     rsp, 8
        jmp     __cxa_atexit

Note that there is no loop: the compiler evaluated the sum at compile time and the program only prints a constant, so it cannot take seconds to execute.

See what happens when you turn off optimizations (the default) here: https://godbolt.org/z/8KncEj
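Because the bound in the question is a compile-time constant, an optimized build can fold the whole loop away, and timing it then measures almost nothing. One way to time the loop itself is to hide the bound from the optimizer. The following is a rough sketch under stated assumptions, not the answer author's code: `LoopTiming` and `time_sum_loop` are hypothetical names, and using `volatile` together with `std::chrono::steady_clock` is a choice made here for illustration.

```cpp
#include <chrono>
#include <cstdint>

// Result of one timed run: the computed sum and the elapsed wall time.
struct LoopTiming {
    std::int64_t sum;
    double seconds;
};

LoopTiming time_sum_loop(std::int64_t n) {
    using steady = std::chrono::steady_clock;

    // Volatile objects: the optimizer must really read the bound and really
    // write the result, so it cannot precompute the sum at compile time.
    volatile std::int64_t hidden_bound = n;
    volatile std::int64_t sink = 0;

    const auto begin = steady::now();
    const std::int64_t bound = hidden_bound;  // read happens after `begin`

    std::int64_t sum = 0;
    for (std::int64_t i = 0; i < bound; ++i)
        sum += i;

    sink = sum;                               // write happens before `end`
    const auto end = steady::now();

    // Caveat: an optimizer may still vectorize this loop or rewrite it as a
    // closed-form expression; only folding of a known constant is prevented.
    return {sum, std::chrono::duration<double>(end - begin).count()};
}
```

Calling `time_sum_loop(2147483647)` from `main` and printing both fields repeats the experiment; building once without and once with `-O2` shows how much of the original gap came from the missing flag.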

Posted by huangapple on 2020-10-14 15:56:18
Permalink: https://go.coder-hub.com/64348944.html