为什么 Java 在矩阵乘法方面比 C 速度更快?

huangapple go评论68阅读模式
英文:

Why is Java faster than C for matrix multiplication?

问题

我在Java和C中编写了用于矩阵乘法的代码,针对巨大尺寸的矩阵(如2000x2000),并进行了基准测试。我无法理解为什么Java代码比C代码执行得更快...

以下是结果:

**矩阵1大小:** 1000 x 500; **矩阵2大小:** 500 x 1000

Java所用时间:13787 毫秒
C所用时间:20565 毫秒


**矩阵1大小:** 1000 x 1000; **矩阵2大小:** 1000 x 1500

Java所用时间:64636 毫秒
C所用时间:65155 毫秒


以下是我编写的代码:

在C中:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10

int main()
{
    // 声明变量
	long i, j, k, m1, n1, m2, n2;

    // 读取矩阵的维度
    printf("输入矩阵1的行维度:\t "); scanf("%ld", &m1);
    printf("输入矩阵1的列维度:\t "); scanf("%ld", &n1);
    printf("\n输入矩阵2的行维度:\t "); scanf("%ld", &m2);
    printf("输入矩阵2的列维度:\t "); scanf("%ld", &n2);

    // 时钟使用的变量
    clock_t start, end;
    double cpu_time_used;

    // 矩阵相乘的必要条件
    if (n1 != m2)
    {
        printf("\n无法进行乘法运算!\n");
        printf("请检查矩阵的维度!\n\n");
    }

    else
    {
        char choice;
        start = clock();                                    // 启动时钟

        long (*mat1)[n1] = malloc(m1 * sizeof *mat1);       // 在堆内存中存储矩阵
        if (mat1 == NULL) exit(1);                          // 一旦矩阵为NULL,释放空间
            
        long (*mat2)[n2] = malloc(m2 * sizeof *mat2);
        if (mat2 == NULL) exit(1);
            
        long (*mat3)[n2] = malloc(m1 * sizeof *mat3);
        if (mat3 == NULL) exit(1);

        // 生成具有随机元素的矩阵
        for(i=0; i<m1; i++)
            for(j=0; j<n1; j++)
                mat1[i][j] = (long)rand()%MAX;

        // 生成具有随机元素的矩阵
        for(i=0; i<m2; i++)
            for(j=0; j<n2; j++)
                mat2[i][j] = (long)rand()%MAX;

	printf("\n矩阵已生成!\n");

        // 用零初始化结果矩阵        
        for(i=0; i<m1; i++)
            for(j=0; j<n2; j++)        
                mat3[i][j] = 0;

        // 执行 mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
        for (i = 0; i < m1; i++)
            for (j = 0; j < n2; j++)
                for (k = 0; k < m2; k++)
                    mat3[i][j] += mat1[i][k] * mat2[k][j];
        
        // 在进程完成后释放矩阵占用的空间
        free(mat1); free(mat2); free(mat3);

        end = clock();                                                  // 停止时钟计时
        cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;      // 程序所需时间
        printf("\n乘法运算花费 %f 毫秒执行。\n\n", cpu_time_used*1000);
    }

	return 0;
}

在Java中:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Matrix {
    private static String s;

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // 读取矩阵1的维度
        System.out.print("输入矩阵1的行维度:\t ");
        int m1 = Integer.parseInt(br.readLine());
        System.out.print("输入矩阵1的列维度:\t ");
        int n1 = Integer.parseInt(br.readLine());

        // 读取矩阵2的维度
        System.out.print("\n输入矩阵2的行维度:\t ");
        int m2 = Integer.parseInt(br.readLine());
        System.out.print("输入矩阵2的列维度:\t ");
        int n2 = Integer.parseInt(br.readLine());

        // 如果不满足条件,则显示错误消息
        if(n1 != m2) {
            System.out.println("\n无法进行乘法运算!");
            System.out.println("请检查矩阵的维度!\n");
        }

        else {
            long start = System.currentTimeMillis();
            
            // 声明矩阵
            int[][] matrix1 = new int[m1][n1];
            int[][] matrix2 = new int[m2][n2];

            // 生成随机矩阵
            generateRandom(matrix1);
            generateRandom(matrix2);

            System.out.println("\n矩阵已生成!\n");

            // 执行矩阵1 x 矩阵2 = 结果
            int[][] result = new int[m1][n2];
            for (int i = 0; i < m1; i++) {
                for (int j = 0; j < n2; j++) {
                    for (int k = 0; k < n1; k++) {
                        result[i][j] += matrix1[i][k] * matrix2[k][j];
                    }
                }
            }
            
            long end = System.currentTimeMillis();
            s = "执行时间为 " + (end - start);
            System.out.print(s + "

<details>
<summary>英文:</summary>

I wrote codes for multiplying matrices of huge sizes (such as 2000x2000) in Java and C to benchmark them as an activity. I cannot understand why Java code executes faster than C code...

*Here are the results:*

**Matrix 1 size:** 1000 x 500; **Matrix 2 size:** 500 x 1000

Time Taken by Java : 13787 ms
Time Taken by C: 20565 ms


**Matrix 1 size:** 1000 x 1000; **Matrix 2 size:** 1000 x 1500

Time Taken by Java : 64636 ms
Time Taken by C: 65155 ms


*Here are the codes I wrote:*
In C:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10

int main()
{
// Declaring the variables
long i, j, k, m1, n1, m2, n2;

// Reading the dimensions of the matrices
printf(&quot;Enter the row dimension of the matrix 1:\t &quot;); scanf(&quot;%ld&quot;, &amp;m1);
printf(&quot;Enter the column dimension of the matrix 1:\t &quot;); scanf(&quot;%ld&quot;, &amp;n1);
printf(&quot;\nEnter the row dimension of the matrix 2:\t &quot;); scanf(&quot;%ld&quot;, &amp;m2);
printf(&quot;Enter the column dimension of the matrix 2:\t &quot;); scanf(&quot;%ld&quot;, &amp;n2);
// Variables used by the clock
clock_t start, end;
double cpu_time_used;
// Necessary condition for matrix multiplication
if (n1 != m2)
{
printf(&quot;\nMuliplication is not possible!\n&quot;);
printf(&quot;Please check the dimensions of the matrices!\n\n&quot;);
}
else
{
char choice;
start = clock();                                    // Starts the clock
long (*mat1)[n1] = malloc(m1 * sizeof *mat1);       // Storing the matrix in heap memory
if (mat1 == NULL) exit(1);                          // Freeing up space once matrix is NULL
long (*mat2)[n2] = malloc(m2 * sizeof *mat2);
if (mat2 == NULL) exit(1);
long (*mat3)[n2] = malloc(m1 * sizeof *mat3);
if (mat3 == NULL) exit(1);
// Generating matrix with random elements
for(i=0; i&lt;m1; i++)
for(j=0; j&lt;n1; j++)
mat1[i][j] = (long)rand()%MAX;
// Generating matrix with random elements
for(i=0; i&lt;m2; i++)
for(j=0; j&lt;n2; j++)
mat2[i][j] = (long)rand()%MAX;
printf(&quot;\nMatrices Generated!\n&quot;);
// Initializing the result matrix with zeroes        
for(i=0; i&lt;m1; i++)
for(j=0; j&lt;n2; j++)        
mat3[i][j] = 0;
// Performing mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
for (i = 0; i &lt; m1; i++)
for (j = 0; j &lt; n2; j++)
for (k = 0; k &lt; m2; k++)
mat3[i][j] += mat1[i][k] * mat2[k][j];
// Freeing up space occupied by the matrices after process completion
free(mat1); free(mat2); free(mat3);
end = clock();                                                  // Stop the clock timer
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;      // Time taken by the program
printf(&quot;\nMultiplication took %f milliseconds to execute.\n\n&quot;, cpu_time_used*1000);
}
return 0;

}


In Java:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Matrix {
private static String s;

public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
// Reading the dimensions of matrix 1
System.out.print(&quot;Enter the row dimension of the matrix 1:\t &quot;);
int m1 = Integer.parseInt(br.readLine());
System.out.print(&quot;Enter the column dimension of the matrix 1:\t &quot;);
int n1 = Integer.parseInt(br.readLine());
// Reading the dimensions of matrix 2
System.out.print(&quot;\nEnter the row dimension of the matrix 2:\t &quot;);
int m2 = Integer.parseInt(br.readLine());
System.out.print(&quot;Enter the column dimension of the matrix 2:\t &quot;);
int n2 = Integer.parseInt(br.readLine());
// Print error message if condition not satisfied
if(n1 != m2) {
System.out.println(&quot;\nMuliplication is not possible!&quot;);
System.out.println(&quot;Please check the dimensions of the matrices!\n&quot;);
}
else {
long start = System.currentTimeMillis();
// Declaring matrices
int[][] matrix1 = new int[m1][n1];
int[][] matrix2 = new int[m2][n2];
// Generate random matrices
generateRandom(matrix1);
generateRandom(matrix2);
System.out.println(&quot;\nMatrices Generated!\n&quot;);
// Performs matrix1 x matrix2 = result
int[][] result = new int[m1][n2];
for (int i = 0; i &lt; m1; i++) {
for (int j = 0; j &lt; n2; j++) {
for (int k = 0; k &lt; n1; k++) {
result[i][j] += matrix1[i][k] * matrix2[k][j];
}
}
}
long end = System.currentTimeMillis();
s = &quot;Execution time is &quot; + (end - start);
System.out.print(s + &quot; milliseconds.\n\n&quot;);
}
}
private static void displayMatrix(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i &lt; r; i++) {
for (int j = 0; j &lt; c; j++) {
System.out.print(matrix[i][j] + &quot; &quot;);
}
System.out.println();
}
}
// Generates Random Matrices
private static void generateRandom(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i &lt; r; i++) {
for (int j = 0; j &lt; c; j++) {
// Gives numbers in range (0,10)
matrix[i][j] = (int) (Math.random() * 10);
}
}
}

}


I&#39;m running these on Windows 10, with Latest Versions of MinGW and JDK.
</details>
# 答案1
**得分**: 2
你正在将橙子与装满苹果的篮子进行比较。你甚至有没有考虑到`long`可能是*64*位,而Java的`int`是32位。
另一个原因可能是你没有要求C编译器优化代码。Java的*JIT*编译器会实时优化代码,而默认情况下,GCC进行**快速编译**以生成**调试方便**但**代码较慢**的代码。这里没有JIT优化。从未有人对未经优化的代码进行基准测试,就好比在20英里每小时的速度限制下测试法拉利一样。
在没有任何优化的情况下,我得到Java需要3216毫秒,C需要7670毫秒来处理1000x1000的矩阵。使用-O3参数后,C的运行时间降至2225毫秒。
**但这仍然是橙子与装满苹果的篮子之间的比较。当我将Java改为使用`long[][]`时,时间变为8300毫秒。**
<details>
<summary>英文:</summary>
You&#39;re comparing oranges to baskets of apples. Did you even take into account that `long` could be *64* bits whereas Java `int` is 32 bits.
Another reason could be that you didn&#39;t ask your C compiler to optimize the code. Java *JIT* compiler optimizes the code in time, whereas by *default* GCC does **fast compilation** producing **slow code** for **easy debugging**. There is no JIT optimization. No one **ever** benchmarks unoptimized code, it is as if testing a Ferrari at 20 mph speed limit.
Without any optimizations, I get 3216 milliseconds for Java, 7670 for C for 1000x1000 matrices. With -O3 C running time drops to 2225 ms.
**But it is still oranges to baskets of apples comparison. After I change Java to use `long[][]` then it takes 8300 milliseconds.** 
</details>
# 答案2
**得分**: 0
以下是修改后的C和Java代码,仅保留了矩阵乘法部分的计时:
对于C:
```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10
int main()
{
// ...(略去前面的代码)
else
{
// ...(略去前面的代码)
// Initializing the result matrix with zeroes        
for(i=0; i<m1; i++)
for(j=0; j<n2; j++)        
mat3[i][j] = 0;
start = clock();                                    // Starts the clock
// Performing mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
for (i = 0; i < m1; i++)
for (j = 0; j < n2; j++)
for (k = 0; k < m2; k++)
mat3[i][j] += mat1[i][k] * mat2[k][j];
end = clock();                                                  // Stop the clock timer
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;      // Time taken by the program
printf("\nMultiplication took %f milliseconds to execute.\n\n", cpu_time_used*1000);
// Freeing up space occupied by the matrices after process completion
free(mat1); free(mat2); free(mat3);
}
return 0;
}

对于Java:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Matrix {
    private static String s;

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // ...(略去前面的代码)

        else {
            // ...(略去前面的代码)

            long start = System.currentTimeMillis();
            // Performs matrix1 x matrix2 = result
            int[][] result = new int[m1][n2];
            for (int i = 0; i < m1; i++) {
                for (int j = 0; j < n2; j++) {
                    for (int k = 0; k < n1; k++) {
                        result[i][j] += matrix1[i][k] * matrix2[k][j];
                    }
                }
            }
            
            long end = System.currentTimeMillis();
            s = "Execution time is " + (end - start);
            System.out.print(s + " milliseconds.\n\n");
        }
    }

    // ...(略去中间的代码)
}

请注意,我已将前面的代码略去,只保留了矩阵乘法部分的计时。如果您有任何其他问题或需要进一步的帮助,请告诉我。

英文:

You should compare only multiplication part if you want to find out the matrix multiplication time only ... not its creation, generation, multiplication and deletion. Change your code as

For C

#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;time.h&gt;
#define MAX 10
int main()
{
// Declaring the variables
long i, j, k, m1, n1, m2, n2;
// Reading the dimensions of the matrices
printf(&quot;Enter the row dimension of the matrix 1:\t &quot;); scanf(&quot;%ld&quot;, &amp;m1);
printf(&quot;Enter the column dimension of the matrix 1:\t &quot;); scanf(&quot;%ld&quot;, &amp;n1);
printf(&quot;\nEnter the row dimension of the matrix 2:\t &quot;); scanf(&quot;%ld&quot;, &amp;m2);
printf(&quot;Enter the column dimension of the matrix 2:\t &quot;); scanf(&quot;%ld&quot;, &amp;n2);
// Variables used by the clock
clock_t start, end;
double cpu_time_used;
// Necessary condition for matrix multiplication
if (n1 != m2)
{
printf(&quot;\nMuliplication is not possible!\n&quot;);
printf(&quot;Please check the dimensions of the matrices!\n\n&quot;);
}
else
{
char choice;
long (*mat1)[n1] = malloc(m1 * sizeof *mat1);       // Storing the matrix in heap memory
if (mat1 == NULL) exit(1);                          // Freeing up space once matrix is NULL
long (*mat2)[n2] = malloc(m2 * sizeof *mat2);
if (mat2 == NULL) exit(1);
long (*mat3)[n2] = malloc(m1 * sizeof *mat3);
if (mat3 == NULL) exit(1);
// Generating matrix with random elements
for(i=0; i&lt;m1; i++)
for(j=0; j&lt;n1; j++)
mat1[i][j] = (long)rand()%MAX;
// Generating matrix with random elements
for(i=0; i&lt;m2; i++)
for(j=0; j&lt;n2; j++)
mat2[i][j] = (long)rand()%MAX;
printf(&quot;\nMatrices Generated!\n&quot;);
// Initializing the result matrix with zeroes        
for(i=0; i&lt;m1; i++)
for(j=0; j&lt;n2; j++)        
mat3[i][j] = 0;
start = clock();                                    // Starts the clock
// Performing mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
for (i = 0; i &lt; m1; i++)
for (j = 0; j &lt; n2; j++)
for (k = 0; k &lt; m2; k++)
mat3[i][j] += mat1[i][k] * mat2[k][j];
end = clock();                                                  // Stop the clock timer
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;      // Time taken by the program
printf(&quot;\nMultiplication took %f milliseconds to execute.\n\n&quot;, cpu_time_used*1000);
// Freeing up space occupied by the matrices after process completion
free(mat1); free(mat2); free(mat3);
}
return 0;
}

For Java

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Matrix {
private static String s;
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
// Reading the dimensions of matrix 1
System.out.print(&quot;Enter the row dimension of the matrix 1:\t &quot;);
int m1 = Integer.parseInt(br.readLine());
System.out.print(&quot;Enter the column dimension of the matrix 1:\t &quot;);
int n1 = Integer.parseInt(br.readLine());
// Reading the dimensions of matrix 2
System.out.print(&quot;\nEnter the row dimension of the matrix 2:\t &quot;);
int m2 = Integer.parseInt(br.readLine());
System.out.print(&quot;Enter the column dimension of the matrix 2:\t &quot;);
int n2 = Integer.parseInt(br.readLine());
// Print error message if condition not satisfied
if(n1 != m2) {
System.out.println(&quot;\nMuliplication is not possible!&quot;);
System.out.println(&quot;Please check the dimensions of the matrices!\n&quot;);
}
else {
// Declaring matrices
int[][] matrix1 = new int[m1][n1];
int[][] matrix2 = new int[m2][n2];
// Generate random matrices
generateRandom(matrix1);
generateRandom(matrix2);
System.out.println(&quot;\nMatrices Generated!\n&quot;);
long start = System.currentTimeMillis();
// Performs matrix1 x matrix2 = result
int[][] result = new int[m1][n2];
for (int i = 0; i &lt; m1; i++) {
for (int j = 0; j &lt; n2; j++) {
for (int k = 0; k &lt; n1; k++) {
result[i][j] += matrix1[i][k] * matrix2[k][j];
}
}
}
long end = System.currentTimeMillis();
s = &quot;Execution time is &quot; + (end - start);
System.out.print(s + &quot; milliseconds.\n\n&quot;);
}
}
private static void displayMatrix(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i &lt; r; i++) {
for (int j = 0; j &lt; c; j++) {
System.out.print(matrix[i][j] + &quot; &quot;);
}
System.out.println();
}
}
// Generates Random Matrices
private static void generateRandom(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i &lt; r; i++) {
for (int j = 0; j &lt; c; j++) {
// Gives numbers in range (0,10)
matrix[i][j] = (int) (Math.random() * 10);
}
}
}
}

I have put start and stop time just for matrix multiplication

答案3

得分: 0

据我所知,在Java中没有free(或delete)关键字 - 当对象不再被引用时,垃圾收集器会负责进行“清理”。具体发生的时间我们无法确定。因此,你的Java代码与C代码并不完全相同,因为C代码中有一个显式的free调用。如果C代码在计时器运行时没有调用free,比较可能会更加“公平”。

还有一些注意事项:

  • 对于mat3,请使用calloc,并删除初始化mat3的循环。

  • 删除在用于计时的代码中的所有打印输出(无论是C还是Java)。

至少对于C代码,在停止计时器后,你还应该打印一些结果。这是为了防止一个非常聪明的编译器将所有的矩阵操作都优化为“无操作”。但请注意,如果你先调用了free,就无法这样做。

顺便说一下:我认为你已经默认使用了编译C代码时打开了优化选项(例如,-O3) - 如果没有的话,现在是时候了。

英文:

As far as I know there is no free (or delete) in Java - it's left to a garbage collector to "clean up" when an object is no longer referenced. Exactly when that happens, we can't tell. Consequently, your Java code isn't doing exactly the same as your C code as the C code has an explicit free call. Maybe the compare would be "more fair" if the C code didn't call free while the timer is running.

Some other notes:

  • Use calloc for mat3 and remove the loop that initializes mat3

  • Remove all prints in the code that you are using for time measurement (in both C and Java)

At least for the C code, you should also add some prints of the result after stopping the timer. This is to prevent a real clever compiler from optimizing all the matrix stuff to "nothing". But notice that you can't do that if you call free first.

BTW: I take it for granted that you compile the C code with optimization turned on (e.g. -O3) - if not, it's about time.

huangapple
  • 本文由 发表于 2020年10月7日 14:02:27
  • 转载请务必保留本文链接:https://go.coder-hub.com/64238091.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定