英文:
Why is Java faster than C for matrix multiplication?
问题
我在Java和C中编写了用于矩阵乘法的代码,针对巨大尺寸的矩阵(如2000x2000),并进行了基准测试。我无法理解为什么Java代码比C代码执行得更快...
以下是结果:
**矩阵1大小:** 1000 x 500; **矩阵2大小:** 500 x 1000
Java所用时间:13787 毫秒
C所用时间:20565 毫秒
**矩阵1大小:** 1000 x 1000; **矩阵2大小:** 1000 x 1500
Java所用时间:64636 毫秒
C所用时间:65155 毫秒
以下是我编写的代码:
在C中:
```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10
int main()
{
// 声明变量
long i, j, k, m1, n1, m2, n2;
// 读取矩阵的维度
printf("输入矩阵1的行维度:\t "); scanf("%ld", &m1);
printf("输入矩阵1的列维度:\t "); scanf("%ld", &n1);
printf("\n输入矩阵2的行维度:\t "); scanf("%ld", &m2);
printf("输入矩阵2的列维度:\t "); scanf("%ld", &n2);
// 时钟使用的变量
clock_t start, end;
double cpu_time_used;
// 矩阵相乘的必要条件
if (n1 != m2)
{
printf("\n无法进行乘法运算!\n");
printf("请检查矩阵的维度!\n\n");
}
else
{
char choice;
start = clock(); // 启动时钟
long (*mat1)[n1] = malloc(m1 * sizeof *mat1); // 在堆内存中存储矩阵
if (mat1 == NULL) exit(1); // 一旦矩阵为NULL,释放空间
long (*mat2)[n2] = malloc(m2 * sizeof *mat2);
if (mat2 == NULL) exit(1);
long (*mat3)[n2] = malloc(m1 * sizeof *mat3);
if (mat3 == NULL) exit(1);
// 生成具有随机元素的矩阵
for(i=0; i<m1; i++)
for(j=0; j<n1; j++)
mat1[i][j] = (long)rand()%MAX;
// 生成具有随机元素的矩阵
for(i=0; i<m2; i++)
for(j=0; j<n2; j++)
mat2[i][j] = (long)rand()%MAX;
printf("\n矩阵已生成!\n");
// 用零初始化结果矩阵
for(i=0; i<m1; i++)
for(j=0; j<n2; j++)
mat3[i][j] = 0;
// 执行 mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
for (i = 0; i < m1; i++)
for (j = 0; j < n2; j++)
for (k = 0; k < m2; k++)
mat3[i][j] += mat1[i][k] * mat2[k][j];
// 在进程完成后释放矩阵占用的空间
free(mat1); free(mat2); free(mat3);
end = clock(); // 停止时钟计时
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC; // 程序所需时间
printf("\n乘法运算花费 %f 毫秒执行。\n\n", cpu_time_used*1000);
}
return 0;
}
在Java中:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Matrix {
private static String s;
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
// 读取矩阵1的维度
System.out.print("输入矩阵1的行维度:\t ");
int m1 = Integer.parseInt(br.readLine());
System.out.print("输入矩阵1的列维度:\t ");
int n1 = Integer.parseInt(br.readLine());
// 读取矩阵2的维度
System.out.print("\n输入矩阵2的行维度:\t ");
int m2 = Integer.parseInt(br.readLine());
System.out.print("输入矩阵2的列维度:\t ");
int n2 = Integer.parseInt(br.readLine());
// 如果不满足条件,则显示错误消息
if(n1 != m2) {
System.out.println("\n无法进行乘法运算!");
System.out.println("请检查矩阵的维度!\n");
}
else {
long start = System.currentTimeMillis();
// 声明矩阵
int[][] matrix1 = new int[m1][n1];
int[][] matrix2 = new int[m2][n2];
// 生成随机矩阵
generateRandom(matrix1);
generateRandom(matrix2);
System.out.println("\n矩阵已生成!\n");
// 执行矩阵1 x 矩阵2 = 结果
int[][] result = new int[m1][n2];
for (int i = 0; i < m1; i++) {
for (int j = 0; j < n2; j++) {
for (int k = 0; k < n1; k++) {
result[i][j] += matrix1[i][k] * matrix2[k][j];
}
}
}
long end = System.currentTimeMillis();
s = "执行时间为 " + (end - start);
System.out.print(s + "
<details>
<summary>英文:</summary>
I wrote codes for multiplying matrices of huge sizes (such as 2000x2000) in Java and C to benchmark them as an activity. I cannot understand why Java code executes faster than C code...
*Here are the results:*
**Matrix 1 size:** 1000 x 500; **Matrix 2 size:** 500 x 1000
Time Taken by Java : 13787 ms
Time Taken by C: 20565 ms
**Matrix 1 size:** 1000 x 1000; **Matrix 2 size:** 1000 x 1500
Time Taken by Java : 64636 ms
Time Taken by C: 65155 ms
*Here are the codes I wrote:*
In C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10
int main()
{
// Declaring the variables
long i, j, k, m1, n1, m2, n2;
// Reading the dimensions of the matrices
printf("Enter the row dimension of the matrix 1:\t "); scanf("%ld", &m1);
printf("Enter the column dimension of the matrix 1:\t "); scanf("%ld", &n1);
printf("\nEnter the row dimension of the matrix 2:\t "); scanf("%ld", &m2);
printf("Enter the column dimension of the matrix 2:\t "); scanf("%ld", &n2);
// Variables used by the clock
clock_t start, end;
double cpu_time_used;
// Necessary condition for matrix multiplication
if (n1 != m2)
{
printf("\nMuliplication is not possible!\n");
printf("Please check the dimensions of the matrices!\n\n");
}
else
{
char choice;
start = clock(); // Starts the clock
long (*mat1)[n1] = malloc(m1 * sizeof *mat1); // Storing the matrix in heap memory
if (mat1 == NULL) exit(1); // Freeing up space once matrix is NULL
long (*mat2)[n2] = malloc(m2 * sizeof *mat2);
if (mat2 == NULL) exit(1);
long (*mat3)[n2] = malloc(m1 * sizeof *mat3);
if (mat3 == NULL) exit(1);
// Generating matrix with random elements
for(i=0; i<m1; i++)
for(j=0; j<n1; j++)
mat1[i][j] = (long)rand()%MAX;
// Generating matrix with random elements
for(i=0; i<m2; i++)
for(j=0; j<n2; j++)
mat2[i][j] = (long)rand()%MAX;
printf("\nMatrices Generated!\n");
// Initializing the result matrix with zeroes
for(i=0; i<m1; i++)
for(j=0; j<n2; j++)
mat3[i][j] = 0;
// Performing mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
for (i = 0; i < m1; i++)
for (j = 0; j < n2; j++)
for (k = 0; k < m2; k++)
mat3[i][j] += mat1[i][k] * mat2[k][j];
// Freeing up space occupied by the matrices after process completion
free(mat1); free(mat2); free(mat3);
end = clock(); // Stop the clock timer
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC; // Time taken by the program
printf("\nMultiplication took %f milliseconds to execute.\n\n", cpu_time_used*1000);
}
return 0;
}
In Java:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Matrix {
private static String s;
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
// Reading the dimensions of matrix 1
System.out.print("Enter the row dimension of the matrix 1:\t ");
int m1 = Integer.parseInt(br.readLine());
System.out.print("Enter the column dimension of the matrix 1:\t ");
int n1 = Integer.parseInt(br.readLine());
// Reading the dimensions of matrix 2
System.out.print("\nEnter the row dimension of the matrix 2:\t ");
int m2 = Integer.parseInt(br.readLine());
System.out.print("Enter the column dimension of the matrix 2:\t ");
int n2 = Integer.parseInt(br.readLine());
// Print error message if condition not satisfied
if(n1 != m2) {
System.out.println("\nMuliplication is not possible!");
System.out.println("Please check the dimensions of the matrices!\n");
}
else {
long start = System.currentTimeMillis();
// Declaring matrices
int[][] matrix1 = new int[m1][n1];
int[][] matrix2 = new int[m2][n2];
// Generate random matrices
generateRandom(matrix1);
generateRandom(matrix2);
System.out.println("\nMatrices Generated!\n");
// Performs matrix1 x matrix2 = result
int[][] result = new int[m1][n2];
for (int i = 0; i < m1; i++) {
for (int j = 0; j < n2; j++) {
for (int k = 0; k < n1; k++) {
result[i][j] += matrix1[i][k] * matrix2[k][j];
}
}
}
long end = System.currentTimeMillis();
s = "Execution time is " + (end - start);
System.out.print(s + " milliseconds.\n\n");
}
}
private static void displayMatrix(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i < r; i++) {
for (int j = 0; j < c; j++) {
System.out.print(matrix[i][j] + " ");
}
System.out.println();
}
}
// Generates Random Matrices
private static void generateRandom(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i < r; i++) {
for (int j = 0; j < c; j++) {
// Gives numbers in range (0,10)
matrix[i][j] = (int) (Math.random() * 10);
}
}
}
}
I'm running these on Windows 10, with Latest Versions of MinGW and JDK.
</details>
# 答案1
**得分**: 2
你正在将橙子与装满苹果的篮子进行比较。你甚至有没有考虑到`long`可能是*64*位,而Java的`int`是32位。
另一个原因可能是你没有要求C编译器优化代码。Java的*JIT*编译器会实时优化代码,而默认情况下,GCC进行**快速编译**以生成**调试方便**但**代码较慢**的代码。这里没有JIT优化。从未有人对未经优化的代码进行基准测试,就好比在20英里每小时的速度限制下测试法拉利一样。
在没有任何优化的情况下,我得到Java需要3216毫秒,C需要7670毫秒来处理1000x1000的矩阵。使用-O3参数后,C的运行时间降至2225毫秒。
**但这仍然是橙子与装满苹果的篮子之间的比较。当我将Java改为使用`long[][]`时,时间变为8300毫秒。**
<details>
<summary>英文:</summary>
You're comparing oranges to baskets of apples. Did you even take into account that `long` could be *64* bits whereas Java `int` is 32 bits.
Another reason could be that you didn't ask your C compiler to optimize the code. Java *JIT* compiler optimizes the code in time, whereas by *default* GCC does **fast compilation** producing **slow code** for **easy debugging**. There is no JIT optimization. No one **ever** benchmarks unoptimized code, it is as if testing a Ferrari at 20 mph speed limit.
Without any optimizations, I get 3216 milliseconds for Java, 7670 for C for 1000x1000 matrices. With -O3 C running time drops to 2225 ms.
**But it is still oranges to baskets of apples comparison. After I change Java to use `long[][]` then it takes 8300 milliseconds.**
</details>
# 答案2
**得分**: 0
以下是修改后的C和Java代码,仅保留了矩阵乘法部分的计时:
对于C:
```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10
int main()
{
// ...(略去前面的代码)
else
{
// ...(略去前面的代码)
// Initializing the result matrix with zeroes
for(i=0; i<m1; i++)
for(j=0; j<n2; j++)
mat3[i][j] = 0;
start = clock(); // Starts the clock
// Performing mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
for (i = 0; i < m1; i++)
for (j = 0; j < n2; j++)
for (k = 0; k < m2; k++)
mat3[i][j] += mat1[i][k] * mat2[k][j];
end = clock(); // Stop the clock timer
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC; // Time taken by the program
printf("\nMultiplication took %f milliseconds to execute.\n\n", cpu_time_used*1000);
// Freeing up space occupied by the matrices after process completion
free(mat1); free(mat2); free(mat3);
}
return 0;
}
对于Java:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Matrix {
private static String s;
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
// ...(略去前面的代码)
else {
// ...(略去前面的代码)
long start = System.currentTimeMillis();
// Performs matrix1 x matrix2 = result
int[][] result = new int[m1][n2];
for (int i = 0; i < m1; i++) {
for (int j = 0; j < n2; j++) {
for (int k = 0; k < n1; k++) {
result[i][j] += matrix1[i][k] * matrix2[k][j];
}
}
}
long end = System.currentTimeMillis();
s = "Execution time is " + (end - start);
System.out.print(s + " milliseconds.\n\n");
}
}
// ...(略去中间的代码)
}
请注意,我已将前面的代码略去,只保留了矩阵乘法部分的计时。如果您有任何其他问题或需要进一步的帮助,请告诉我。
英文:
You should compare only multiplication part if you want to find out the matrix multiplication time only ... not its creation, generation, multiplication and deletion. Change your code as
For C
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10
int main()
{
// Declaring the variables
long i, j, k, m1, n1, m2, n2;
// Reading the dimensions of the matrices
printf("Enter the row dimension of the matrix 1:\t "); scanf("%ld", &m1);
printf("Enter the column dimension of the matrix 1:\t "); scanf("%ld", &n1);
printf("\nEnter the row dimension of the matrix 2:\t "); scanf("%ld", &m2);
printf("Enter the column dimension of the matrix 2:\t "); scanf("%ld", &n2);
// Variables used by the clock
clock_t start, end;
double cpu_time_used;
// Necessary condition for matrix multiplication
if (n1 != m2)
{
printf("\nMuliplication is not possible!\n");
printf("Please check the dimensions of the matrices!\n\n");
}
else
{
char choice;
long (*mat1)[n1] = malloc(m1 * sizeof *mat1); // Storing the matrix in heap memory
if (mat1 == NULL) exit(1); // Freeing up space once matrix is NULL
long (*mat2)[n2] = malloc(m2 * sizeof *mat2);
if (mat2 == NULL) exit(1);
long (*mat3)[n2] = malloc(m1 * sizeof *mat3);
if (mat3 == NULL) exit(1);
// Generating matrix with random elements
for(i=0; i<m1; i++)
for(j=0; j<n1; j++)
mat1[i][j] = (long)rand()%MAX;
// Generating matrix with random elements
for(i=0; i<m2; i++)
for(j=0; j<n2; j++)
mat2[i][j] = (long)rand()%MAX;
printf("\nMatrices Generated!\n");
// Initializing the result matrix with zeroes
for(i=0; i<m1; i++)
for(j=0; j<n2; j++)
mat3[i][j] = 0;
start = clock(); // Starts the clock
// Performing mat1[m1][n1] x mat2[m2][n2] = mat3[m1][n2]
for (i = 0; i < m1; i++)
for (j = 0; j < n2; j++)
for (k = 0; k < m2; k++)
mat3[i][j] += mat1[i][k] * mat2[k][j];
end = clock(); // Stop the clock timer
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC; // Time taken by the program
printf("\nMultiplication took %f milliseconds to execute.\n\n", cpu_time_used*1000);
// Freeing up space occupied by the matrices after process completion
free(mat1); free(mat2); free(mat3);
}
return 0;
}
For Java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Matrix {
private static String s;
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
// Reading the dimensions of matrix 1
System.out.print("Enter the row dimension of the matrix 1:\t ");
int m1 = Integer.parseInt(br.readLine());
System.out.print("Enter the column dimension of the matrix 1:\t ");
int n1 = Integer.parseInt(br.readLine());
// Reading the dimensions of matrix 2
System.out.print("\nEnter the row dimension of the matrix 2:\t ");
int m2 = Integer.parseInt(br.readLine());
System.out.print("Enter the column dimension of the matrix 2:\t ");
int n2 = Integer.parseInt(br.readLine());
// Print error message if condition not satisfied
if(n1 != m2) {
System.out.println("\nMuliplication is not possible!");
System.out.println("Please check the dimensions of the matrices!\n");
}
else {
// Declaring matrices
int[][] matrix1 = new int[m1][n1];
int[][] matrix2 = new int[m2][n2];
// Generate random matrices
generateRandom(matrix1);
generateRandom(matrix2);
System.out.println("\nMatrices Generated!\n");
long start = System.currentTimeMillis();
// Performs matrix1 x matrix2 = result
int[][] result = new int[m1][n2];
for (int i = 0; i < m1; i++) {
for (int j = 0; j < n2; j++) {
for (int k = 0; k < n1; k++) {
result[i][j] += matrix1[i][k] * matrix2[k][j];
}
}
}
long end = System.currentTimeMillis();
s = "Execution time is " + (end - start);
System.out.print(s + " milliseconds.\n\n");
}
}
private static void displayMatrix(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i < r; i++) {
for (int j = 0; j < c; j++) {
System.out.print(matrix[i][j] + " ");
}
System.out.println();
}
}
// Generates Random Matrices
private static void generateRandom(int[][] matrix) {
int r = matrix.length;
int c = matrix[0].length;
for (int i = 0; i < r; i++) {
for (int j = 0; j < c; j++) {
// Gives numbers in range (0,10)
matrix[i][j] = (int) (Math.random() * 10);
}
}
}
}
I have put start and stop time just for matrix multiplication
答案3
得分: 0
据我所知,在Java中没有free
(或delete
)关键字 - 当对象不再被引用时,垃圾收集器会负责进行“清理”。具体发生的时间我们无法确定。因此,你的Java代码与C代码并不完全相同,因为C代码中有一个显式的free
调用。如果C代码在计时器运行时没有调用free
,比较可能会更加“公平”。
还有一些注意事项:
-
对于
mat3
,请使用calloc
,并删除初始化mat3
的循环。 -
删除在用于计时的代码中的所有打印输出(无论是C还是Java)。
至少对于C代码,在停止计时器后,你还应该打印一些结果。这是为了防止一个非常聪明的编译器将所有的矩阵操作都优化为“无操作”。但请注意,如果你先调用了free
,就无法这样做。
顺便说一下:我认为你已经默认使用了编译C代码时打开了优化选项(例如,-O3) - 如果没有的话,现在是时候了。
英文:
As far as I know there is no free
(or delete
) in Java - it's left to a garbage collector to "clean up" when an object is no longer referenced. Exactly when that happens, we can't tell. Consequently, your Java code isn't doing exactly the same as your C code as the C code has an explicit free
call. Maybe the compare would be "more fair" if the C code didn't call free
while the timer is running.
Some other notes:
-
Use
calloc
format3
and remove the loop that initializesmat3
-
Remove all prints in the code that you are using for time measurement (in both C and Java)
At least for the C code, you should also add some prints of the result after stopping the timer. This is to prevent a real clever compiler from optimizing all the matrix stuff to "nothing". But notice that you can't do that if you call free
first.
BTW: I take it for granted that you compile the C code with optimization turned on (e.g. -O3) - if not, it's about time.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论