Inheritance vs code generation for better performance?
Question
I'm trying to get max performance from a piece of code that looks somewhat like this:

    interface DbStream {
        void writeInt(int x);
        void writeString(String s);
        // etc, somewhere around 20 different types
    }

    interface Writer {
        void write(DbStream stream, Object value);
    }

    Writer[] writers = new Writer[NUM_COLS];
    DbStream stream;
    Object[][] src = new Object[NUM_ROWS][NUM_COLS];
    for (int row = 0; row < NUM_ROWS; row++) {
        for (int col = 0; col < NUM_COLS; col++) {
            writers[col].write(stream, src[row][col]);
        }
    }

Each implementation of the `Writer` interface does the necessary conversions and calls the proper method of `DbStream`. The code uses inheritance, so these calls aren't inlined. Will there be a performance improvement if the inner loop is manually unrolled to contain 200-300 calls to static methods? The program will use JDK 13, if it makes any difference.
Answer 1
Score: 3
> I'm trying to get max performance from a piece of code ...
That is typically the wrong approach. A better approach is to optimize the code that is demonstrably the performance bottleneck for your application. Also, "maximum" is not a good goal. A better goal is "good enough". (Up to a point, software developer time is more expensive than CPU time. And it is certainly a more scarce commodity!)
Here's what I recommend that you do.
- Get your application feature complete and working.
- Create a realistic benchmark that exercises this code using real data.
- Profile the application running the benchmark to measure what percentage of time your application spends in this part of the code.
- Estimate the potential performance you could get by optimizing. For example, if inlining these calls improves this code by 10%, and this code represents 5% of the total application CPU time, then you would get an overall CPU performance increase of 0.5% from this optimization.
- Now decide:
  - Is the possible / likely performance increase worth the development effort?
  - Is it worth the (hypothetical) hit on system maintainability?
- If yes: do the optimization and measure it.
  - Did you actually achieve the performance you expected?
  - Was the "damage" to maintainability worth it?
(If minimizing CPU time is not your goal, adjust the methodology accordingly. For example, if you want to minimize request times, then you also need to take into account the time taken by the backend database, and so on.)
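The estimate in the fourth step is plain arithmetic, and a rough sketch of it could look like this. The names `SpeedupEstimate` and `overallTimeSaved` are made up for illustration, and the 5% and 10% inputs are the example figures from the list above, not measurements; in practice both numbers would come from your profiler and benchmark.

```java
import java.util.Locale;

// Hypothetical helper, not part of any library: estimates how much of the
// whole application's CPU time an optimization of one code region saves.
final class SpeedupEstimate {

    /**
     * @param fractionOfTotal  share of total CPU time spent in the region (0..1)
     * @param localImprovement share of the region's own time the optimization removes (0..1)
     * @return share of total CPU time saved (0..1)
     */
    static double overallTimeSaved(double fractionOfTotal, double localImprovement) {
        return fractionOfTotal * localImprovement;
    }

    public static void main(String[] args) {
        // Example figures from the text: the region is 5% of CPU time and gets 10% faster.
        double saved = overallTimeSaved(0.05, 0.10);
        // Prints: Overall CPU time saved: 0.50%
        System.out.printf(Locale.ROOT, "Overall CPU time saved: %.2f%%%n", saved * 100);
    }
}
```

If the profiler shows the region at, say, 60% of CPU time instead of 5%, the same 10% local gain becomes 6% overall, which is why the profiling step has to come before the decision.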
In this case, my gut feeling is that your proposed optimization would probably make only a small performance difference. Manually inlining the calls could shave off a few (say 3 or 4) machine instructions per call. However, I doubt that it would be significant to the overall application performance.
Answer 2
Score: 0
The JDK specs make no particular promise about performance, nor do they prescribe an implementation for either of these two approaches that would make one of them an obvious performance hole.
In other words: it depends on the JDK version, edition, architecture, OS, and the phase of the moon.
Use something like JMH to get an actual result, and run it as close to the actual 'real life' circumstances as you can manage (run it on the same hardware under a similar load and make sure the code you test is as similar as you can make it).
Given that this code seems to 'write to a DB', I bet none of it will matter, as the DB, or alternatively the infrastructure around it (such as making TCP packets to send them to the DB, even if it is running locally and is an in-memory implementation), will be multiple orders of magnitude slower than any of this. I predict your JMH result will be very close (close enough that whatever difference you observe is statistical noise).
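To make the two variants from the question concrete, here is a rough plain-Java sketch that runs both and checks that they produce the same output. It is not a JMH benchmark and its timings should not be trusted: it has no forking, no proper warm-up control, and no statistics, which is what JMH is for. Everything beyond `DbStream` and `Writer` (`ChecksumStream`, `IntWriter`, `StringWriter`, the static helpers, the two-column layout) is an assumption made up for illustration. Note also that with only two `Writer` implementations the JIT has a much easier job at the call site than it would with around 20, so a realistic test needs the real mix of writers.

```java
import java.util.Locale;

final class DispatchComparison {
    // Stand-ins for the question's interfaces; only two of the ~20 types are modeled.
    interface DbStream {
        void writeInt(int x);
        void writeString(String s);
    }

    interface Writer {
        void write(DbStream stream, Object value);
    }

    // Hypothetical sink: folds every write into a checksum so the work cannot be discarded.
    static final class ChecksumStream implements DbStream {
        long checksum;
        @Override public void writeInt(int x) { checksum = checksum * 31 + x; }
        @Override public void writeString(String s) { checksum = checksum * 31 + s.length(); }
    }

    static final class IntWriter implements Writer {
        @Override public void write(DbStream stream, Object value) { stream.writeInt((Integer) value); }
    }

    static final class StringWriter implements Writer {
        @Override public void write(DbStream stream, Object value) { stream.writeString((String) value); }
    }

    // Static equivalents, as a code generator might emit them.
    static void writeIntCol(DbStream stream, Object value) { stream.writeInt((Integer) value); }
    static void writeStringCol(DbStream stream, Object value) { stream.writeString((String) value); }

    // Variant A: the loop from the question, dispatching through the Writer interface.
    static long viaInterface(Writer[] writers, Object[][] src) {
        ChecksumStream stream = new ChecksumStream();
        for (Object[] row : src) {
            for (int col = 0; col < writers.length; col++) {
                writers[col].write(stream, row[col]);
            }
        }
        return stream.checksum;
    }

    // Variant B: inner loop unrolled by hand into static calls (two columns here, not 200-300).
    static long viaStaticCalls(Object[][] src) {
        ChecksumStream stream = new ChecksumStream();
        for (Object[] row : src) {
            writeIntCol(stream, row[0]);
            writeStringCol(stream, row[1]);
        }
        return stream.checksum;
    }

    // Builds rows of (Integer, String) matching the two writers above.
    static Object[][] sampleRows(int numRows) {
        Object[][] src = new Object[numRows][];
        for (int row = 0; row < numRows; row++) {
            src[row] = new Object[] { row, "row-" + row };
        }
        return src;
    }

    public static void main(String[] args) {
        Writer[] writers = { new IntWriter(), new StringWriter() };
        Object[][] src = sampleRows(100_000);
        // Repeat several times; the early rounds mostly measure interpretation and JIT compilation.
        for (int round = 0; round < 10; round++) {
            long t0 = System.nanoTime();
            long a = viaInterface(writers, src);
            long t1 = System.nanoTime();
            long b = viaStaticCalls(src);
            long t2 = System.nanoTime();
            System.out.printf(Locale.ROOT, "round %d: interface %d us, static %d us, same output: %b%n",
                    round, (t1 - t0) / 1_000, (t2 - t1) / 1_000, a == b);
        }
    }
}
```

Porting `viaInterface` and `viaStaticCalls` into JMH `@Benchmark` methods, with the real writers and a realistic `DbStream`, would be the way to get numbers worth acting on.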