Do WebGL/GLSL intermediate variables improve performance with no downsides?
Question
When an expression is used at least twice, does an intermediate variable systematically improve performance with no downsides?
Let's take a practical example:
vec4 right = doubleUV + vec4(size.x, 0.0, 2.0 * size.x, 0.0);
vec4 left = doubleUV - vec4(size.x, 0.0, 2.0 * size.x, 0.0);
vec4 up = doubleUV + vec4(0.0, size.y, 0.0, 2.0 * size.y);
vec4 down = doubleUV - vec4(0.0, size.y, 0.0, 2.0 * size.y);
We can see that 2.0 * size.x and 2.0 * size.y are used multiple times. Intuitively, I would refactor this code with intermediate variables:
float sizex2 = 2.0 * size.x;
vec4 right = doubleUV + vec4(size.x, 0.0, sizex2, 0.0);
vec4 left = doubleUV - vec4(size.x, 0.0, sizex2, 0.0);
float sizey2 = 2.0 * size.y;
vec4 up = doubleUV + vec4(0.0, size.y, 0.0, sizey2);
vec4 down = doubleUV - vec4(0.0, size.y, 0.0, sizey2);
Besides readability, is this a good "no-brainer" performance practice that can be applied systematically? Or should I weigh the cost of an extra multiplication against the cost of allocating a variable?
As a side question: can extra temporary variables hurt performance? This is hard to test, as the GLSL code is intended for WebGL and will be compiled by a wide variety of compilers. Are some GLSL compilers smart enough to merge small redundant pieces of code?
Answer 1
Score: 2
For a simple repeated multiplication, a temporary variable is not going to improve performance. But for more complex operations (such as reciprocals, squares, square roots, and dot products), introducing extra variables can have a noticeable impact on compilers that do not optimize the repeated expression themselves.
I would not worry about the cost of the temporary variables themselves, as they will be stored in the GPU's vector registers.
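To illustrate the kind of "more complex operation" where hoisting is more likely to matter, here is a minimal sketch of a WebGL 1 fragment shader. The uniform `uLightPos`, the varying `vPosition`, and both function names are hypothetical and chosen only for this example; whether a given driver's compiler already merges the repeated work is not guaranteed either way.

```glsl
precision mediump float;

// Hypothetical inputs, named only for this sketch.
uniform vec3 uLightPos;
varying vec3 vPosition;

// Repeated form: the difference vector and length() (a square root)
// are written out several times.
vec3 attenuatedRepeated(vec3 color) {
    vec3 dir = (uLightPos - vPosition) / length(uLightPos - vPosition);
    float falloff = 1.0 / (length(uLightPos - vPosition) * length(uLightPos - vPosition));
    return color * falloff * max(dir.z, 0.0);
}

// Hoisted form: each shared sub-expression is computed once and reused.
vec3 attenuatedHoisted(vec3 color) {
    vec3 toLight = uLightPos - vPosition;
    float distSq = dot(toLight, toLight);   // squared distance, no sqrt needed
    float invDist = inversesqrt(distSq);    // 1 / distance
    vec3 dir = toLight * invDist;           // normalized direction
    float falloff = invDist * invDist;      // 1 / distance^2
    return color * falloff * max(dir.z, 0.0);
}

void main() {
    gl_FragColor = vec4(attenuatedHoisted(vec3(1.0)), 1.0);
}
```

Both functions compute the same value; the hoisted one simply does not rely on the compiler to spot the repeated square root and subtraction.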
But if you add more variables than there are registers, you risk either:
- the shader failing to compile on some GPUs.
- register spilling, where variables have to be stored in and accessed from the GPU execution unit's local memory (see the ARM mobile GPU docs).
As an alternative (clearer, to me at least):
vec4 v1 = vec4(size.x, 0.0, 2.0 * size.x, 0.0);
vec4 right = doubleUV + v1;
vec4 left = doubleUV - v1;
vec4 v2 = vec4(0.0, size.y, 0.0, 2.0 * size.y);
vec4 up = doubleUV + v2;
vec4 down = doubleUV - v2;
Here you can see the GGX shading model and the complexity of the operations that are grouped into variables.