Inlining a rvar to a var stage in Halide
Question
I'm trying to inline an RVar into a regular Var stage.

Let's say I have an input vector of some size. I would like to multiply each element by 2 and also sum all of its elements, something like:

```cpp
Input<Buffer<int>> vector{"vector", 1};
Output<Buffer<int>> output{"output", 1};
Output<Func> sum_elements{"sum_elements", Int(32), 0};

Var i;
RDom r(0, vector.length());

output(i) = 2 * vector(i);

sum_elements() = 0;
sum_elements() += vector(r.x);
```

I would like to know whether it's possible to schedule `sum_elements` to be computed along with `output`, even if that is not optimal:

```cpp
output.compute_root(); // or any other form of schedule
// somehow create a link between r.x and i, and then:
sum_elements.update().compute_with(output, i);
```

so that the equivalent C would be:

```cpp
for (int i = 0; i < vector.length(); ++i)
{
    output(i) = 2 * vector(i);
    sum_elements() += vector(i);
}
```
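For reference, the fused loop being asked for can be sketched in plain C++ without Halide. This is only an illustration of the target loop nest, not Halide code: the helper name `double_and_sum_fused` and the use of `std::vector<int>` in place of the Halide buffers are assumptions made for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Halide): a single pass over the input that
// both writes the doubled elements and accumulates their sum.
std::pair<std::vector<int>, int> double_and_sum_fused(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    int sum_elements = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = 2 * input[i];   // corresponds to output(i) = 2 * vector(i)
        sum_elements += input[i];   // corresponds to sum_elements() += vector(r.x)
    }
    return {std::move(output), sum_elements};
}
```

The point of the schedule in the question is to get Halide to emit one loop of this shape instead of two separate loops over the input.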
Answer 1
Score: 1
An alternative approach that might be acceptable is to use the `RDom` to define `output` as well:

```cpp
Input<Buffer<int>> vector{"vector", 1};
Output<Buffer<int>> output{"output", 1};
Output<Func> sum_elements{"sum_elements", Int(32), 0};

Var i;
RDom r(0, vector.length());

// Pure definition (initial value), followed by an update over the RDom.
output(i) = 0;
// The 0 * output(r.x - 1) term is a deliberate no-op; see the note below.
output(r.x) = 2 * vector(r.x) + 0 * output(r.x - 1);

sum_elements() = 0;
sum_elements() += vector(r.x);

output.compute_root();
sum_elements.compute_root();
// Fuse the update loop of sum_elements with the update loop of output at r.x.
sum_elements.update().compute_with(output.update(), r.x);
```

Note that without the no-op `0 * output(r.x - 1)`, the above fails with the following error message:

```
Invalid compute_with: types of dim 0 of output.s1(r5$x is PureRVar)
and sum_elements.s1(r5$x is ImpureRVar) do not match.
```

There is probably a more elegant way to fix this, but the workaround is effectively removed during compilation. The no-op term seems to turn the use of the `RVar` into something impure (my knowledge of the difference between `PureRVar` and `ImpureRVar` is limited, unfortunately).
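As a rough illustration of the pure/impure distinction (this is a reading of Halide's scheduling documentation, not something the answer itself confirms): an RVar appears to be treated as pure when each iteration only touches its own output element, so the visiting order does not matter, and as impure when iterations depend on one another and must run in increasing order. The plain C++ sketch below, which uses no Halide and whose helper names `elementwise` and `running_sum` are hypothetical, mimics the two cases by visiting the indices forwards or backwards.

```cpp
#include <cstddef>
#include <vector>

// Iteration r writes only out[r]: any visiting order yields the same result.
std::vector<int> elementwise(const std::vector<int>& v, bool reversed) {
    std::vector<int> out(v.size(), 0);
    for (std::size_t k = 0; k < v.size(); ++k) {
        std::size_t r = reversed ? v.size() - 1 - k : k;
        out[r] = 2 * v[r];
    }
    return out;
}

// Iteration r reads out[r - 1]: the result depends on the visiting order,
// so the loop is only meaningful when run in increasing r.
std::vector<int> running_sum(const std::vector<int>& v, bool reversed) {
    std::vector<int> out(v.size(), 0);
    for (std::size_t k = 0; k < v.size(); ++k) {
        std::size_t r = reversed ? v.size() - 1 - k : k;
        int prev = (r > 0) ? out[r - 1] : 0;
        out[r] = v[r] + prev;
    }
    return out;
}
```

Under this reading, `output(r.x) = 2 * vector(r.x)` on its own behaves like `elementwise`, while adding a reference to `output(r.x - 1)` makes it look like `running_sum` to the compiler, which would explain why both stages then report an impure RVar and `compute_with` accepts the pairing.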