Calling tensor.packed_accessor32() throws memory error


Problem

This issue is occurring because you are trying to create packed tensor accessors for vertices inside the measure_distance_cuda function, but the tensor vertices was originally created on the CPU (torch::kCUDA was not specified when creating it). To use GPU-specific operations like packed tensor accessors, the tensor needs to be on the GPU.

You can fix this by ensuring that vertices is on the GPU before creating the accessor. Here's the modified code:

at::Tensor measure_distance_cuda(at::Tensor vertices) {
    // Move vertices to the CUDA device
    vertices = vertices.to(torch::kCUDA);

    // Rest of your code remains the same
    // ...

    // Now, you can safely create packed tensor accessors for vertices
    at::PackedTensorAccessor32<float_t, 2> vert_acc = vertices.packed_accessor32<float_t, 2>();

    // ...

    return distances;
}

By moving vertices to the CUDA device using to(torch::kCUDA), you ensure that it's compatible with GPU operations, and you should no longer encounter the error when creating the accessor.
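
If a silent host-to-device copy is not wanted, the same precondition can instead be enforced with a check at the top of the function. A small sketch of such a guard, assuming the rest of measure_distance_cuda stays as above (the message text is only illustrative):

// Raise a c10::Error with a readable message if `vertices` is not already on the GPU.
TORCH_CHECK(vertices.is_cuda(),
            "measure_distance_cuda: expected `vertices` on a CUDA device, but got ",
            vertices.device());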


Problem Summary

Inside my main method, I create some tensors and pass them to the function measure_distance_cuda. From there, I try to create accessors to pass to a kernel that I've written (removed for a minimal working example). However, when creating the accessors using tensor.packed_accessor32<>(), I get the following runtime error coming from TensorBase.h:

Exception has occurred: CPP/c10::Error
Unhandled exception at 0x00007FF8071ECF19 in cuda_test.exe: Microsoft C++ exception: c10::Error at memory location 0x0000004A42CFE4F0.

What I've tried:

My first thought was that memory errors are weird and can point to the wrong line, so I removed the call to the CUDA kernel that would actually use the accessors, meaning no indexing occurs at all. However, the error persists.

Minimal reproducible code

My main function:

#include <iostream>

#include <ATen/ATen.h>
#include <torch/types.h>

#include "raycast_cuda.cuh"

int main() {

    auto vert_options = at::TensorOptions().dtype(torch::kFloat64).device(torch::kCUDA);

    torch::Tensor vertices = torch::tensor(
        {{-1, 1, 0},
         {1, 1, 0},
         {-1, -1, 0}}, vert_options
    );

    at::Tensor distances = measure_distance_cuda(vertices);
    std::cout << distances << std::endl;
}

raycast_cuda.cu

#include <cuda.h>
#include <cuda_runtime.h>

#include <ATen/ATen.h>
#include <torch/types.h>

__host__
at::Tensor measure_distance_cuda(at::Tensor vertices) {

    // n_rays and n_faces come from code removed for this minimal example

    // get return tensor and accessor ****NO ERROR HERE****
    at::TensorOptions return_tensor_options = at::TensorOptions().device(torch::kCUDA);
    at::Tensor distances = at::zeros({n_rays, n_faces}, return_tensor_options);
    at::PackedTensorAccessor32<float_t, 2> d_acc = distances.packed_accessor32<float_t, 2>();

    // get accessors for inputs ****ERROR HAPPENS HERE****
    at::PackedTensorAccessor32<float_t, 2> vert_acc = vertices.packed_accessor32<float_t, 2>();

    return distances;
}
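
Side note on reading the error: the c10::Error raised in TensorBase.h carries a human-readable message that the unhandled-exception dialog does not show. A minimal sketch of the main() above with a try/catch added around the call (the error-handling lines are an addition for illustration):

#include <iostream>

#include <ATen/ATen.h>
#include <torch/types.h>
#include <c10/util/Exception.h>  // c10::Error

#include "raycast_cuda.cuh"

int main() {
    auto vert_options = at::TensorOptions().dtype(torch::kFloat64).device(torch::kCUDA);

    torch::Tensor vertices = torch::tensor(
        {{-1, 1, 0},
         {1, 1, 0},
         {-1, -1, 0}}, vert_options
    );

    try {
        at::Tensor distances = measure_distance_cuda(vertices);
        std::cout << distances << std::endl;
    } catch (const c10::Error &e) {
        // what() contains the ATen error text (and, depending on the build, a backtrace).
        std::cerr << e.what() << std::endl;
        return 1;
    }
    return 0;
}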

Some thoughts:

  • I noted that creating an accessor for the return tensor (distances) gives me no issues; it's only angry at me for trying it on the tensors I passed into the function. So I'm suspicious that I'm doing something in the wrong scope (a quick way to compare the two tensors is sketched below).
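
A couple of hypothetical debug lines inside measure_distance_cuda, just before the accessor calls, make the relevant difference between the two tensors visible (requires <iostream>):

// Compare dtype and device of the tensor that works and the one that throws.
std::cout << "distances: " << distances.scalar_type() << " on " << distances.device() << std::endl;
std::cout << "vertices:  " << vertices.scalar_type()  << " on " << vertices.device()  << std::endl;
// With the code above this prints something like:
//   distances: Float on cuda:0
//   vertices:  Double on cuda:0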

Why is this happening?

Answer 1

Score: 1


I got a quick answer on the PyTorch forums... The answer was simple: I was declaring my inputs as kFloat64, which corresponds to double_t, not float_t.

auto vert_options = at::TensorOptions().dtype(torch::kFloat64).device(torch::kCUDA);

should be

auto vert_options = at::TensorOptions().dtype(torch::kFloat32).device(torch::kCUDA);

so that I can call

at::PackedTensorAccessor32<float_t, 2> vert_acc = vertices.packed_accessor32<float_t, 2>();
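
If double precision is actually wanted, the accessor's template argument can instead be matched to the tensor, e.g. vertices.packed_accessor32<double, 2>(), or the scalar type can be picked at run time with ATen's dispatch macro. A hedged sketch of the latter; n_rays and n_faces are passed as parameters here only to keep the sketch self-contained:

#include <ATen/ATen.h>
#include <ATen/Dispatch.h>
#include <torch/types.h>

at::Tensor measure_distance_cuda(at::Tensor vertices, int64_t n_rays, int64_t n_faces) {
    // Give the output the same dtype and device as the input so one scalar type fits both.
    at::Tensor distances = at::zeros({n_rays, n_faces}, vertices.options());

    // AT_DISPATCH_FLOATING_TYPES instantiates the lambda for float and double and
    // binds scalar_t to whichever of the two `vertices` actually holds.
    AT_DISPATCH_FLOATING_TYPES(vertices.scalar_type(), "measure_distance_cuda", [&] {
        auto vert_acc = vertices.packed_accessor32<scalar_t, 2>();
        auto d_acc = distances.packed_accessor32<scalar_t, 2>();
        // ... launch the CUDA kernel here, templated on scalar_t ...
        (void)vert_acc;
        (void)d_acc;
    });

    return distances;
}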
