String Manipulation in Numba Cuda: Clip first k characters from a string array, k comes from another array
Question
We have two arrays: arr1 (which has string elements) and arr2 (which has integers). I want to clip the first arr2[i] characters from arr1[i]. These arrays are very large, so I want to implement this in Numba CUDA. A plain Python implementation is as follows:
    arr1 = ['abc', 'def', 'xyz']
    arr2 = [1, 2, 3]

    def python_clipper(arr1, arr2):
        for i in range(len(arr1)):
            arr1[i] = arr1[i][arr2[i]:]
        return arr1

    print(python_clipper(arr1, arr2))  # ['bc', 'f', '']
The above implementation works fine. But when I turn this Python function into a CUDA kernel like so:
    from numba import cuda

    @cuda.jit()
    def cuda_clipper(arr1, arr2):
        i = cuda.grid(1)
        arr1[i] = arr1[i][arr2[i]:]

    blockspergrid, threadsperblock = len(arr1), 1
    cuda_clipper[blockspergrid, threadsperblock](arr1, arr2)  # hoped-for result: ['bc', 'f', '']
    print(arr1)
I get the following error:
    numba.core.errors.TypingError: Failed in cuda mode pipeline (step: nopython frontend)
    Failed in nopython mode pipeline (step: nopython frontend)
    No implementation of function Function(<function _empty_string at 0x7f0456884d30>) found for signature:

     >>> _empty_string(int64, int64, bool)

    There are 2 candidate implementations:
      - Of which 2 did not match due to:
      Overload in function 'register_jitable.<locals>.wrap.<locals>.ov_wrap': File: numba/core/extending.py: Line 159.
        With argument(s): '(int64, int64, bool)':
       Rejected as the implementation raised a specific error:
         NumbaRuntimeError: Failed in nopython mode pipeline (step: native lowering)
       NRT required but not enabled
       During: lowering "s = call $10load_global.3(kind, char_width, length, is_ascii, func=$10load_global.3, args=[Var(kind, unicode.py:276), Var(char_width, unicode.py:276), Var(length, unicode.py:276), Var(is_ascii, unicode.py:276)], kws=(), vararg=None, varkwarg=None, target=None)" at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (277)
      raised from /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/core/runtime/context.py:19

    During: resolving callee type: Function(<function _empty_string at 0x7f0456884d30>)
    During: typing of call at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (1700)

    File "../../anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py", line 1700:
    def getitem_slice(s, idx):
        <source elided>
            # It's heterogeneous in kind OR stride != 1
            ret = _empty_string(kind, span, is_ascii)
            ^

    During: typing of intrinsic-call at /mnt/local-raid10/workspace/user/trim/trim_new_implementation/string_numba.py (143)

    File "string_numba.py", line 143:
    def cuda_clipper(arr1,arr2):
        <source elided>
        i = cuda.grid(1)
        arr1[i] = arr1[i][arr2[i]:]
        ^
I am under the impression that slicing the string is the problem, since a similar implementation works fine with an array. I have tried converting arr1 into an array of arrays, but that preprocessing itself takes long enough to cancel out any performance gain from CUDA. How can I work with str directly within Numba, rather than working around the problem?
Answer 1
Score: 4
> We have two arrays arr1 (which has string elements) and arr2 (which has integers)
You don't have arrays; you have lists. As you can see from the documentation, Python strings and lists are not supported on the GPU.
Therefore, what you are trying to do is currently not supported in Numba CUDA.
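As an illustration of what an array-based representation could look like (this is not part of the answer above, only a minimal sketch under stated assumptions), the strings can be packed into a single uint8 NumPy buffer with an offsets array. Clipping the first k characters then reduces to integer arithmetic on the start offsets, which is the kind of elementwise array operation a Numba CUDA kernel can work with, unlike str slicing. The helper names pack_strings, clip_starts, and unpack_strings are hypothetical, the data is assumed to be ASCII, and the clip counts are assumed to be non-negative. The example runs on the host with NumPy only; no kernel is shown, and packing still costs host time, as the question notes.

```python
import numpy as np

def pack_strings(strings):
    """Pack ASCII strings into one uint8 buffer plus start offsets (hypothetical helper)."""
    encoded = [s.encode("ascii")for s in strings]
    lengths = np.array([len(b) for b in encoded], dtype=np.int64)
    # offsets[i] is where string i starts; offsets[-1] is the total byte count.
    offsets = np.zeros(len(encoded) + 1, dtype=np.int64)
    np.cumsum(lengths, out=offsets[1:])
    buffer = np.frombuffer(b"".join(encoded), dtype=np.uint8)
    return buffer, offsets

def clip_starts(offsets, k):
    """Return new start positions after dropping the first k[i] characters."""
    starts, ends = offsets[:-1], offsets[1:]
    # Clamp to the end so that clipping past the string yields an empty string.
    return np.minimum(starts + np.asarray(k, dtype=np.int64), ends)

def unpack_strings(buffer, starts, ends):
    """Decode each [start, end) byte range back into a Python string."""
    return [buffer[s:e].tobytes().decode("ascii") for s, e in zip(starts, ends)]

arr1 = ['abc', 'def', 'xyz']
arr2 = [1, 2, 3]

buffer, offsets = pack_strings(arr1)
new_starts = clip_starts(offsets, arr2)
clipped = unpack_strings(buffer, new_starts, offsets[1:])
print(clipped)  # ['bc', 'f', '']
```

Because the character data itself never moves, only the start offsets change. If the data can be kept in this packed form from the start (rather than as a Python list of str), the per-call conversion cost mentioned in the question would not be paid each time.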