String Manipulation in Numba Cuda: Clip first k characters from a string array, k comes from another array

Question

We have two arrays: arr1, which holds strings, and arr2, which holds integers.
I want to clip the first arr2[i] characters from arr1[i]. These arrays are very large, so I want to implement this in Numba CUDA. A pure-Python implementation is as follows:

arr1 = ['abc', 'def', 'xyz']
arr2 = [1,2,3]

def python_clipper(arr1,arr2):
    for i in range(len(arr1)):
        arr1[i] = arr1[i][arr2[i]:]
    return arr1

print(python_clipper(arr1,arr2)) # ['bc', 'f', '']

The above implementation works fine. But when I turn this Python function into a CUDA kernel like so:

from numba import cuda

@cuda.jit()
def cuda_clipper(arr1, arr2):
    i = cuda.grid(1)
    arr1[i] = arr1[i][arr2[i]:]  # string slicing: the line the error below points at

blockspergrid, threadsperblock = len(arr1), 1
cuda_clipper[blockspergrid, threadsperblock](arr1, arr2)  # expected result: ['bc', 'f', '']
print(arr1)

I get the following error:

numba.core.errors.TypingError: Failed in cuda mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function _empty_string at 0x7f0456884d30>) found for signature:
 
 >>> _empty_string(int64, int64, bool)
 
There are 2 candidate implementations:
      - Of which 2 did not match due to:
      Overload in function 'register_jitable.<locals>.wrap.<locals>.ov_wrap': File: numba/core/extending.py: Line 159.
        With argument(s): '(int64, int64, bool)':
       Rejected as the implementation raised a specific error:
         NumbaRuntimeError: Failed in nopython mode pipeline (step: native lowering)
       NRT required but not enabled
       During: lowering "s = call $10load_global.3(kind, char_width, length, is_ascii, func=$10load_global.3, args=[Var(kind, unicode.py:276), Var(char_width, unicode.py:276), Var(length, unicode.py:276), Var(is_ascii, unicode.py:276)], kws=(), vararg=None, varkwarg=None, target=None)" at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (277)
  raised from /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/core/runtime/context.py:19

During: resolving callee type: Function(<function _empty_string at 0x7f0456884d30>)
During: typing of call at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (1700)


File "../../anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py", line 1700:
            def getitem_slice(s, idx):
                <source elided>
                    # It's heterogeneous in kind OR stride != 1
                    ret = _empty_string(kind, span, is_ascii)
                    ^

During: typing of intrinsic-call at /mnt/local-raid10/workspace/user/trim/trim_new_implementation/string_numba.py (143)

File "string_numba.py", line 143:
def cuda_clipper(arr1,arr2):
    <source elided>
    i = cuda.grid(1)
    arr1[i] = arr1[i][arr2[i]:]
    ^

My impression is that slicing the string is the problem, since a similar implementation works fine with an array. I have tried converting arr1 into an array of arrays, but that preprocessing itself takes enough time that CUDA no longer improves performance. How can I work with str directly in Numba, rather than working around the problem?

Answer 1

Score: 4

> We have two arrays arr1 (which has string elements) and arr2 (which has integers)

You don't have arrays; you have lists. As the Numba documentation shows, Python strings and lists are not supported on the GPU.

Therefore, what you are trying to do is currently not supported in Numba CUDA.
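
For illustration only, here is a minimal sketch of the kind of numeric-array layout the answer is pointing toward. It is not part of the original answer and not a Numba feature. It assumes ASCII-only strings, and the helper names encode, clip, and decode are hypothetical. The strings are packed into one uint8 buffer with per-string start offsets and lengths, so that clipping becomes plain integer arithmetic on arrays. The sketch runs on the CPU with NumPy, and the preprocessing cost mentioned in the question still applies to the encode step.

```python
import numpy as np

def encode(strings):
    """Pack ASCII strings into one uint8 buffer plus start offsets and lengths."""
    encoded = [s.encode("ascii") for s in strings]  # assumption: ASCII-only input
    lengths = np.array([len(b) for b in encoded], dtype=np.int64)
    starts = np.zeros(len(encoded), dtype=np.int64)
    if len(encoded) > 1:
        starts[1:] = np.cumsum(lengths)[:-1]  # each string starts where the previous one ended
    chars = np.frombuffer(b"".join(encoded), dtype=np.uint8)
    return chars, starts, lengths

def clip(starts, lengths, k):
    """Drop the first k[i] characters of string i by moving its start offset."""
    # Clamp k to [0, length]; unlike Python slicing, negative k is treated as 0 here.
    k = np.clip(k, 0, lengths)
    return starts + k, lengths - k

def decode(chars, starts, lengths):
    """Rebuild a list of Python strings from the packed representation."""
    data = chars.tobytes()
    return [data[s:s + n].decode("ascii") for s, n in zip(starts, lengths)]

arr1 = ['abc', 'def', 'xyz']
arr2 = np.array([1, 2, 3], dtype=np.int64)

chars, starts, lengths = encode(arr1)
new_starts, new_lengths = clip(starts, lengths, arr2)
clipped = decode(chars, new_starts, new_lengths)
print(clipped)  # ['bc', 'f', '']
```

In this layout the character buffer is never copied or modified; only the two integer arrays change. Elementwise integer work of that kind is what a kernel operating on numeric arrays could do, whereas the str slicing in the question is not.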

huangapple
  • Posted on April 10, 2023 at 23:40:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75978517.html