2023-04-10 23:40:46

String Manipulation in Numba Cuda: Clip first k characters from a string array, k comes from another array

Question

We have two arrays: arr1 (string elements) and arr2 (integers).
I want to clip the first arr2[i] characters from arr1[i]. The arrays are very large, so I want to implement this in Numba CUDA. A pure-Python implementation is as follows:

arr1 = ['abc', 'def', 'xyz']
arr2 = [1, 2, 3]
def python_clipper(arr1, arr2):
    for i in range(len(arr1)):
        arr1[i] = arr1[i][arr2[i]:]
    return arr1
print(python_clipper(arr1, arr2))  # ['bc', 'f', '']

The above implementation works fine. But when I turn this Python function into a CUDA kernel like so:

from numba import cuda

@cuda.jit()
def cuda_clipper(arr1, arr2):
    i = cuda.grid(1)
    arr1[i] = arr1[i][arr2[i]:]
blockspergrid, threadsperblock = len(arr1), 1
cuda_clipper[blockspergrid, threadsperblock](arr1, arr2)  # expected: ['bc', 'f', '']
print(arr1)

I get the following error:

numba.core.errors.TypingError: Failed in cuda mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function _empty_string at 0x7f0456884d30>) found for signature:
 
 >>> _empty_string(int64, int64, bool)
 
There are 2 candidate implementations:
      - Of which 2 did not match due to:
      Overload in function 'register_jitable.<locals>.wrap.<locals>.ov_wrap': File: numba/core/extending.py: Line 159.
        With argument(s): '(int64, int64, bool)':
       Rejected as the implementation raised a specific error:
         NumbaRuntimeError: Failed in nopython mode pipeline (step: native lowering)
       NRT required but not enabled
       During: lowering "s = call $10load_global.3(kind, char_width, length, is_ascii, func=$10load_global.3, args=[Var(kind, unicode.py:276), Var(char_width, unicode.py:276), Var(length, unicode.py:276), Var(is_ascii, unicode.py:276)], kws=(), vararg=None, varkwarg=None, target=None)" at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (277)
  raised from /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/core/runtime/context.py:19
During: resolving callee type: Function(<function _empty_string at 0x7f0456884d30>)
During: typing of call at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (1700)
File "../../anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py", line 1700:
            def getitem_slice(s, idx):
                <source elided>
                    # It's heterogeneous in kind OR stride != 1
                    ret = _empty_string(kind, span, is_ascii)
                    ^
During: typing of intrinsic-call at /mnt/local-raid10/workspace/user/trim/trim_new_implementation/string_numba.py (143)
File "string_numba.py", line 143:
def cuda_clipper(arr1, arr2):
    <source elided>
    i = cuda.grid(1)
    arr1[i] = arr1[i][arr2[i]:]
    ^

I am under the impression that slicing the string is the problem, since a similar implementation works fine with an array. I have tried converting arr1 into an array of arrays, but that preprocessing itself takes enough time to cancel out any performance gain from CUDA. How can I work with str directly within Numba, rather than working around the problem?

Answer 1

Score: 4

> We have two arrays arr1 (which has string elements) and arr2 (which has integers)

You don't have arrays. You have lists. As you can see from the documentation, there is no Python string or list support on the GPU.

Therefore what you are trying to do is currently not supported in Numba CUDA.
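
For illustration only, here is a minimal sketch of one data layout that avoids Python strings and lists: all characters go into a single flat uint8 buffer, with one integer array of start offsets and one of lengths. With that layout, clipping the first k characters becomes integer arithmetic on arrays, and integer arrays are the kind of data a Numba CUDA kernel can accept. The helper names (pack_strings, clip_prefix, unpack_strings) are hypothetical, the code assumes ASCII-only input, and it runs on the CPU with NumPy only. Whether moving the arithmetic to the GPU would pay off after the packing cost has not been measured here.

```python
import numpy as np


def pack_strings(strings):
    """Pack ASCII strings into one flat uint8 buffer plus start offsets and lengths."""
    encoded = [s.encode("ascii") for s in strings]  # assumption: ASCII-only input
    lengths = np.array([len(b) for b in encoded], dtype=np.int64)
    starts = np.zeros(len(encoded), dtype=np.int64)
    if len(encoded) > 1:
        # Each string starts where the previous ones end.
        starts[1:] = np.cumsum(lengths[:-1])
    buffer = np.frombuffer(b"".join(encoded), dtype=np.uint8)
    return buffer, starts, lengths


def clip_prefix(starts, lengths, k):
    """Drop the first k[i] characters of string i by moving its start forward."""
    k = np.minimum(np.asarray(k, dtype=np.int64), lengths)  # never clip past the end
    return starts + k, lengths - k


def unpack_strings(buffer, starts, lengths):
    """Rebuild Python strings; only needed when results go back to Python code."""
    return [buffer[s:s + n].tobytes().decode("ascii") for s, n in zip(starts, lengths)]


arr1 = ["abc", "def", "xyz"]
arr2 = [1, 2, 3]

buffer, starts, lengths = pack_strings(arr1)
new_starts, new_lengths = clip_prefix(starts, lengths, arr2)
clipped = unpack_strings(buffer, new_starts, new_lengths)
print(clipped)  # ['bc', 'f', '']
```

Note that the character buffer is never copied or modified by the clip itself; only the two integer arrays change.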
