2023年5月28日 16:48:43go评论102阅读模式

英文:

NumPy in Cython: compile-time type vs original type

问题

在Cython中，存在与NumPy相对应的编译时类型。似乎编译时类型比原始类型更快。如果我们将它们与C类型结合使用，例如，有三个关键字可用于定义整数类型：int、np.int、np.int_t。

在官方教程中，这三种类型都被使用了。这让我感到困惑。以下是我的问题：

使用唯一的数据类型来实现更好的性能是否正确？还有，我应该选择哪种类型？
如果使用唯一的数据类型不正确，那么在程序的不同部分如何确定使用哪种类型？

英文:

In Cython, there are corresponding compile-time types to NumPy. It seems that the compile-time type are faster than the original type. If we combine them with the C type, for example, there are three keywords that can be used to define a integer type: int, np.int, np.int_t.

In the official tutorial, all these three types are used. This makes me feel confused. Here are my questions:

Is it correct to use the sole data type to achieve a better performance? And, which type should I choose? If using the sole data type is not correct, then how should I determine which type to use in different parts of my program?

答案1

得分: 1

看起来这个教程已经非常过时了。NumPy没有int_t，而int已经被弃用（应该使用np.int，而不是int）。

所以我猜这个回答了使用哪种整数选项的问题。只有int，默认为np.int64（至少在我的机器上是这样，也可能是np.int32）。

不过，请注意，对于int，还有不同的大小选项，比如int32和int64，如果你知道你要存储的整数的大小，可以在它们之间进行选择。

英文:

It seems that this tutorial is very outdated. NumPy doesn't have int_t, and int is deprecated (np.int, not int).

So I guess this answers the question on which of the int options to use. There's only int, which defaults to np.int64 (at least on my machine, could maybe also be np.int32).

However, note that there are different size options for int, like int32 and int64, which you may choose between if you know the size of the integers you wish to store.

答案2

得分: -7

其他回答部分不幸部分正确，而Cython文档看起来不幸部分已过时。

因此，有两种情况下您可以在Cython中使用Numpy类型：

它们是可以作为dtype=参数传递给Numpy函数的Python对象。这些指示Numpy创建什么类型的数组。从Cython的角度来看，它们与任何其他Python对象相同。然而，Numpy将它们视为特殊指示符。np.int是其中的一个示例（但现在已被弃用，建议使用普通的Python int）。特定大小的整数数据类型，如np.int32仍然可用。
```
arr = np.zeros((5, 10), dtype=np.int)
```
这些与Cython无关。它们与您在普通Python代码中使用的方式相同。
第二种用法是作为C整数类型。np.int_t 确实存在（这是其他答案错误的地方）。然而，它是一个仅在包装Cython的Numpy内部的.pxd文件中暴露出来的C typedef。您从Numpy cimport 这些类型，而不是从Numpy import。

您可以在需要C类型的任何地方使用这些类型（例如，cdef int_t some_var或cdef int_t[:] some_memoryview）。它们基本上与dtypes具有相同的名称，但以_t结尾。

作为如何结合这两者的示例，您可以创建一个2D内存视图，并使用以下行来为其查看分配一个数组：

cdef np.int32_t[:,:] mview = np.zeros((5, 10), dtype=np.int32)

普通的int类型在Cython中有两种含义。它可以用作普通的Python整数对象（例如，您可以将其作为dtype参数传递）。但在其他情况下，Python会将其解释为C整数。因此，您可以执行以下操作：

cdef int[:,:] mview = np.zeros((5, 10), dtype=int)

这也可以工作。第一种用法将其用作C类型。第二种用法将其用作普通的Python对象。

这有点令人困惑，因为Cython同时涉及Python（在其中类型只是像任何其他Python对象一样的Python对象）和C（在其中类型用于声明变量，但不是以自己的方式传递的对象）；因此，不总是清楚哪些部分类似于C，哪些部分类似于Python。

英文:

The other answer is unfortunately partly right, and the Cython documentation looks like it's unfortunately partly outdated.

So there's essentially two contexts where you use Numpy types with Cython:

The are Python objects that can be passed to Numpy functions as a dtype= argument. These indicate to Numpy what type of array to create. From Cython's point of view they're the same as any other Python object. However, Numpy treats them as special indicators. np.int was an example of these (but has now been removed in favour of just using the normal Python int). Specificity sized integer dtypes like np.int32 are still available though.
```
arr = np.zeros((5,10), dtype=np.int)
```
These are not Cython-specific. They're the same as you'd use in normal Python code.
The second use is as C integer types. np.int_t does exist (this is where the other answer is wrong). However, it's a C typedef that's only exposed in the .pxd file that wraps the Numpy internals for Cython. They're what you cimport from Numpy, rather than what you import from Numpy.

You use these types anywhere that a C type would be expected (e.g. cdef int_t some_var or cdef int_t[:] some_memoryview). They largely have the same name as the dtypes but with _t on the end.

As an example of how you'd combine the two, you can create a 2D memoryview, and allocate an array for it to view with the line

cdef np.int32_t[:,:] mview = np.zeros((5, 10), dtype=np.int32)

The plain int type has two meanings in Cython. It can be used as the normal Python integer object (e.g. you can pass it as a dtype argument). However, in other contexts Python interprets it as a C integer. Therefore you could do

cdef int[:,:] mview = np.zeros((5, 10), dtype=int)

and this would also work. The first use it's used as a C type. The second as a normal Python object.

It's slightly confusing because Cython straddles Python (where types are just Python objects like any other Python object) and C (where a type is used to declare a variable, but is not an object to be passed around in its own right) and it isn't always clear which bits are C-like and which bits are Python-like.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

NumPy在Cython中：编译时类型与原始类型

问题

答案1

答案2

如何创建新列并按列组分配值

ImportError: 无法从’keras.models’导入’name’ ‘Input’

一个可迭代的 NetworkX 图是什么？

阿拉伯文本正在报告实验室中反转行。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。