NumPy in Cython: compile-time type vs original type
Question
In Cython, there are compile-time types that correspond to NumPy types, and the compile-time types seem to be faster than the original types. Combined with the C types, there are, for example, three keywords that can be used to define an integer type: `int`, `np.int`, `np.int_t`.
In the official tutorial, all three of these types are used, which I find confusing. Here are my questions:
- Is it correct to use a single data type throughout to get better performance? If so, which type should I choose?
- If using a single data type is not correct, how should I decide which type to use in different parts of my program?
Answer 1
Score: 1
It seems that this tutorial is very outdated. NumPy doesn't have `int_t`, and `np.int` is deprecated (that is, NumPy's `np.int` alias, not the built-in `int`).
So I guess that answers the question of which of the int options to use: there's only `int`, which defaults to `np.int64` (at least on my machine; it could also be `np.int32`).
However, note that there are different size options for integers, like `int32` and `int64`, which you can choose between if you know the size of the integers you wish to store.
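As a rough illustration of the size options mentioned above, the following plain-Python sketch (not taken from the tutorial) compares the fixed-size dtypes with whatever NumPy picks by default for `int`. The variable names are only for illustration, and the default width is platform-dependent, so no particular value is assumed for it.

```python
import numpy as np

# Fixed-size integer dtypes have the same width on every platform.
small = np.zeros(4, dtype=np.int32)
large = np.zeros(4, dtype=np.int64)
print(small.itemsize, large.itemsize)  # bytes per element: 4 8

# The built-in int maps to NumPy's default integer, whose width
# depends on the platform and NumPy version (commonly 4 or 8 bytes).
default = np.zeros(4, dtype=int)
print(default.dtype, default.itemsize)
```

If the range of the stored values is known, picking `np.int32` or `np.int64` explicitly avoids relying on that platform default.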
Answer 2
Score: -7
The other answer is unfortunately only partly right, and the Cython documentation looks like it's unfortunately partly outdated.
So there are essentially two contexts where you use NumPy types with Cython:
- They are Python objects that can be passed to NumPy functions as a `dtype=` argument. These indicate to NumPy what type of array to create. From Cython's point of view they're the same as any other Python object; NumPy, however, treats them as special indicators. `np.int` was an example of these (but has now been removed in favour of just using the normal Python `int`). Specifically sized integer dtypes like `np.int32` are still available though:
arr = np.zeros((5, 10), dtype=np.int32)
These are not Cython-specific. They're the same as you'd use in normal Python code.
- The second use is as C integer types. `np.int_t` does exist (this is where the other answer is wrong). However, it's a C typedef that's only exposed in the .pxd file that wraps the NumPy internals for Cython. These are what you `cimport` from NumPy, rather than what you `import` from NumPy. You use these types anywhere that a C type would be expected (e.g. `cdef int_t some_var` or `cdef int_t[:] some_memoryview`). They largely have the same names as the dtypes but with `_t` on the end.
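To illustrate the first context, here is a small plain-Python sketch showing that dtypes behave like any other Python object. The helper `make_grid` and the variable names are hypothetical, introduced only for this example; the C-level `_t` typedefs are not shown because they only exist in compiled Cython code.

```python
import numpy as np

def make_grid(rows, cols, dtype):
    """Hypothetical helper: forwards a dtype object to NumPy."""
    return np.zeros((rows, cols), dtype=dtype)

# dtypes are ordinary Python objects, so they can be stored and passed around.
chosen = np.int32
grid = make_grid(5, 10, chosen)
print(grid.dtype == np.int32)  # True

# The built-in int is accepted too and resolves to NumPy's default integer.
print(np.dtype(int) == np.dtype(np.int_))  # True
```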
As an example of how you'd combine the two, you can create a 2D memoryview, and allocate an array for it to view with the line
cdef np.int32_t[:,:] mview = np.zeros((5, 10), dtype=np.int32)
The plain `int` type has two meanings in Cython. It can be used as the normal Python integer object (e.g. you can pass it as a `dtype` argument). However, in other contexts Cython interprets it as a C integer. Therefore you could do
cdef int[:,:] mview = np.zeros((5, 10), dtype=int)
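and this would also work, provided NumPy's default integer dtype has the same size as a C `int` on your platform (otherwise `np.intc` is the dtype that matches a C `int`). In the first use, `int` is a C type; in the second, it is a normal Python object.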
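Whether an array created with `dtype=int` has the element size of a C `int` depends on the platform. One way to check from plain Python, sketched here using the standard `ctypes` module (an addition for illustration, not part of the original answer), is to compare item sizes and fall back to `np.intc`, NumPy's dtype for a C `int`:

```python
import ctypes
import numpy as np

# np.intc is NumPy's dtype for a C int.
c_int_size = ctypes.sizeof(ctypes.c_int)
print(np.dtype(np.intc).itemsize == c_int_size)  # True

# dtype=int gives NumPy's default integer, which may be wider than a C int.
default_size = np.dtype(int).itemsize
print("default int bytes:", default_size, "C int bytes:", c_int_size)

# An array whose element type is intended to match a `cdef int[:, :]` memoryview.
arr = np.zeros((5, 10), dtype=np.intc)
```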
It's slightly confusing because Cython straddles Python (where types are just Python objects like any other Python object) and C (where a type is used to declare a variable, but is not an object to be passed around in its own right) and it isn't always clear which bits are C-like and which bits are Python-like.