Rotate an image by 90 degrees using Eigen from OpenCV Matrix in C++
Question
How can I rotate an image by 90 degrees using Eigen, starting from an OpenCV matrix, and then convert the rotated image back to an OpenCV matrix in C++? OpenCV's rotate function takes time, and I want the rotation to be as fast as possible. I tried NumPy's rot90 function in Python, and it is extremely fast compared to OpenCV's rotate function in C++. Unfortunately, NumPy is not available for C++. I have read that C++ libraries such as Eigen and Armadillo can do these matrix operations quickly, which is why I want to rotate the image using Eigen and check the timing.
I tested the functions with Visual Studio 2019 on an i5 machine running Windows 10. NumPy's rot90 function in Python is roughly 10 times faster than OpenCV's rotate function in C++.
Answer 1
Score: 1
My guess is that the warpAffine function is faster. At the very least, you should compare the two to check.
There is an example here:
https://docs.opencv.org/master/dd/d52/tutorial_js_geometric_transformations.html
The same kind of functions are available with CUDA:
https://docs.opencv.org/master/db/d29/group__cudawarping.html
EDIT:
warpAffine in OpenCV can actually use the ippiWarpAffine* functions from the Intel Performance Primitives (IPP) library, which is probably the best performance you can get on the CPU. The CUDA version is expected to be faster if you can run your software on a platform with an NVIDIA GPU. Performance also depends on the data type you use: if you can work with 8-bit unsigned images, it can be much faster.
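One plausible reason the data type matters (an assumption on my part, not something measured in this answer): a 90-degree rotation does no arithmetic and mostly moves bytes, so the memory footprint of the image is a rough proxy for the cost. The sketch below only compares the footprint of the same image stored as uint8 and as float32; the 1000x1000x3 shape is an illustrative choice.

```python
import numpy as np

# Same image shape stored with two different element types.
shape = (1000, 1000, 3)
img_u8 = np.zeros(shape, dtype=np.uint8)
img_f32 = np.zeros(shape, dtype=np.float32)

# Bytes that a rotation has to read once and write once.
print("uint8 bytes  :", img_u8.nbytes)
print("float32 bytes:", img_f32.nbytes)
print("ratio        :", img_f32.nbytes // img_u8.nbytes)
```

Whether the measured speedup tracks this ratio depends on the function and the hardware, so it is worth timing both types on the target machine.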
EDIT 2:
After a comment saying that warpAffine is slower, I ran a few tests, and it can sometimes be faster. However, nothing comes close to NumPy's rot90; even cv2.flip and cv2.transpose are far slower. I would therefore recommend looking into the recommendation on Intel's developer zone, which is to use the ippiRotate and ippiMirror functions to perform 90-degree rotations. If you really want the best performance out of an Intel CPU, that would be my guess. Also pay attention to multithreading: some IPP functions can be multithreaded. In the end, the answer depends on whether you need to rotate a single large image or many images, on the data type, and on the number of channels. With IPP, at least, you can pick the best function for your data type.
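The comparison with cv2.flip and cv2.transpose makes sense because a 90-degree rotation can be written as a transpose combined with a flip. Here is a minimal NumPy sketch of that identity; the helper names rot90_ccw and rot90_cw are mine, for illustration only.

```python
import numpy as np

def rot90_ccw(img):
    """Rotate counterclockwise: swap rows and columns, then flip vertically."""
    axes = (1, 0) + tuple(range(2, img.ndim))  # keep any channel axis in place
    return np.transpose(img, axes)[::-1]

def rot90_cw(img):
    """Rotate clockwise: swap rows and columns, then flip horizontally."""
    axes = (1, 0) + tuple(range(2, img.ndim))
    return np.transpose(img, axes)[:, ::-1]

# Small non-square 3-channel image so that the shape change is visible.
img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

print(rot90_ccw(img).shape)
print(np.array_equal(rot90_ccw(img), np.rot90(img)))
print(np.array_equal(rot90_cw(img), np.rot90(img, k=-1)))
```

The same decomposition is what a transpose followed by a flip would do in OpenCV or any other library; only the cost of each step differs.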
Below are a few trials in Python compared against NumPy's rot90 function. The results can of course change with the parameters, but there is still a large gap to NumPy. It is also not obvious from my trials that cv2.rotate is any faster than the alternatives. Timings are in seconds for 100 calls:
100x np.rot90 time : 0.001626729965209961
100x cv2.rotate time : 0.21501994132995605
100x cv2.transpose time : 0.18512678146362305
100x cv2.remap time : 0.6473801136016846
100x cv2.warpAffine time : 0.11946868896484375
import cv2
import numpy as np
import time

# Random 1000x1000 3-channel 8-bit test image
img = np.random.randint(0, 255, (1000, 1000, 3)).astype(np.uint8)

##################################
# np.rot90 returns a view of img (no pixel data is copied)
start = time.time()
for i in range(100):
    rotated = np.rot90(img)
end = time.time()
print("100x np.rot90 time :", end - start)

##################################
start = time.time()
for i in range(100):
    rotated = cv2.rotate(img, cv2.ROTATE_90_COUNTERCLOCKWISE)
end = time.time()
print("100x cv2.rotate time :", end - start)

##################################
# Transpose only; a full 90-degree rotation would also need a flip
start = time.time()
for i in range(100):
    rotated = cv2.transpose(img)
end = time.time()
print("100x cv2.transpose time :", end - start)

##################################
# Coordinate maps with x and y swapped, so remap performs a transpose
mapx, mapy = np.meshgrid(np.arange(0, img.shape[1]), np.arange(0, img.shape[0]))
mapx = mapx.transpose()
mapy = mapy.transpose()
start = time.time()
for i in range(100):
    # The float32 conversion of the maps is included in the timing
    rotated = cv2.remap(img, mapx.astype(np.float32), mapy.astype(np.float32), cv2.INTER_NEAREST)
end = time.time()
print("100x cv2.remap time :", end - start)

##################################
rows = img.shape[0]
cols = img.shape[1]
M = cv2.getRotationMatrix2D((rows / 2, cols / 2), 90, 1)
# Overwrite the translation: source pixel (x, y) maps to (y, cols - 1 - x)
M[0, 2] = 0
M[1, 2] = cols - 1
start = time.time()
for i in range(100):
    rotated = cv2.warpAffine(img, M, (rows, cols), flags=cv2.INTER_NEAREST)
end = time.time()
print("100x cv2.warpAffine time :", end - start)
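One caveat when reading these numbers: np.rot90 returns a view on the original buffer with rearranged strides, so the NumPy loop above moves no pixel data, whereas the cv2 functions write a new image on every call. The sketch below checks this and times a variant that forces a contiguous copy with np.ascontiguousarray, which is closer to what the OpenCV calls do. No timings are quoted here because they depend on the machine.

```python
import timeit
import numpy as np

img = np.random.randint(0, 256, (1000, 1000, 3), dtype=np.uint8)

view = np.rot90(img)               # strided view, no pixels copied
copy = np.ascontiguousarray(view)  # materializes the rotated image

print("shares memory with img:", np.shares_memory(img, view))
print("view is C-contiguous  :", view.flags["C_CONTIGUOUS"])
print("copy is C-contiguous  :", copy.flags["C_CONTIGUOUS"])

# Time 100 calls of each variant (seconds).
t_view = timeit.timeit(lambda: np.rot90(img), number=100)
t_copy = timeit.timeit(lambda: np.ascontiguousarray(np.rot90(img)), number=100)
print("100x np.rot90 (view only)  :", t_view)
print("100x np.rot90 + forced copy:", t_copy)
```

In a real pipeline the copy is often paid later anyway, whenever a consumer needs contiguous memory, so the second number is probably the fairer one to compare against cv2.rotate.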
I hope this helps!