February 6, 2023, 18:58:44

Running a pre-trained ONNX model - image recognition

Question


I am trying to run a pre-trained ONNX model for image recognition. The model was trained in a third-party labeling tool, using labels predefined in that tool. My goal now is to run the model outside the tool: I take a sample image and pass it through the model to get the identified labels as output. I am stuck on how to prepare the inputs. The model needs inputs as follows:

How can I adjust my inputs in the following code?

import cv2
import numpy as np
import onnxruntime
import pytesseract
import PyPDF2
# Load the image
image = cv2.imread("example.jpg")
# Check if the image has been loaded successfully
if image is None:
    raise ValueError("Failed to load the image")
# Get the shape of the image
height, width = image.shape[:2]
# Make sure the height and width are positive
if height <= 0 or width <= 0:
    raise ValueError("Invalid image size")
# Set the desired size of the resized image
dsize = (640, 640)
# Resize the image using cv2.resize
resized_image = cv2.resize(image, dsize)
# Display the resized image
cv2.imshow("Resized Image", resized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Load the ONNX model
session = onnxruntime.InferenceSession("ic/model.onnx")
# Check if the model has been loaded successfully
if session is None:
    raise ValueError("Failed to load the model")
# Get the input names and shapes of the model
inputs = session.get_inputs()
for i, input_info in enumerate(inputs):
    print(f"Input {i}: name = {input_info.name}, shape = {input_info.shape}")
# Run the ONNX model
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
prediction = session.run([output_name], {input_name: image})[0]
# Postprocess the prediction to obtain the labels
labels = postprocess(prediction)
# Use PyTesseract to extract the text from the image
text = pytesseract.image_to_string(image)
# Print the labels and the text
print("Labels:", labels)
print("Text:", text)

The code throws the following error:
ValueError: Model requires 4 inputs. Input Feed contains 1

Answer 1

Score: 1

In your case, you need to add a batch dimension to the input. According to your report, your image has the shape ('sequence', 640, 640), but the trained model's input is ('batch', 'sequence', 224, 224).
To fix this, add the batch dimension and transpose the tensor, for example:

img_batch = np.expand_dims(img_normalized, axis=0)
img_transposed = np.transpose(img_batch, (0, 3, 1, 2))

Where:

np.expand_dims: adds the 'batch' dimension to your input image.
np.transpose: reorders the axes into the layout the model expects. An image loaded with OpenCV and resized to 640 x 640 has the shape (640, 640, 3), so after adding 'batch' it is (1, 640, 640, 3); the transpose changes that to (1, 3, 640, 640), matching the layout of the trained model's input.
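A minimal sketch of the two calls above, to make the shape changes concrete. It assumes img_normalized means the image cast to float32 and divided by 255 (the tool's actual normalization is not stated), and it uses a synthetic array in place of a real image so that it runs without OpenCV. The helper name to_model_input is hypothetical.

```python
import numpy as np


def to_model_input(image_hwc: np.ndarray) -> np.ndarray:
    """Convert an (H, W, C) uint8 image into a (1, C, H, W) float32 tensor."""
    # Assumption: the model expects pixel values scaled to [0, 1].
    img_normalized = image_hwc.astype(np.float32) / 255.0
    # (H, W, C) -> (1, H, W, C): add the batch dimension in front.
    img_batch = np.expand_dims(img_normalized, axis=0)
    # (1, H, W, C) -> (1, C, H, W): move channels ahead of height and width.
    img_transposed = np.transpose(img_batch, (0, 3, 1, 2))
    # np.transpose returns a view; make the memory contiguous before inference.
    return np.ascontiguousarray(img_transposed)


# Synthetic stand-in for an image that was already resized to 224 x 224,
# for example with cv2.resize(image, (224, 224)).
dummy = np.zeros((224, 224, 3), dtype=np.uint8)
tensor = to_model_input(dummy)
print(tensor.shape, tensor.dtype)  # (1, 3, 224, 224) float32
```

If the model turns out to expect channels last, or a different value range, only the transpose order or the scaling line would need to change.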

Remember to resize your image to (224, 224) instead of (640, 640).
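The error in the question also says the model declares four inputs while the feed holds only one, so the session will probably need one array per declared input. The following is only a sketch of how such a feed could be assembled and checked: build_feed is a hypothetical helper, the input names are made up, and a namedtuple stands in for the objects returned by session.get_inputs() so that the example runs without a model file. What the other three inputs should contain depends on the labeling tool and is not known here.

```python
from collections import namedtuple


def build_feed(input_infos, arrays_by_name):
    """Return a feed dict covering every declared model input, or raise."""
    names = [info.name for info in input_infos]
    missing = [name for name in names if name not in arrays_by_name]
    if missing:
        raise ValueError(f"No array supplied for model inputs: {missing}")
    return {name: arrays_by_name[name] for name in names}


# Stand-in for session.get_inputs(); the names and shapes are hypothetical.
FakeInput = namedtuple("FakeInput", ["name", "shape"])
declared = [FakeInput("input_a", [1]), FakeInput("input_b", [1])]

feed = build_feed(declared, {"input_a": [0.0], "input_b": [1.0]})
print(sorted(feed))  # ['input_a', 'input_b']

# With a real model (requires onnxruntime), the same helper could be used as:
#   session = onnxruntime.InferenceSession("ic/model.onnx")
#   feed = build_feed(session.get_inputs(), my_arrays)  # my_arrays: name -> ndarray
#   outputs = session.run(None, feed)  # None requests all outputs
```

Checking the names up front gives a message that lists exactly which inputs are still missing, which is easier to act on than the count reported by the runtime.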

Try again; I hope this helps.
