Running a pre-trained ONNX model - image recognition

Question

I am trying to run a pre-trained ONNX model for image recognition. The model was trained in a third-party labeling tool, using labels predefined in that tool. The next goal is to run this model outside the tool. To do that, I take a sample image and pass it through the model to get the identified labels as output. While doing so, I got stuck on how to prepare the inputs. The model needs inputs as follows:
(Screenshot of the model's expected inputs; the image is not preserved here.)

How can I adjust my inputs in the following code?

import cv2
import numpy as np
import onnxruntime
import pytesseract
import PyPDF2

# Load the image
image = cv2.imread("example.jpg")

# Check if the image has been loaded successfully
if image is None:
    raise ValueError("Failed to load the image")
    
# Get the shape of the image
height, width = image.shape[:2]

# Make sure the height and width are positive
if height <= 0 or width <= 0:
    raise ValueError("Invalid image size")

# Set the desired size of the resized image
dsize = (640, 640)

# Resize the image using cv2.resize
resized_image = cv2.resize(image, dsize)

# Display the resized image
cv2.imshow("Resized Image", resized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Load the ONNX model
session = onnxruntime.InferenceSession("ic/model.onnx")

# Check if the model has been loaded successfully
if session is None:
    raise ValueError("Failed to load the model")

# Get the input names and shapes of the model
inputs = session.get_inputs()
for i, input_info in enumerate(inputs):
    print(f"Input {i}: name = {input_info.name}, shape = {input_info.shape}")

# Run the ONNX model
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
prediction = session.run([output_name], {input_name: image})[0]

# Postprocess the prediction to obtain the labels
labels = postprocess(prediction)

# Use PyTesseract to extract the text from the image
text = pytesseract.image_to_string(image)

# Print the labels and the text
print("Labels:", labels)
print("Text:", text)

The code throws the following error:
ValueError: Model requires 4 inputs. Input Feed contains 1

Answer 1

Score: 1

In your case, you need to add a batch dimension to the input. According to what you report, your image has the shape ('sequence', 640, 640), but the trained model expects an input of shape ('batch', 'sequence', 224, 224).
To fix this, add the batch dimension and transpose the tensor, for example:

# img_normalized: the resized image as an array of shape (H, W, C)
img_batch = np.expand_dims(img_normalized, axis=0)      # (1, H, W, C)
img_transposed = np.transpose(img_batch, (0, 3, 1, 2))  # (1, C, H, W)

Where:

  • np.expand_dims: adds the 'batch' dimension to your input image.
  • np.transpose: reorders the axes so they match the layout of the trained model's input. After the batch dimension is added, an image loaded with OpenCV has the shape (1, 640, 640, 3), that is (batch, height, width, channels); the transpose turns it into (1, 3, 640, 640), with the channels first.
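The effect of the two calls above on the array shape can be checked without a model. This is a minimal sketch, not the tool's actual preprocessing: the helper name to_model_layout is hypothetical, the scaling of pixels to [0, 1] is an assumption (the normalization the model expects is not stated), and a zero array stands in for an image already loaded and resized with OpenCV.

```python
import numpy as np


def to_model_layout(image_hwc: np.ndarray) -> np.ndarray:
    """Convert an (H, W, C) uint8 image into a (1, C, H, W) float32 tensor."""
    # Assumption: the model wants float32 pixels scaled to [0, 1].
    img_normalized = image_hwc.astype(np.float32) / 255.0
    # (H, W, C) -> (1, H, W, C): add the batch dimension in front.
    img_batch = np.expand_dims(img_normalized, axis=0)
    # (1, H, W, C) -> (1, C, H, W): move the channels ahead of height and width.
    return np.transpose(img_batch, (0, 3, 1, 2))


# Stand-in for an image that cv2.resize has already brought to 224 x 224.
dummy_image = np.zeros((224, 224, 3), dtype=np.uint8)
tensor = to_model_layout(dummy_image)
print(tensor.shape, tensor.dtype)
```

Printing the shape before calling session.run is a cheap way to confirm that the tensor matches what session.get_inputs() reports.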

Remember to resize your image to (224, 224) instead of (640, 640).
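Rather than hard-coding the size, the target could be read from the input shape that the model reports. The sketch below assumes the shape lists height and width last, as in ('batch', 'sequence', 224, 224); resize_target is a hypothetical helper, not part of OpenCV or ONNX Runtime.

```python
def resize_target(input_shape):
    """Return (width, height) for cv2.resize from a model input shape."""
    # Assumption: height and width are the last two dimensions.
    height, width = input_shape[-2], input_shape[-1]
    # Symbolic dimensions (strings such as 'batch') or None have no pixel size.
    if not (isinstance(height, int) and isinstance(width, int)):
        raise ValueError(f"Spatial dimensions are not fixed: {input_shape}")
    # cv2.resize expects dsize as (width, height), not (height, width).
    return (width, height)


dsize = resize_target(["batch", "sequence", 224, 224])
print(dsize)
```

The result would then be passed as cv2.resize(image, dsize), with the shape taken from session.get_inputs()[0].shape.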

Try again; I hope this helps.
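Note that the error in the question counts named model inputs, not tensor dimensions: the feed dictionary passed to session.run had one entry, while the model declares four inputs. Below is a minimal sketch of building a feed with one entry per declared input, assuming the values for the other inputs are known. build_feed and the four input names are hypothetical, and a namedtuple stands in for the objects returned by session.get_inputs() so that the snippet runs without a model file.

```python
from collections import namedtuple


def build_feed(input_infos, tensors):
    """Map every declared model input name to a supplied tensor."""
    # Fail early, listing the names that have no tensor yet.
    missing = [info.name for info in input_infos if info.name not in tensors]
    if missing:
        raise KeyError(f"No tensor supplied for model inputs: {missing}")
    return {info.name: tensors[info.name] for info in input_infos}


# Stand-in for session.get_inputs(); the names below are hypothetical.
FakeInput = namedtuple("FakeInput", ["name", "shape"])
declared = [FakeInput(f"input_{i}", None) for i in range(4)]

# Supplying a single tensor mirrors the situation in the question.
try:
    build_feed(declared, {"input_0": "image tensor"})
except KeyError as exc:
    print(exc)

# Placeholders stand in for the NumPy arrays a real model would receive.
tensors = {info.name: f"tensor for {info.name}" for info in declared}
feed = build_feed(declared, tensors)
print(sorted(feed))
```

With a real session this would become feed = build_feed(session.get_inputs(), tensors) followed by session.run(None, feed), where passing None as the first argument returns all outputs.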

huangapple
  • Posted on February 6, 2023, 18:58:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75360420.html