Get bounding boxes, confidence scores, and class labels from a YOLOv8 ONNX model using the OpenCV DNN module
Question
I am working on an Android app that already uses OpenCV. I have a YOLOv8 model that was converted to ONNX format. Here is its output metadata:
- name - output0
- type - float32[1,5,8400]
So far I can run the model successfully, but I can't make sense of the output I get.
This is what printing the output gives:
Mat [ 1* 5* 8400*CV_32FC1, isCont=true, isSubmat=true, nativeObj=0x72345b4840, dataAddr=0x723076b000 ]

    import android.content.Context
    import android.graphics.Bitmap
    import org.opencv.android.Utils
    import org.opencv.core.Mat
    import org.opencv.core.Scalar
    import org.opencv.core.Size
    import org.opencv.dnn.Dnn
    import org.opencv.dnn.Net
    import org.opencv.imgproc.Imgproc
    import java.io.File
    import java.io.FileOutputStream

    class Detector(private val context: Context) {

        private var net: Net? = null

        fun detect(frame: Bitmap) {
            // Assumption: the original snippet does not show how resizedBitmap is created;
            // scaling the frame to the tensor size is one plausible way.
            val resizedBitmap = Bitmap.createScaledBitmap(frame, TENSOR_WIDTH, TENSOR_HEIGHT, true)

            // preprocess image
            val mat = Mat()
            Utils.bitmapToMat(resizedBitmap, mat)
            Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)
            val inputBlob = Dnn.blobFromImage(mat, 1.0/255.0, Size(640.0, 640.0), Scalar(0.0), true, false)
            net?.setInput(inputBlob)
            val outputBlob = net?.forward() ?: return
            println(outputBlob)
        }

        fun setupDetector() {
            val modelFile = File(context.cacheDir, MODEL_NAME)
            if (!modelFile.exists()) {
                // Copy the model out of the assets once; readNetFromONNX needs a file path.
                try {
                    context.assets.open(MODEL_NAME).use { input ->
                        FileOutputStream(modelFile).use { output -> input.copyTo(output) }
                    }
                } catch (e: Exception) {
                    throw RuntimeException(e)
                }
            }
            net = Dnn.readNetFromONNX(modelFile.absolutePath)
        }

        companion object {
            private const val MODEL_NAME = "model.onnx"
            private const val TENSOR_WIDTH = 640
            private const val TENSOR_HEIGHT = 640
        }
    }

What is the general approach to getting the bounding boxes, confidence scores, and class labels? If you have a solution for an ONNX model with OpenCV, please share it as well. This question isn't Android-specific.
Answer 1
Score: 0
Following the suggestions I got in the comments, I dug into YOLOv8, and this is the solution I came up with.

    // croppedBitmap, TENSOR_WIDTH_DOUBLE, TENSOR_HEIGHT_DOUBLE and CONFIDENCE_THRESHOLD
    // are defined elsewhere in the class (not shown).
    val mat = Mat()
    Utils.bitmapToMat(croppedBitmap, mat)
    Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)
    val inputBlob = Dnn.blobFromImage(
        mat,
        1.0/255.0,
        Size(TENSOR_WIDTH_DOUBLE, TENSOR_HEIGHT_DOUBLE),
        Scalar(0.0),
        false, // swapRB: the Mat is already RGB
        false  // crop
    )
    net?.setInput(inputBlob)
    val outputBlob = net?.forward() ?: return

    // Drop the batch axis: [1, 5, 8400] becomes 5 rows x 8400 columns.
    val strip = outputBlob.reshape(1, outputBlob.size(1))
    // Transpose to 8400 rows x 5 columns, so each row is one candidate box:
    // [centerX, centerY, width, height, confidence].
    val transposedMat = Mat()
    Core.transpose(strip, transposedMat)

    val boundingBoxes = mutableListOf<BoundingBox>()
    for (i in 0 until transposedMat.rows()) {
        val confidence = transposedMat.get(i, 4)[0]
        if (confidence > CONFIDENCE_THRESHOLD) {
            boundingBoxes.add(
                BoundingBox(
                    transposedMat.get(i, 0)[0],
                    transposedMat.get(i, 1)[0],
                    transposedMat.get(i, 2)[0],
                    transposedMat.get(i, 3)[0],
                    confidence
                )
            )
        }
    }

    data class BoundingBox(
        val centerX: Double,
        val centerY: Double,
        val width: Double,
        val height: Double,
        val confidence: Double
    )

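The code above stops at center-format boxes for a single-class model, so overlapping candidates for the same object are all kept and no class label is produced. As a rough sketch of the remaining steps (picking a class label per box, converting to corner coordinates, and non-maximum suppression), the following pure-Kotlin code works on a flat FloatArray in [4 + numClasses, numBoxes] order. The `Detection` type and the `decode`, `iou`, and `nms` helpers are hypothetical names introduced here; they are not part of OpenCV or of the answer. The layout is an assumption based on the default YOLOv8 detection export: four box values followed by one score per class, with no separate objectness value.

```kotlin
import kotlin.math.max
import kotlin.math.min

// Hypothetical helper type: a box in corner format with its best class.
data class Detection(
    val left: Float, val top: Float, val right: Float, val bottom: Float,
    val score: Float, val classId: Int
)

// Decodes a flat output laid out as [4 + numClasses, numBoxes] (channel-major),
// i.e. the [1, 5, 8400] blob with the batch axis dropped.
fun decode(output: FloatArray, numBoxes: Int, threshold: Float): List<Detection> {
    require(numBoxes > 0 && output.size % numBoxes == 0) { "unexpected output size" }
    val numClasses = output.size / numBoxes - 4
    require(numClasses >= 1) { "expected at least one class row" }
    val detections = mutableListOf<Detection>()
    for (i in 0 until numBoxes) {
        // Pick the best-scoring class for this candidate.
        var bestClass = 0
        var bestScore = output[4 * numBoxes + i]
        for (c in 1 until numClasses) {
            val s = output[(4 + c) * numBoxes + i]
            if (s > bestScore) { bestScore = s; bestClass = c }
        }
        if (bestScore < threshold) continue
        // Rows 0..3 hold centerX, centerY, width, height.
        val cx = output[i]
        val cy = output[numBoxes + i]
        val w = output[2 * numBoxes + i]
        val h = output[3 * numBoxes + i]
        detections += Detection(cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, bestScore, bestClass)
    }
    return detections
}

// Intersection over union of two corner-format boxes.
fun iou(a: Detection, b: Detection): Float {
    val iw = max(0f, min(a.right, b.right) - max(a.left, b.left))
    val ih = max(0f, min(a.bottom, b.bottom) - max(a.top, b.top))
    val inter = iw * ih
    val union = (a.right - a.left) * (a.bottom - a.top) +
        (b.right - b.left) * (b.bottom - b.top) - inter
    return if (union <= 0f) 0f else inter / union
}

// Greedy per-class non-maximum suppression: keep the highest-scoring box and
// drop same-class boxes that overlap an already kept box too much.
fun nms(detections: List<Detection>, iouThreshold: Float): List<Detection> {
    val kept = mutableListOf<Detection>()
    for (d in detections.sortedByDescending { it.score }) {
        if (kept.none { it.classId == d.classId && iou(it, d) > iouThreshold }) kept += d
    }
    return kept
}
```

For the [1, 5, 8400] output in the question, numBoxes would be 8400 and every detection gets classId 0, since there is only one class row. The flat array can probably be filled from the `strip` Mat with `strip.get(0, 0, data)`, where `data` is a `FloatArray(5 * 8400)`. The coordinates should then be in the 640x640 input space, so they would still need scaling back to the original image size before drawing.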