获取YOLOv8 onnx模型的边界框、置信度分数和类别标签,使用OpenCV DNN模块。

huangapple go评论100阅读模式
英文:

Get bounding box, the confidence score, and class labels from YOLOv8 onnx model using OpenCV DNN module

问题

以下是您提供的内容的翻译:

我正在开发一个Android应用程序,其中我已经在使用OpenCV,我从YOLOv8获得了一个ONNX格式的模型。以下是它的输出元数据。

  • 名称 - output0
  • 类型 - float32[1,5,8400]

到目前为止,我已成功运行该模型,但最后得到的输出我无法理解。

这是输出的打印语句:

Mat [ 1* 5* 8400*CV_32FC1, isCont=true, isSubmat=true, nativeObj=0x72345b4840, dataAddr=0x723076b000 ]

class Detector(private val context: Context) {
    private var net: Net? = null

    fun detect(frame: Bitmap) {
        // 预处理图像
        val mat = Mat()
        Utils.bitmapToMat(resizedBitmap, mat)
        Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)
        val inputBlob = Dnn.blobFromImage(mat, 1.0/255.0, Size(640.0, 640.0), Scalar(0.0), true, false)
        net?.setInput(inputBlob)
        val outputBlob = net?.forward() ?: return
        println(outputBlob)
    }

    fun setupDetector() {
        val modelFile = File(context.cacheDir, MODEL_NAME)
        if (!modelFile.exists()) {
            try {
                val inputStream = context.assets.open(MODEL_NAME)
                val size = inputStream.available()
                val buffer = ByteArray(size)
                inputStream.read(buffer)
                inputStream.close()
                val outputStream = FileOutputStream(modelFile)
                outputStream.write(buffer)
                outputStream.close()
                net = Dnn.readNetFromONNX(modelFile.absolutePath)
            } catch (e: Exception) {
                throw RuntimeException(e)
            }
        } else {
            net = Dnn.readNetFromONNX(modelFile.absolutePath)
        }
    }

    companion object {
        private const val MODEL_NAME = "model.onnx"
        private const val TENSOR_WIDTH = 640
        private const val TENSOR_HEIGHT = 640
    }
}

关于如何获取边界框、置信度分数和类别标签的一般方法是:

  1. 解析输出Blob:首先,您需要解析输出Blob,以获取模型的输出。根据您提供的元数据,输出Blob的形状为[1,5,8400],其中1是批次大小,5是每个边界框的属性数,8400是边界框的总数。

  2. 后处理:一旦您有了输出Blob,您需要对其进行后处理,以提取边界框、置信度分数和类别标签。通常,这涉及到应用阈值来筛选具有足够高置信度的边界框,然后将其与相应的类别标签关联。

  3. 边界框坐标:从每个边界框中提取坐标,通常是左上角和右下角的坐标。这些坐标可以用于在图像上绘制边界框或进行进一步的分析。

  4. 置信度分数:每个边界框都有一个与之关联的置信度分数,表示模型对该边界框包含对象的置信程度。

  5. 类别标签:根据模型的类别标签列表,将每个边界框与相应的类别标签关联起来。

请注意,具体的实现细节可能因模型和应用而异。如果您需要更详细的帮助或有关ONNX模型在OpenCV中的使用的解决方案,建议查看OpenCV和ONNX的官方文档以获取更多信息。此问题不限于Android,适用于任何使用OpenCV和ONNX的环境。

英文:

I am working on an Android app where I am already using OpenCV, I got a model which is in onnx format from YOLOv8 after conversion. Here is the output metadata of it.

  • name - output0
  • type - float32[1,5,8400]

So far I am successfully running the model but in the end, the output that I got I can't comprehend.

This is the print statement from the output

Mat [ 1* 5* 8400*CV_32FC1, isCont=true, isSubmat=true, nativeObj=0x72345b4840, dataAddr=0x723076b000 ]

class Detector(private val context: Context) {
    private var net: Net? = null

    fun detect(frame: Bitmap) {
        // preprocess image
        val mat = Mat()
        Utils.bitmapToMat(resizedBitmap, mat)
        Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)
        val inputBlob = Dnn.blobFromImage(mat, 1.0/255.0, Size(640.0, 640.0), Scalar(0.0), true, false)
        net?.setInput(inputBlob)
        val outputBlob = net?.forward() ?: return
        println(outputBlob)
    }

    fun setupDetector() {
        val modelFile = File(context.cacheDir, MODEL_NAME)
        if (!modelFile.exists()) {
            try {
                val inputStream = context.assets.open(MODEL_NAME)
                val size = inputStream.available()
                val buffer = ByteArray(size)
                inputStream.read(buffer)
                inputStream.close()
                val outputStream = FileOutputStream(modelFile)
                outputStream.write(buffer)
                outputStream.close()
                net = Dnn.readNetFromONNX(modelFile.absolutePath)
            } catch (e: Exception) {
                throw RuntimeException(e)
            }
        } else {
            net = Dnn.readNetFromONNX(modelFile.absolutePath)
        }
    }

    companion object {
        private const val MODEL_NAME = "model.onnx"
        private const val TENSOR_WIDTH = 640
        private const val TENSOR_HEIGHT = 640
    }
}

What could be the general approach to get bounding box, the confidence score and class labels? And if you have any solution for onnx model with OpenCV then you can provide as well. Also this question isn't android specific.

答案1

得分: 0

以下是您要翻译的代码部分:

val mat = Mat()
Utils.bitmapToMat(croppedBitmap, mat)
Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)

val inputBlob = Dnn.blobFromImage(
    mat,
    1.0/255.0,
    Size(TENSOR_WIDTH_DOUBLE, TENSOR_HEIGHT_DOUBLE),
    Scalar(0.0),
    false,
    false
)

net?.setInput(inputBlob)

val outputBlob = net?.forward() ?: return

val strip = outputBlob.reshape(1, outputBlob.size(1))

val transposedMat = Mat()
Core.transpose(strip, transposedMat)

val boundingBoxes = mutableListOf<BoundingBox>()
for (i in 0 until transposedMat.rows()) {
    if (transposedMat.get(i, 4)[0] > CONFIDENCE_THRESHOLD) {
        boundingBoxes.add(
            BoundingBox(
                transposedMat.get(i, 0)[0],
                transposedMat.get(i, 1)[0],
                transposedMat.get(i, 2)[0],
                transposedMat.get(i, 3)[0],
                transposedMat.get(i, 4)[0]
            )
        )
    }
}
data class BoundingBox(
    val centerX: Double,
    val centerY: Double,
    val width: Double,
    val height: Double,
    val confidence: Double
)

如果您需要进一步的帮助,请告诉我。

英文:

With the suggestion that I got from comments, I dig into the YOLOv8 and this is the solution that I came up with.

val mat = Mat()
Utils.bitmapToMat(croppedBitmap, mat)
Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)

val inputBlob = Dnn.blobFromImage(
    mat,
    1.0/255.0,
    Size(TENSOR_WIDTH_DOUBLE, TENSOR_HEIGHT_DOUBLE),
    Scalar(0.0),
    false,
    false
)

net?.setInput(inputBlob)

val outputBlob = net?.forward() ?: return

val strip = outputBlob.reshape(1, outputBlob.size(1))

val transposedMat = Mat()
Core.transpose(strip, transposedMat)

val boundingBoxes = mutableListOf&lt;BoundingBox&gt;()
for (i in 0 until transposedMat.rows()) {
    if (transposedMat.get(i, 4)[0] &gt; CONFIDENCE_THRESHOLD) {
        boundingBoxes.add(
            BoundingBox(
                transposedMat.get(i, 0)[0],
                transposedMat.get(i, 1)[0],
                transposedMat.get(i, 2)[0],
                transposedMat.get(i, 3)[0],
                transposedMat.get(i, 4)[0]
            )
        )
    }
}
data class BoundingBox(
    val centerX: Double,
    val centerY: Double,
    val width: Double,
    val height: Double,
    val confidence: Double
)

huangapple
  • 本文由 发表于 2023年6月29日 00:34:51
  • 转载请务必保留本文链接:https://go.coder-hub.com/76575124.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定