June 29, 2023 00:34:51

Get the bounding box, confidence score, and class labels from a YOLOv8 ONNX model using the OpenCV DNN module

Question

I am working on an Android app where I am already using OpenCV. I have a model in ONNX format, converted from YOLOv8. Here is its output metadata.

name - output0
type - float32[1,5,8400]

So far I can run the model successfully, but I can't make sense of the output I get.

This is what printing the output gives:

Mat [ 1* 5* 8400*CV_32FC1, isCont=true, isSubmat=true, nativeObj=0x72345b4840, dataAddr=0x723076b000 ]

import android.content.Context
import android.graphics.Bitmap
import java.io.File
import java.io.FileOutputStream
import org.opencv.android.Utils
import org.opencv.core.Mat
import org.opencv.core.Scalar
import org.opencv.core.Size
import org.opencv.dnn.Dnn
import org.opencv.dnn.Net
import org.opencv.imgproc.Imgproc

class Detector(private val context: Context) {
    private var net: Net? = null

    fun detect(frame: Bitmap) {
        // Preprocess: Bitmap -> RGBA Mat -> RGB Mat -> normalized blob at the model's input size
        val mat = Mat()
        Utils.bitmapToMat(frame, mat)
        Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)
        val inputSize = Size(TENSOR_WIDTH.toDouble(), TENSOR_HEIGHT.toDouble())
        val inputBlob = Dnn.blobFromImage(mat, 1.0/255.0, inputSize, Scalar(0.0), true, false)
        net?.setInput(inputBlob)
        val outputBlob = net?.forward() ?: return
        println(outputBlob)
    }

    fun setupDetector() {
        // OpenCV loads the model from a file path, so copy it out of assets once
        val modelFile = File(context.cacheDir, MODEL_NAME)
        if (!modelFile.exists()) {
            try {
                context.assets.open(MODEL_NAME).use { input ->
                    FileOutputStream(modelFile).use { output -> input.copyTo(output) }
                }
            } catch (e: Exception) {
                throw RuntimeException(e)
            }
        }
        net = Dnn.readNetFromONNX(modelFile.absolutePath)
    }

    companion object {
        private const val MODEL_NAME = "model.onnx"
        private const val TENSOR_WIDTH = 640
        private const val TENSOR_HEIGHT = 640
    }
}

What is the general approach to getting the bounding boxes, confidence scores, and class labels? If you have a solution for an ONNX model with OpenCV, please share it as well. This question isn't Android-specific.

Answer 1

Score: 0

Following the suggestion I got in the comments, I dug into YOLOv8, and this is the solution I came up with.

// croppedBitmap, TENSOR_WIDTH_DOUBLE, TENSOR_HEIGHT_DOUBLE, CONFIDENCE_THRESHOLD,
// and net are not defined in this snippet.
val mat = Mat()
Utils.bitmapToMat(croppedBitmap, mat)
Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGBA2RGB)

val inputBlob = Dnn.blobFromImage(
    mat,
    1.0/255.0,
    Size(TENSOR_WIDTH_DOUBLE, TENSOR_HEIGHT_DOUBLE),
    Scalar(0.0),
    false, // swapRB: the Mat is already RGB after cvtColor
    false  // crop
)

net?.setInput(inputBlob)

val outputBlob = net?.forward() ?: return

// Drop the batch dimension: [1, 5, 8400] -> 5 x 8400, one row per attribute
val strip = outputBlob.reshape(1, outputBlob.size(1))

// Transpose to 8400 x 5 so that each row is one candidate box
val transposedMat = Mat()
Core.transpose(strip, transposedMat)

// Columns 0..3 are centerX, centerY, width, height; column 4 is the confidence
val boundingBoxes = mutableListOf<BoundingBox>()
for (i in 0 until transposedMat.rows()) {
    if (transposedMat.get(i, 4)[0] > CONFIDENCE_THRESHOLD) {
        boundingBoxes.add(
            BoundingBox(
                transposedMat.get(i, 0)[0],
                transposedMat.get(i, 1)[0],
                transposedMat.get(i, 2)[0],
                transposedMat.get(i, 3)[0],
                transposedMat.get(i, 4)[0]
            )
        )
    }
}

data class BoundingBox(
    val centerX: Double,
    val centerY: Double,
    val width: Double,
    val height: Double,
    val confidence: Double
)
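The boxes collected above are still in center format, and neighbouring candidates usually overlap heavily, so a typical next step is to convert them to corner coordinates and run non-maximum suppression (NMS). Below is a minimal pure-Kotlin sketch of that idea, not a definitive implementation. It assumes the [1, 5, 8400] output has been copied into a flat FloatArray in attribute-major order (all centerX values first, then centerY, width, height, confidence), which matches the reshape to 5 rows used above. Detection, decode, iou, and nms are hypothetical names introduced only for this illustration; they are not part of OpenCV.

```kotlin
// Hypothetical helper type for illustration: a box in corner format.
data class Detection(
    val left: Float,
    val top: Float,
    val right: Float,
    val bottom: Float,
    val confidence: Float
)

/**
 * Decodes a flattened [1, 5, numBoxes] output into corner-format boxes.
 * Assumption: attribute-major layout, i.e. value a of box i is at a * numBoxes + i.
 */
fun decode(output: FloatArray, numBoxes: Int, threshold: Float): List<Detection> {
    require(output.size == 5 * numBoxes) { "expected 5 * numBoxes values" }
    val detections = mutableListOf<Detection>()
    for (i in 0 until numBoxes) {
        val confidence = output[4 * numBoxes + i]
        if (confidence <= threshold) continue
        val cx = output[i]
        val cy = output[numBoxes + i]
        val halfW = output[2 * numBoxes + i] / 2f
        val halfH = output[3 * numBoxes + i] / 2f
        detections += Detection(cx - halfW, cy - halfH, cx + halfW, cy + halfH, confidence)
    }
    return detections
}

/** Intersection over union of two corner-format boxes. */
fun iou(a: Detection, b: Detection): Float {
    val w = maxOf(0f, minOf(a.right, b.right) - maxOf(a.left, b.left))
    val h = maxOf(0f, minOf(a.bottom, b.bottom) - maxOf(a.top, b.top))
    val inter = w * h
    val union = (a.right - a.left) * (a.bottom - a.top) +
        (b.right - b.left) * (b.bottom - b.top) - inter
    return if (union > 0f) inter / union else 0f
}

/** Greedy NMS: visit boxes by descending confidence, keep one only if it does not overlap a kept box too much. */
fun nms(detections: List<Detection>, iouThreshold: Float): List<Detection> {
    val kept = mutableListOf<Detection>()
    for (candidate in detections.sortedByDescending { it.confidence }) {
        if (kept.none { iou(it, candidate) > iouThreshold }) kept += candidate
    }
    return kept
}
```

One way to obtain the flat array would be strip.get(0, 0, buffer) on the reshaped 5 x 8400 Mat with a FloatArray of 5 * 8400 elements. The resulting coordinates are presumably in the 640 x 640 network input space, so they would still need scaling back to the original image size. With more than one class, the second dimension would be 4 plus the number of classes, and the label would be the index of the highest class score.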
