

Tflite custom object detection model for flutter application using google_mlkit_object_detection package










I'm trying to create a custom object detection model in TFLite format so that I can use it in a Flutter application with the google_mlkit_object_detection package.

I created the first few models with YOLOv8 and converted them to TFLite. The annotations were made with Roboflow, and I used the Google Colab notebook they provide. The metadata of the converted TFLite model looks like the following image:


On this model I was getting the error

Input tensor has type kTfLiteFloat32: it requires specifying NormalizationOptions metadata to preprocess input images.

So, as suggested, I tried to change the metadata and add NormalizationOptions, but failed to do so. My second alternative was to train a model with the official TensorFlow Google Colab notebook for TensorFlow Lite Model Maker, which generated a model with the following metadata:


For this model the error was

Unexpected number of dimensions for output index 1: got 3D, expected either 2D (BxN with B=1) or 4D

So I checked the model from the example app of the package I am using, google_mlkit_object_detection, and its metadata looks like this:


So my question is: how can I alter the models I have already trained (whichever is easier) to look like this, for both input and output? Do I have to alter my model's architecture, or just the metadata? For the second one, trained with the official TensorFlow notebook, it seems that all I have to do is produce the correct output shape, [1, N], but again I might have to change the architecture.


Score: 1



The Custom Models page on the Google documentation says this:

> Note: ML Kit only supports custom image classification models.
> Although AutoML Vision allows training of object detection models,
> these cannot be used with ML Kit.

So there it is: the custom model you use with ML Kit object detection has to be an image classification model. I thought this must be impossible, since an object detection model outputs bounding boxes while a classification model outputs class scores. However, I tried with YOLOv8: the standard object detection model wouldn't work, but the classification model with the [1, 1000] output shape actually works with the Google ML Kit example application, and you can still extract the bounding boxes from it.

I'm not 100% sure how this can work, but what I suspect is that there is a default object detector bundled with the package, which identifies where there could be objects, and then you can only modify the classification model on top of it.

Anyway, the simple answer is: use a classification model with a [1, N] or [1, 1, 1, N] output, where N is the number of classes. If you have a model with a different architecture, you should change its output to this format; otherwise it is not supposed to work.
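
To make the shape rule concrete, here is a minimal sketch of a check you could run on a model's output shape before trying it with ML Kit. The helper name `isClassifierOutputShape` is made up for illustration and is not part of any package; it only encodes the [1, N] / [1, 1, 1, N] rule stated above, not ML Kit's actual validation logic.

```dart
/// Hypothetical helper: returns true when [shape] is [1, N] or [1, 1, 1, N]
/// with N >= 1, i.e. the classifier-style output described above.
bool isClassifierOutputShape(List<int> shape) {
  if (shape.isEmpty || shape.last < 1) return false;
  if (shape.length != 2 && shape.length != 4) return false;
  // Every dimension except the last one must be 1.
  return shape.sublist(0, shape.length - 1).every((d) => d == 1);
}

void main() {
  print(isClassifierOutputShape([1, 1000])); // classification head
  print(isClassifierOutputShape([1, 1, 1, 1000])); // 4D classification head
  print(isClassifierOutputShape([1, 13, 13125])); // YOLOv8 detection head
}
```

The 3D detection output from the question fails this check, which is consistent with the "got 3D, expected either 2D (BxN with B=1) or 4D" error.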


Score: 0





Metadata only describes the model. Besides adding metadata, you need to make sure your model really meets the requirements.

More details about how to get such a model can be found here:

The ML Kit Object Detection custom model is a classifier model.


Score: 0




For on-device object detection, I would suggest using tflite_flutter.

Whatever you use, you will need to normalize the image inputs. You can use something like this:

    import 'dart:typed_data';

    import 'package:image/image.dart' as img;

    // Converts an image into a normalized float32 input buffer laid out as
    // [1, inputSize, inputSize, 3]. The image is expected to be already
    // resized to inputSize x inputSize.
    Future<Uint8List> imageToByteListFloat32(
        img.Image image, int inputSize, double mean, double std) async {
      final convertedBytes = Float32List(1 * inputSize * inputSize * 3);
      final buffer = Float32List.view(convertedBytes.buffer);
      int pixelIndex = 0;
      for (var i = 0; i < inputSize; i++) {
        for (var j = 0; j < inputSize; j++) {
          // getPixel takes (x, y), so j is the column and i is the row.
          final pixel = image.getPixel(j, i);
          buffer[pixelIndex++] = (pixel.r - mean) / std;
          buffer[pixelIndex++] = (pixel.g - mean) / std;
          buffer[pixelIndex++] = (pixel.b - mean) / std;
        }
      }
      return convertedBytes.buffer.asUint8List();
    }
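
The `mean` and `std` arguments decide which value range the model sees. As a small worked example, the sketch below applies the same `(value - mean) / std` formula to single channel values. The `normalize` helper is only for illustration, and which range your model expects depends on how it was trained, so treat the two parameter pairs as common conventions rather than the right answer for your model.

```dart
/// Illustration only: the per-channel formula used in the loop above.
double normalize(num value, double mean, double std) => (value - mean) / std;

void main() {
  // mean = 0, std = 255 maps the 8-bit range [0, 255] onto [0, 1].
  print(normalize(0, 0.0, 255.0));
  print(normalize(255, 0.0, 255.0));

  // mean = 127.5, std = 127.5 maps [0, 255] onto [-1, 1].
  print(normalize(0, 127.5, 127.5));
  print(normalize(255, 127.5, 127.5));
}
```

If the predictions look random, a mismatch between these values and the training-time preprocessing is one of the first things worth checking.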

Then you will have to interpret the outputs. Use a tool such as Netron to identify the output shape. In your first YOLOv8 model, the output shape is [1, 13, 13125]: 1 is the batch size; 13 is the number of values per box, where the first 4 are the bounding box coordinates (x, y, width, height) and the remaining 9 are the class scores (I assume you have 9 classes in total); and 13125 is the number of candidate bounding boxes. So you will have to loop over every box and filter out what you need.
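
The loop over the boxes could look roughly like the sketch below. It assumes the batch dimension has already been removed, so the tensor is indexed as `output[attribute][box]`, and that rows 0 to 3 hold x, y, width, height followed by one row per class. The `Detection` class, the `decodeOutput` function and the 0.5 threshold are illustrative names and values, not part of tflite_flutter or any other package.

```dart
/// Illustrative container for one decoded detection.
class Detection {
  final double x, y, width, height;
  final int classId;
  final double score;
  const Detection(
      this.x, this.y, this.width, this.height, this.classId, this.score);
}

/// Decodes an [attributes][boxes] tensor, e.g. 13 x 13125 for the model above.
/// Keeps each box whose best class score reaches [scoreThreshold].
List<Detection> decodeOutput(List<List<double>> output,
    {double scoreThreshold = 0.5}) {
  final numBoxes = output[0].length;
  final numClasses = output.length - 4;
  final detections = <Detection>[];
  for (var b = 0; b < numBoxes; b++) {
    // Find the best-scoring class for this candidate box.
    var bestClass = 0;
    var bestScore = output[4][b];
    for (var c = 1; c < numClasses; c++) {
      final score = output[4 + c][b];
      if (score > bestScore) {
        bestScore = score;
        bestClass = c;
      }
    }
    if (bestScore >= scoreThreshold) {
      detections.add(Detection(output[0][b], output[1][b], output[2][b],
          output[3][b], bestClass, bestScore));
    }
  }
  return detections;
}

void main() {
  // Toy tensor: 4 box rows + 2 class rows, 3 candidate boxes.
  final output = [
    [10.0, 50.0, 90.0], // x
    [20.0, 60.0, 95.0], // y
    [5.0, 8.0, 4.0], // width
    [6.0, 9.0, 3.0], // height
    [0.9, 0.1, 0.2], // class 0 scores
    [0.05, 0.7, 0.3], // class 1 scores
  ];
  for (final d in decodeOutput(output)) {
    print('class ${d.classId}, score ${d.score}, box at (${d.x}, ${d.y})');
  }
}
```

Thresholding alone usually leaves several overlapping boxes for the same object, so a non-maximum suppression pass over the result is typically still needed.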

  • Posted on April 17, 2023, 10:20:50
