2023年5月26日 13:19:24go评论57阅读模式

英文:

How to train an ML model to generate predictions for multiple regions in one image

问题

我正在参与一个家庭项目，帮助一位朋友管理一个有4个装卸区的工厂。他有一台能够看到所有4个装卸区的摄像头。我们想要创建一个模型，用于接收图像并告诉我们装卸区1是否占用，装卸区2是否空闲，装卸区3是否被堵塞等等。

我之前稍微涉猎过机器学习，并成功创建了一个模型，可以告诉我我的车库门是否打开。这里的难点在于，我希望训练模型以在每个图像中识别4种不同的状态，而且当模型运行时，我希望它能够识别图像中的每个装卸区并告诉我每个装卸区的状态。

我希望人工智能能够帮我完成所有这些，但我也意识到可能需要更多的工作！

另一个选择是，我可以将图像分割成四个部分，使每个图像都被切成4份，然后只需将预测模型输入一张图像，然后询问状态。这是否是更明智的方法（尽管自动执行可能会比较麻烦）？

英文:

I am working on a home project helping out a friend of mine who runs a factory with 4 loading docks. He has a camera that can see all 4 loading docks. We want to create a model to take the image and say bay 1 occupied, bay 2 free, bay 3 blocked ect...

I have played with ML a little bit before and have managed to create a model that can tell me that my garage door is open. The difficulty here is that I want to train the model with 4 different statuses in each image and when the model is up and running I want it to identify each loading bay in the image and give me the status of each.

I am hoping that the AI can do this all for me but I realise I might need to do more work for it!

Another option was that I could split the images up so that each image was sliced into 4, I could then just feed the prediction model an image and ask for 1 status back. Is this a more sensible approach (despite it being a pain for me to do this automatically)?

答案1

得分: 0

根据评论中讨论的第二个选项，您可以根据四个舱室的固定边界框来拆分图像，并在该数据上训练您的模型。您可以参考这篇文章，其中讨论了从图片中提取对象，并正如@guillaume所提到的，如果您的舱室相似，您将需要更少的数据。然后，您可以使用从所有舱室提取的图像进行预测。

英文:

As discussed in the comments, going by 2nd option: you can split the images based on the fixed bounding boxes of the four bay and train your model on that data. You can refer to this article which discusses extracting objects from a picture, and as mentioned by @guillaume if your bays are similar you will require much less data. You can then use extracted images from all bays and perform prediction.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何训练一个机器学习模型来为一幅图像中的多个区域生成预测。

问题

答案1

当我尝试运行下面提供的代码时，我遇到了KeyError。为什么会这样？

如何处理yolov8中`model.predict`的结果？

将OpenCV（C++）中的单个像素从RGB转换为LAB。

如何将数据框架转换为机器学习格式

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论