Track Mouse (animal) in video using YOLO v8 trained on fiftyone.zoo dataset

The problem:

I am trying to train a YOLO v8 model using a custom dataset to detect (and track) a mouse in a video, but with poor results. Can you help me improve the performance of my model?

PS: Training the model takes quite some time, so I'm asking for tips to improve performance so that I don't waste too much time changing or optimising parameters that have little or no effect on the overall performance of the model.

Essential details:

I'm a researcher, and I'm completely new to computer vision. I am running an experiment where I need to track a mouse's movements inside a cage from a camera (fixed angle). I am trying to train a YOLO v8 model using the fiftyone.zoo dataset "open-images-v7"; however, this is just my approach as a novice in the field, so I'm happy to follow better suggestions:

import fiftyone as fo
from ultralytics import YOLO
from pathlib import Path
from tqdm import tqdm
import shutil

# Load the FiftyOne dataset
dataset = fo.zoo.load_zoo_dataset(
    "open-images-v7",
    split="train",
    label_types=["detections"],
    classes=["Mouse"],
    max_samples=100,
)

# Convert FiftyOne dataset to YOLO format
output_dir = Path("yolo_dataset")
output_dir.mkdir(exist_ok=True)

for sample in tqdm(dataset):
    img_path = sample.filepath
    img_filename = Path(img_path).name
    yolo_labels_path = output_dir / (Path(img_filename).stem + ".txt")

    with open(yolo_labels_path, "w") as f:
        for detection in sample.ground_truth.detections:
            if detection.label == "Mouse":
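                # FiftyOne/Open Images boxes are [top-left-x, top-left-y, width, height],
                # normalized to [0, 1]; YOLO labels use "class x_center y_center width height"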
                bbox = detection.bounding_box
                x, y, width, height = bbox[0], bbox[1], bbox[2], bbox[3]
                x_center = x + width / 2
                y_center = y + height / 2
                yolo_label = f"0 {x_center} {y_center} {width} {height}\n"
                f.write(yolo_label)

    # Copy image file to the YOLO dataset folder
    shutil.copy(img_path, output_dir / img_filename)

# Load a model
model = YOLO('yolov8n.pt')

# Train the model with the YOLO dataset
model.train(data='config.yaml', epochs=100, device='mps')

# Track with the model
results = model.track(source="catmouse.mov", show=True)

My config.yaml file is:

path: /home/path/to/code/folder 

train: yolo_dataset # train images (relative to 'path')
val: yolo_dataset # val images (relative to 'path')

# Classes
names:
    0: Mouse

As for the video: catmouse.mov in this example is just an extract of this video from YouTube: https://youtu.be/6pbreU5ChmA. Feel free to use any other video with a mouse/mice.

Answer 1 (score: 1)

Obtain more data; 100 examples are most likely not enough for the model to generalize the relevant features.

It would be useful if you could take some frames from your experiment, label them, and add them to the dataset. Examples from Open Images can be very different from your real data. If you cannot do this, just take more examples from the dataset.
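
A minimal sketch of the "more examples" option, assuming the same FiftyOne-to-YOLO conversion from the question is then run on each split and config.yaml is pointed at the resulting folders (the sample counts below are arbitrary placeholders):

import fiftyone as fo

# Pull more "Mouse" samples than the original 100
train_dataset = fo.zoo.load_zoo_dataset(
    "open-images-v7",
    split="train",
    label_types=["detections"],
    classes=["Mouse"],
    max_samples=1000,  # placeholder; take as many as storage/time allows
)

# Use a separate split for validation instead of validating on the training images
val_dataset = fo.zoo.load_zoo_dataset(
    "open-images-v7",
    split="validation",
    label_types=["detections"],
    classes=["Mouse"],
    max_samples=200,  # placeholder
)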

It can also be useful to enable YOLO data augmentation during training to make the model more robust to insignificant features such as viewing angle, size, color, etc.
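
For example, a sketch using the standard Ultralytics augmentation hyperparameters (documented at the augmentation link below); the values are illustrative, not tuned:

from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Augmentation settings are passed as extra keyword arguments to train()
model.train(
    data="config.yaml",
    epochs=100,
    device="mps",
    degrees=10.0,   # random rotation, +/- degrees
    scale=0.5,      # random scaling gain
    fliplr=0.5,     # probability of a horizontal flip
    hsv_h=0.015,    # hue jitter
    hsv_s=0.7,      # saturation jitter
    hsv_v=0.4,      # brightness jitter
)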

If you have enough resources, you can try more complex models than YOLOv8n, for example YOLOv8s or even YOLOv8m.
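
Switching the pretrained checkpoint is the only change needed (larger models train and run noticeably slower, especially on "mps"):

from ultralytics import YOLO

# Same pipeline, just a larger pretrained checkpoint
model = YOLO("yolov8s.pt")  # or "yolov8m.pt" if resources allow
model.train(data="config.yaml", epochs=100, device="mps")
results = model.track(source="catmouse.mov", show=True)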

Tips for Best training results: https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/?h=best

Data augmentation: https://docs.ultralytics.com/usage/cfg/#augmentation
