Track Mouse (animal) in video using YOLO v8 trained on fiftyone.zoo dataset


The problem:

I am trying to train a YOLO v8 model on a custom dataset to detect (and track) a mouse in a video, but with poor results. Can you help me improve the performance of my model?

PS: Training the model takes quite some time, so I'm asking for tips to improve performance and avoid wasting time changing or optimising parameters that have little or no effect on the overall performance of the model.

Essential details:

I'm a researcher, and I'm completely new to computer vision. I am running an experiment where I need to track a mouse's movements inside a cage from a camera (fixed angle). I am trying to train a YOLO v8 model using the fiftyone.zoo dataset "open-images-v7"; however, this is just my approach as a novice in the field, so I'm happy to follow better suggestions:

import fiftyone as fo
from ultralytics import YOLO
from pathlib import Path
from tqdm import tqdm
import shutil

# Load the FiftyOne dataset
dataset = fo.zoo.load_zoo_dataset(
    "open-images-v7",
    split="train",
    label_types=["detections"],
    classes=["Mouse"],
    max_samples=100,
)

# Convert FiftyOne dataset to YOLO format
output_dir = Path("yolo_dataset")
output_dir.mkdir(exist_ok=True)

for sample in tqdm(dataset):
    img_path = sample.filepath
    img_filename = Path(img_path).name
    yolo_labels_path = output_dir / (Path(img_filename).stem + ".txt")

    with open(yolo_labels_path, "w") as f:
        for detection in sample.ground_truth.detections:
            if detection.label == "Mouse":
                # FiftyOne stores bounding boxes as [top-left-x, top-left-y, width, height]
                # in relative (0-1) coordinates, so only a shift to the box centre is needed
                # to get YOLO's normalized [x_center, y_center, width, height] format
                bbox = detection.bounding_box
                x, y, width, height = bbox
                x_center = x + width / 2
                y_center = y + height / 2
                yolo_label = f"0 {x_center} {y_center} {width} {height}\n"
                f.write(yolo_label)

    # Copy image file to the YOLO dataset folder
    shutil.copy(img_path, output_dir / img_filename)

# Load a model
model = YOLO('yolov8n.pt')

# Train the model with the YOLO dataset
model.train(data='config.yaml', epochs=100, device='mps')

# Track with the model
results = model.track(source="catmouse.mov", show=True)

my config.yaml file is:

path: /home/path/to/code/folder 

train: yolo_dataset # train images (relative to 'path')
val: yolo_dataset # val images (relative to 'path')

# Classes
names:
    0: Mouse

As for the video catmouse.mov: in this example it is just an extract from this YouTube video: https://youtu.be/6pbreU5ChmA. Feel free to use any other video with a mouse/mice.

Answer 1

Score: 1

Obtain more data; most likely 100 examples aren't enough for the model to generalize the relevant features.
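
For example, a minimal sketch reusing the loading code from the question (the max_samples value is only illustrative; the number of "Mouse" samples actually available in Open Images is the real limit):

import fiftyone as fo

# Pull more "Mouse" samples than the original 100 (value is illustrative)
dataset = fo.zoo.load_zoo_dataset(
    "open-images-v7",
    split="train",
    label_types=["detections"],
    classes=["Mouse"],
    max_samples=1000,
)
print(len(dataset))  # check how many samples were actually downloaded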

It would be useful to take some frames from your experiment, label them, and add them to the dataset. Examples from Open Images can be very different from your real data. If you cannot do this, just take more examples from the dataset.
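
A possible starting point for extracting frames to label, as a minimal OpenCV sketch (the file names and the sampling interval are assumptions, not part of the original question):

import cv2
from pathlib import Path

# Hypothetical paths: your own experiment recording and an output folder for frames to label
video_path = "experiment_recording.mov"
frames_dir = Path("frames_to_label")
frames_dir.mkdir(exist_ok=True)

cap = cv2.VideoCapture(video_path)
frame_idx = 0
saved = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Save roughly one frame per second for a ~30 fps video (interval is illustrative)
    if frame_idx % 30 == 0:
        cv2.imwrite(str(frames_dir / f"frame_{frame_idx:06d}.jpg"), frame)
        saved += 1
    frame_idx += 1
cap.release()
print(f"Saved {saved} frames for labelling")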

It can also be useful to enable YOLO data augmentation during training to make the model more robust to insignificant variations such as viewing angle, size, colour, etc.
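
For instance, a sketch of strengthening some of Ultralytics' built-in augmentation hyperparameters directly in model.train(); the values below are illustrative starting points, not tuned settings:

from ultralytics import YOLO

model = YOLO('yolov8n.pt')

# Augmentation hyperparameters are passed as keyword arguments to train();
# the values here are only illustrative starting points
model.train(
    data='config.yaml',
    epochs=100,
    device='mps',
    hsv_h=0.015,    # hue jitter
    hsv_s=0.7,      # saturation jitter
    hsv_v=0.4,      # brightness jitter
    degrees=10.0,   # random rotation
    translate=0.1,  # random translation
    scale=0.5,      # random scaling
    fliplr=0.5,     # horizontal flip probability
    mosaic=1.0,     # mosaic augmentation
)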

If you have enough resources, you can try more complex models than v8n, for example v8s or even v8m.
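
That only requires swapping the pretrained checkpoint when loading the model, e.g. (a one-line change to the question's code):

from ultralytics import YOLO

# yolov8s.pt / yolov8m.pt are larger pretrained checkpoints than yolov8n.pt
model = YOLO('yolov8s.pt')  # or 'yolov8m.pt' if your hardware allows
model.train(data='config.yaml', epochs=100, device='mps')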

Tips for Best training results: https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/?h=best

Data augmentation: https://docs.ultralytics.com/usage/cfg/#augmentation
