Data augmentation in PyTorch for CNNs

Question

I want to augment my image set so that I have more data for training a convolutional neural network in PyTorch.

Example of transformations:

train_transforms = Compose([
    LoadImage(image_only=True),
    EnsureChannelFirst(),
    ScaleIntensity(),
    RandRotate(range_x=np.pi / 12, prob=0.5, keep_size=True),
    RandFlip(spatial_axis=0, prob=0.5),
])

As I understand it, transforms in PyTorch modify each image, and then only the transformed image is used, not the original. I want to transform my data and then train on both the original and the transformed images, since my goal is to enlarge the dataset. How can I actually increase the number of input samples by applying these transformations? For example, if I augment with a flip, I want to use both the original image and its flipped copy, so that the model is trained on more data.

I tried adding transformations to my data, but it seems that only the transformed data is used: the data changes, but its amount does not increase.
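
The behaviour described in the question can be reproduced with a minimal sketch. It assumes a hypothetical in-memory dataset (`FlipDataset`, not part of the original post) that flips each image with probability 0.5 when the image is read. The dataset length stays the same, and each read returns either the original or the flipped copy, so the mix of samples changes from epoch to epoch while the sample count does not.

```python
import torch
from torch.utils.data import Dataset


class FlipDataset(Dataset):
    """Hypothetical dataset that flips each image with probability `prob` on access."""

    def __init__(self, images, prob=0.5):
        self.images = images  # tensor of shape (N, C, H, W)
        self.prob = prob

    def __len__(self):
        # The transform does not add samples, so the length is just N.
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        # Random decision made on every access, as with RandFlip(prob=0.5).
        if torch.rand(1).item() < self.prob:
            image = torch.flip(image, dims=[-1])
        return image


# Placeholder data: 6 random single-channel 4x4 "images".
images = torch.rand(6, 1, 4, 4)
dataset = FlipDataset(images)

# Read the same index many times and count how often each variant comes back.
draws = [dataset[0] for _ in range(20)]
flipped = torch.flip(images[0], dims=[-1])
n_original = sum(torch.equal(d, images[0]) for d in draws)
n_flipped = sum(torch.equal(d, flipped) for d in draws)

print("dataset length:", len(dataset))
print("originals:", n_original, "flipped:", n_flipped)
```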

Answer 1

Score: 0

If you want to use the original data and the augmented data at the same time, you can concatenate the two datasets and create a dataloader over the result. The steps are:

  1. Create a dataset with data augmentations.
  2. Create a dataset without data augmentations.
  3. Create a dataset by concatenating both.
  4. Create a dataloader with the concatenated dataset.

I assume you already know how to create a dataset with data augmentation. To concatenate several datasets, you can use:

from torch.utils.data import ConcatDataset

# dataset1 and dataset2 are the datasets from steps 1 and 2
concat_dataset = ConcatDataset([dataset1, dataset2])

More information is available here:
https://discuss.pytorch.org/t/how-does-concatdataset-work/60083/2
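
The four steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: `ImageDataset` and `horizontal_flip` are hypothetical names, the images are random placeholder tensors, and a deterministic flip stands in for whatever augmentation pipeline you actually use.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class ImageDataset(Dataset):
    """Hypothetical dataset over in-memory images of shape (N, C, H, W)."""

    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[idx]


def horizontal_flip(image):
    # Stand-in augmentation: reverse the last (width) axis of a (C, H, W) tensor.
    return torch.flip(image, dims=[-1])


# Placeholder data: 8 random single-channel 4x4 "images" with binary labels.
images = torch.rand(8, 1, 4, 4)
labels = torch.randint(0, 2, (8,))

# Step 1: dataset with augmentation. Step 2: dataset without augmentation.
augmented_dataset = ImageDataset(images, labels, transform=horizontal_flip)
original_dataset = ImageDataset(images, labels)

# Step 3: concatenate. Indices 0..7 map to the originals, 8..15 to the flipped copies.
concat_dataset = ConcatDataset([original_dataset, augmented_dataset])

# Step 4: a dataloader over the combined dataset.
loader = DataLoader(concat_dataset, batch_size=4, shuffle=True)

batch_images, batch_labels = next(iter(loader))
print(len(original_dataset), len(concat_dataset), tuple(batch_images.shape))
```

With a random transform in place of the deterministic flip, the augmented half would differ each time it is read, while the original half would stay fixed.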

Answer 2

Score: 0

In your torch dataset class, you can check whether the index is greater than or equal to the real length of your data and, if so, return an augmented image.

from torch.utils.data import Dataset

class ExampleDataset(Dataset):
    def __init__(self):
        self.data = ...
        self.real_length = len(self.data)
        # Report twice the real length: the originals plus one augmented copy of each.
        self.length = self.real_length * 2

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        if idx < self.real_length:
            # First half of the index range: return the original sample.
            return self.data[idx]
        else:
            # Second half: map back to the original sample and augment it.
            # augment() is a function you supply yourself.
            return augment(self.data[idx - self.real_length])

You can extend your data by a larger factor (3x, 4x) depending on the augmentations you want to apply.
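
One way to extend the data by a larger factor is sketched below. It assumes a hypothetical `copies` parameter and an `add_noise` augmentation, neither of which appears in the original answer: indices in the first block return the originals, and every later block returns a freshly augmented version of the matching original.

```python
import torch
from torch.utils.data import Dataset


class RepeatedAugmentDataset(Dataset):
    """Serves the originals first, then (copies - 1) augmented passes over them."""

    def __init__(self, data, augment, copies=3):
        self.data = data
        self.augment = augment
        self.real_length = len(data)
        self.copies = copies

    def __len__(self):
        return self.real_length * self.copies

    def __getitem__(self, idx):
        if idx < 0 or idx >= len(self):
            raise IndexError(idx)
        # Map any index back to the original sample it is derived from.
        sample = self.data[idx % self.real_length]
        if idx < self.real_length:
            return sample  # first block: untouched originals
        return self.augment(sample)  # later blocks: augmented on access


def add_noise(image):
    # Stand-in augmentation: additive Gaussian noise with standard deviation 0.1.
    return image + 0.1 * torch.randn_like(image)


# Placeholder data: 5 random single-channel 4x4 "images".
data = torch.rand(5, 1, 4, 4)
dataset = RepeatedAugmentDataset(data, add_noise, copies=3)

print(len(data), len(dataset))
```

Because the augmentation is applied on access, the augmented blocks differ from epoch to epoch when the augmentation is random; only the reported length is fixed.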

Posted by huangapple on April 13, 2023 at 18:39:11
Original link: https://go.coder-hub.com/76004465.html