Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu – Unable to figure out which tensors are the problem


Question

It looks like you're encountering an "Expected all tensors to be on the same device" error, which means that some of your tensors are on the CPU while others are on the GPU. To fix this issue, you need to ensure that all tensors involved in an operation are on the same device, either CPU or GPU.

Here are some steps you can take to resolve this issue:

  1. Ensure that your model (detector) and all tensors related to it (including inputs and targets) are on the same device. You can move tensors to the GPU using the .to(device) method.

  2. Specifically, make sure that the gt_bboxes_proj, gt_classes, and anc_boxes_all tensors in the get_req_anchors function are on the same device as your model and inputs.

  3. If you're using any custom functions or modules within your model or training loop that work with tensors, ensure that they also operate on tensors on the same device.

  4. Double-check that you haven't inadvertently mixed CPU and GPU operations within your code.

  5. It might also be helpful to explicitly set the device for any tensor that you create to avoid device mismatches.

Here's an example of how you can move a tensor to a specific device:

# Move a tensor to the GPU
tensor_on_gpu = tensor_on_cpu.to(device)

# Move a tensor to the CPU
tensor_on_cpu = tensor_on_gpu.to('cpu')
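
You can also create tensors directly on the target device, or derive the device from a tensor you already have; this avoids accidental CPU allocations inside helper functions. A minimal sketch (the tensor names and shapes here are only illustrative):

import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the target device instead of moving it afterwards
scores = torch.zeros((4, 9), device=device)

# Derive the device from an existing tensor, e.g. inside a helper function
def make_placeholder(reference_tensor):
    # the new tensor lives on the same device as reference_tensor
    return torch.zeros((reference_tensor.size(0), 10), device=reference_tensor.device)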

By ensuring that all tensors are on the same device (either CPU or GPU), you should be able to resolve this issue.
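
In this particular code, step 2 amounts to making the anchor-related tensors created inside RegionProposalNetwork.forward (shown further below) follow the device of the input batch. A rough sketch of that idea, assuming gen_anc_base and project_bboxes currently return CPU tensors (untested against the full code base):

# Sketch: inside RegionProposalNetwork.forward
device = images.device  # follow the device of the input batch

anc_pts_x, anc_pts_y = gen_anc_centers(out_size=(self.out_h, self.out_w))
anc_base = gen_anc_base(anc_pts_x, anc_pts_y, self.anc_scales, self.anc_ratios,
                        (self.out_h, self.out_w)).to(device)
anc_boxes_all = anc_base.repeat(batch_size, 1, 1, 1, 1)

gt_bboxes_proj = project_bboxes(gt_bboxes, self.width_scale_factor,
                                self.height_scale_factor, mode='p2a').to(device)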

English:

I am trying to train an object detection model on a GPU. The code is written in PyTorch.
There are existing questions about the same error, but unfortunately none of their suggested fixes worked for me.

The GPU device declaration is as follows:

device = torch.device("cuda:0")

My training loop is as follows:

detector = TwoStageDetector(img_size, out_size, out_c, n_classes, roi_size)
detector=detector.to(device)
#detector.eval()
#total_loss = detector(img_batch, gt_bboxes_batch, gt_classes_batch)
#proposals_final, conf_scores_final, classes_final = detector.inference(img_batch)

print("STARTING TRAINING")
def training_loop(model, learning_rate, train_dataloader, n_epochs):
    #model=model.to(device)
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    model.train()
    loss_list = []

    for i in tqdm(range(n_epochs)):
        total_loss = 0
        for img_batch, gt_bboxes_batch, gt_classes_batch in train_dataloader:
            img_batch=img_batch.to(device)
            gt_bboxes_batch=gt_bboxes_batch.to(device)
            gt_classes_batch=gt_classes_batch.to(device)
            # forward pass
            loss = model(img_batch, gt_bboxes_batch, gt_classes_batch)
            # backpropagation
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            total_loss += loss.item()

        loss_list.append(total_loss)

    return loss_list

learning_rate = 1e-3
n_epochs = 1000

loss_list = training_loop(detector, learning_rate, od_dataloader, n_epochs)

The relevant model class from the model.py file is as follows:

class TwoStageDetector(nn.Module):
    def __init__(self, img_size, out_size, out_channels, n_classes, roi_size):
        super().__init__() 
        self.rpn = RegionProposalNetwork(img_size, out_size, out_channels)
        self.classifier = ClassificationModule(out_channels, n_classes, roi_size)
        
    def forward(self, images, gt_bboxes, gt_classes):
        total_rpn_loss, feature_map, proposals, \
        positive_anc_ind_sep, GT_class_pos = self.rpn(images, gt_bboxes, gt_classes)
        
        # get separate proposals for each sample
        pos_proposals_list = []
        batch_size = images.size(dim=0)
        for idx in range(batch_size):
            proposal_idxs = torch.where(positive_anc_ind_sep == idx)[0]
            proposals_sep = proposals[proposal_idxs].detach().clone()
            pos_proposals_list.append(proposals_sep)
        
        cls_loss = self.classifier(feature_map, pos_proposals_list, GT_class_pos)
        total_loss = cls_loss + total_rpn_loss
        
        return total_loss
    
    def inference(self, images, conf_thresh=0.5, nms_thresh=0.7):
        batch_size = images.size(dim=0)
        proposals_final, conf_scores_final, feature_map = self.rpn.inference(images, conf_thresh, nms_thresh)
        cls_scores = self.classifier(feature_map, proposals_final)
        
        # convert scores into probability
        cls_probs = F.softmax(cls_scores, dim=-1)
        # get classes with highest probability
        classes_all = torch.argmax(cls_probs, dim=-1)
        
        classes_final = []
        # slice classes to map to their corresponding image
        c = 0
        for i in range(batch_size):
            n_proposals = len(proposals_final[i]) # get the number of proposals for each image
            classes_final.append(classes_all[c: c+n_proposals])
            c += n_proposals
            
        return proposals_final, conf_scores_final, classes_final

class RegionProposalNetwork(nn.Module):
    def __init__(self, img_size, out_size, out_channels):
        super().__init__()

        self.img_height, self.img_width = img_size
        self.out_h, self.out_w = out_size

        # downsampling scale factor
        self.width_scale_factor = self.img_width // self.out_w
        self.height_scale_factor = self.img_height // self.out_h

        # scales and ratios for anchor boxes
        self.anc_scales = [2, 4, 6]
        self.anc_ratios = [0.5, 1, 1.5]
        self.n_anc_boxes = len(self.anc_scales) * len(self.anc_ratios)

        # IoU thresholds for +ve and -ve anchors
        self.pos_thresh = 0.7
        self.neg_thresh = 0.3

        # weights for loss
        self.w_conf = 1
        self.w_reg = 5

        self.feature_extractor = FeatureExtractor()
        self.proposal_module = ProposalModule(out_channels, n_anchors=self.n_anc_boxes)

    def forward(self, images, gt_bboxes, gt_classes):
        batch_size = images.size(dim=0)
        feature_map = self.feature_extractor(images)

        # generate anchors
        anc_pts_x, anc_pts_y = gen_anc_centers(out_size=(self.out_h, self.out_w))
        anc_base = gen_anc_base(anc_pts_x, anc_pts_y, self.anc_scales, self.anc_ratios, (self.out_h, self.out_w))
        anc_boxes_all = anc_base.repeat(batch_size, 1, 1, 1, 1)

        # get positive and negative anchors amongst other things
        gt_bboxes_proj = project_bboxes(gt_bboxes, self.width_scale_factor, self.height_scale_factor, mode='p2a')

        positive_anc_ind, negative_anc_ind, GT_conf_scores, \
        GT_offsets, GT_class_pos, positive_anc_coords, \
        negative_anc_coords, positive_anc_ind_sep = get_req_anchors(anc_boxes_all, gt_bboxes_proj, gt_classes)

        # pass through the proposal module
        conf_scores_pos, conf_scores_neg, offsets_pos, proposals = self.proposal_module(feature_map, positive_anc_ind, \
                                                                                        negative_anc_ind,
                                                                                        positive_anc_coords)

        cls_loss = calc_cls_loss(conf_scores_pos, conf_scores_neg, batch_size)
        reg_loss = calc_bbox_reg_loss(GT_offsets, offsets_pos, batch_size)

        total_rpn_loss = self.w_conf * cls_loss + self.w_reg * reg_loss

        return total_rpn_loss, feature_map, proposals, positive_anc_ind_sep, GT_class_pos

The get_iou_mat() and get_req_anchors() functions from the utils.py file are as follows:

def get_iou_mat(batch_size, anc_boxes_all, gt_bboxes_all):
    # flatten anchor boxes
    anc_boxes_flat = anc_boxes_all.reshape(batch_size, -1, 4)
    # get total anchor boxes for a single image
    tot_anc_boxes = anc_boxes_flat.size(dim=1)
    # create a placeholder to compute IoUs amongst the boxes
    ious_mat = torch.zeros((batch_size, tot_anc_boxes, gt_bboxes_all.size(dim=1)))

    # compute IoU of the anc boxes with the gt boxes for all the images
    for i in range(batch_size):
        gt_bboxes = gt_bboxes_all[i]
        #gt_bboxes = gt_bboxes[None, :]
        anc_boxes = anc_boxes_flat[i]
        ious_mat[i, :] = ops.box_iou(anc_boxes, gt_bboxes)
    return ious_mat


def get_req_anchors(anc_boxes_all, gt_bboxes_all, gt_classes_all, pos_thresh=0.7, neg_thresh=0.2):
    '''
    Prepare necessary data required for training

    Input
    ------
    anc_boxes_all - torch.Tensor of shape (B, w_amap, h_amap, n_anchor_boxes, 4)
        all anchor boxes for a batch of images
    gt_bboxes_all - torch.Tensor of shape (B, max_objects, 4)
        padded ground truth boxes for a batch of images
    gt_classes_all - torch.Tensor of shape (B, max_objects)
        padded ground truth classes for a batch of images

    Returns
    ---------
    positive_anc_ind -  torch.Tensor of shape (n_pos,)
        flattened positive indices for all the images in the batch
    negative_anc_ind - torch.Tensor of shape (n_pos,)
        flattened negative indices for all the images in the batch
    GT_conf_scores - torch.Tensor of shape (n_pos,), IoU scores of +ve anchors
    GT_offsets -  torch.Tensor of shape (n_pos, 4),
        offsets between +ve anchors and their corresponding ground truth boxes
    GT_class_pos - torch.Tensor of shape (n_pos,)
        mapped classes of +ve anchors
    positive_anc_coords - (n_pos, 4) coords of +ve anchors (for visualization)
    negative_anc_coords - (n_pos, 4) coords of -ve anchors (for visualization)
    positive_anc_ind_sep - list of indices to keep track of +ve anchors
    '''
    # get the size and shape parameters
    B, w_amap, h_amap, A, _ = anc_boxes_all.shape
    N = gt_bboxes_all.shape[1]  # max number of groundtruth bboxes in a batch

    # get total number of anchor boxes in a single image
    tot_anc_boxes = A * w_amap * h_amap

    # get the iou matrix which contains iou of every anchor box
    # against all the groundtruth bboxes in an image
    iou_mat = get_iou_mat(B, anc_boxes_all, gt_bboxes_all)
    #print(iou_mat.shape)
    # for every groundtruth bbox in an image, find the iou
    # with the anchor box which it overlaps the most
    max_iou_per_gt_box, _ = iou_mat.max(dim=1, keepdim=True)
    #print(max_iou_per_gt_box.shape)
    #print(max_iou_per_gt_box)
    # get positive anchor boxes

    # condition 1: the anchor box with the max iou for every gt bbox
    #print(max_iou_per_gt_box > 0)
    positive_anc_mask = torch.logical_and(iou_mat == max_iou_per_gt_box, max_iou_per_gt_box > 0)
    #print(positive_anc_mask.shape)
    # condition 2: anchor boxes with iou above a threshold with any of the gt bboxes
    positive_anc_mask = torch.logical_or(positive_anc_mask, iou_mat > pos_thresh)
    #print(positive_anc_mask.shape)

    positive_anc_ind_sep = torch.where(positive_anc_mask)[0]  # get separate indices in the batch
    # combine all the batches and get the idxs of the +ve anchor boxes
    positive_anc_mask = positive_anc_mask.flatten(start_dim=0, end_dim=1)
    positive_anc_ind = torch.where(positive_anc_mask)[0]

    # for every anchor box, get the iou and the idx of the
    # gt bbox it overlaps with the most
    max_iou_per_anc, max_iou_per_anc_ind = iou_mat.max(dim=-1)
    max_iou_per_anc = max_iou_per_anc.flatten(start_dim=0, end_dim=1)

    # get iou scores of the +ve anchor boxes
    GT_conf_scores = max_iou_per_anc[positive_anc_ind]

    # get gt classes of the +ve anchor boxes

    # expand gt classes to map against every anchor box
    #print(gt_classes_all.shape)
    gt_classes_expand = gt_classes_all.view(B, 1, N).expand(B, tot_anc_boxes, N)
    # for every anchor box, consider only the class of the gt bbox it overlaps with the most
    GT_class = torch.gather(gt_classes_expand, -1, max_iou_per_anc_ind.unsqueeze(-1)).squeeze(-1)
    # combine all the batches and get the mapped classes of the +ve anchor boxes
    GT_class = GT_class.flatten(start_dim=0, end_dim=1)
    GT_class_pos = GT_class[positive_anc_ind]

    # get gt bbox coordinates of the +ve anchor boxes

    # expand all the gt bboxes to map against every anchor box
    gt_bboxes_expand = gt_bboxes_all.view(B, 1, N, 4).expand(B, tot_anc_boxes, N, 4)
    # for every anchor box, consider only the coordinates of the gt bbox it overlaps with the most
    GT_bboxes = torch.gather(gt_bboxes_expand, -2,
                             max_iou_per_anc_ind.reshape(B, tot_anc_boxes, 1, 1).repeat(1, 1, 1, 4))
    # combine all the batches and get the mapped gt bbox coordinates of the +ve anchor boxes
    GT_bboxes = GT_bboxes.flatten(start_dim=0, end_dim=2)
    GT_bboxes_pos = GT_bboxes[positive_anc_ind]

    # get coordinates of +ve anc boxes
    anc_boxes_flat = anc_boxes_all.flatten(start_dim=0, end_dim=-2)  # flatten all the anchor boxes
    positive_anc_coords = anc_boxes_flat[positive_anc_ind]

    # calculate gt offsets
    GT_offsets = calc_gt_offsets(positive_anc_coords, GT_bboxes_pos)

    # get -ve anchors

    # condition: select the anchor boxes with max iou less than the threshold
    negative_anc_mask = (max_iou_per_anc < neg_thresh)
    negative_anc_ind = torch.where(negative_anc_mask)[0]
    # sample -ve samples to match the +ve samples
    negative_anc_ind = negative_anc_ind[torch.randint(0, negative_anc_ind.shape[0], (positive_anc_ind.shape[0],))]
    negative_anc_coords = anc_boxes_flat[negative_anc_ind]

    return positive_anc_ind, negative_anc_ind, GT_conf_scores, GT_offsets, GT_class_pos, \
           positive_anc_coords, negative_anc_coords, positive_anc_ind_sep

I have not explicitly placed any tensors from the utils.py file on the GPU.

The error stack trace is as follows:

File "/home/main.py", line 353, in <module>
    loss_list = training_loop(detector, learning_rate, od_dataloader, n_epochs)
  File "/home/main.py", line 336, in training_loop
    loss = model(img_batch, gt_bboxes_batch, gt_classes_batch)
  File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/model.py", line 215, in forward
    positive_anc_ind_sep, GT_class_pos = self.rpn(images, gt_bboxes, gt_classes)
  File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/model.py", line 104, in forward
    negative_anc_coords, positive_anc_ind_sep = get_req_anchors(anc_boxes_all, gt_bboxes_proj, gt_classes)
  File "/home/utils.py", line 222, in get_req_anchors
    iou_mat = get_iou_mat(B, anc_boxes_all, gt_bboxes_all)
  File "/home/utils.py", line 181, in get_iou_mat
    ious_mat[i, :] = ops.box_iou(anc_boxes, gt_bboxes)
  File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torchvision/ops/boxes.py", line 271, in box_iou
    inter, union = _box_inter_union(boxes1, boxes2)
  File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torchvision/ops/boxes.py", line 244, in _box_inter_union
    lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I have tried sending different tensors to the GPU in an attempt to get rid of this error, but without success. It would be great if someone could indicate what I am doing wrong. Thanks

UPDATE
I sent the anc_boxes_all variable in RegionProposalNetwork to the GPU, and this fixed the above error, but I am now getting the same error for something else:

Traceback (most recent call last):
  File "/home/main.py", line 352, in <module>
    loss_list = training_loop(detector, learning_rate, od_dataloader, n_epochs)
  File "/home/main.py", line 336, in training_loop
    loss = model(img_batch, gt_bboxes_batch, gt_classes_batch)
  File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/model.py", line 207, in forward
    positive_anc_ind_sep, GT_class_pos = self.rpn(images, gt_bboxes, gt_classes)
  File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/model.py", line 104, in forward
    negative_anc_coords, positive_anc_ind_sep = get_req_anchors(anc_boxes_all, gt_bboxes_proj, gt_classes)
  File "/home/utils.py", line 258, in get_req_anchors
    GT_class = torch.gather(gt_classes_expand, -1, max_iou_per_anc_ind.unsqueeze(-1)).squeeze(-1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA_gather)

Answer 1

Score: 1

 lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]

boxes1 is on GPU 0 (cuda:0) and boxes2 is on the CPU.

Can you show how you define your model, boxes1 and boxes2?

Can you also show where you define device?
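
Based on the get_iou_mat code posted in the question, one likely source of the mismatch is that ious_mat is allocated with a plain torch.zeros(...) call and therefore lives on the CPU, while the anchor and ground-truth boxes have been moved to the GPU. A device-aware sketch of that helper (shown only to illustrate the idea, not tested against the full code base):

import torch
from torchvision import ops

def get_iou_mat(batch_size, anc_boxes_all, gt_bboxes_all):
    # flatten anchor boxes
    anc_boxes_flat = anc_boxes_all.reshape(batch_size, -1, 4)
    # total anchor boxes for a single image
    tot_anc_boxes = anc_boxes_flat.size(dim=1)

    # allocate the IoU matrix on the same device as the anchor boxes so that
    # everything derived from it (e.g. max_iou_per_anc_ind) stays on that device
    ious_mat = torch.zeros((batch_size, tot_anc_boxes, gt_bboxes_all.size(dim=1)),
                           device=anc_boxes_all.device)

    for i in range(batch_size):
        # make sure the ground-truth boxes sit on the same device as the anchors
        gt_bboxes = gt_bboxes_all[i].to(anc_boxes_all.device)
        anc_boxes = anc_boxes_flat[i]
        ious_mat[i, :] = ops.box_iou(anc_boxes, gt_bboxes)
    return ious_mat

Since max_iou_per_anc_ind in get_req_anchors is derived from this IoU matrix, keeping the matrix on the GPU would also be expected to resolve the follow-up torch.gather error mentioned in the update.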
