Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu – Unable to figure out which tensors are the problem



It looks like you're encountering an "Expected all tensors to be on the same device" error, which means that some of your tensors are on the CPU while others are on the GPU. To fix this, you need to ensure that all tensors involved in an operation are on the same device, either the CPU or the GPU.

Here are some steps you can take to resolve this issue:

  1. Ensure that your model (detector) and all tensors related to it (including inputs and targets) are on the same device. You can move tensors to the GPU using the .to(device) method.

  2. Specifically, make sure that the gt_bboxes_proj, gt_classes, and anc_boxes_all tensors in the get_req_anchors function are on the same device as your model and inputs.

  3. If you're using any custom functions or modules within your model or training loop that work with tensors, ensure that they also operate on tensors on the same device.

  4. Double-check that you haven't inadvertently mixed CPU and GPU operations within your code.

  5. It might also be helpful to explicitly set the device for any tensor that you create to avoid device mismatches.

Here's an example of how you can move a tensor to a specific device:

    # Move a tensor to the GPU
    tensor_on_gpu = tensor_on_cpu.to(device)

    # Move a tensor to the CPU
    tensor_on_cpu = tensor_on_gpu.to('cpu')

By ensuring that all tensors are on the same device (either CPU or GPU), you should be able to resolve this issue.
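
For instance, in the code posted further down, two places where this pattern would likely apply are the anc_boxes_all tensor built in RegionProposalNetwork.forward and the ious_mat placeholder created with torch.zeros in get_iou_mat. Below is a minimal, self-contained sketch of the idea; the tensor names mirror the question's code, but the shapes are arbitrary stand-ins rather than the ones the model actually uses:

    import torch

    # Assumption for this sketch: fall back to the CPU if no CUDA device is available.
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    images = torch.rand(2, 3, 64, 64, device=device)   # stand-in for a batch already on the GPU
    anc_base = torch.rand(1, 4, 4, 9, 4)                # created without a device -> lands on the CPU

    # Derive the device from a tensor that is already where the model lives,
    # instead of hard-coding "cuda:0":
    anc_boxes_all = anc_base.repeat(images.size(0), 1, 1, 1, 1).to(images.device)

    # Same idea for freshly created placeholders (e.g. an IoU matrix):
    ious_mat = torch.zeros((images.size(0), 144, 5), device=anc_boxes_all.device)

    print(anc_boxes_all.device, ious_mat.device)        # both report the same device as images

Deriving the device from images.device (or from next(model.parameters()).device) keeps the code working on both CPU-only and GPU machines without hard-coding "cuda:0".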

Question:

I am trying to train an object detection model on a GPU. The code is written in Pytorch.
There are existing questions about the same error, but unfortunately none of their answers worked for me.

The GPU device declaration is as follows:

    device = torch.device("cuda:0")

My training loop is as follows:

    detector = TwoStageDetector(img_size, out_size, out_c, n_classes, roi_size)
    detector = detector.to(device)
    #detector.eval()
    #total_loss = detector(img_batch, gt_bboxes_batch, gt_classes_batch)
    #proposals_final, conf_scores_final, classes_final = detector.inference(img_batch)
    print("STARTING TRAINING")

    def training_loop(model, learning_rate, train_dataloader, n_epochs):
        #model = model.to(device)
        optimizer = optim.Adam(model.parameters(), lr=learning_rate)
        model.train()
        loss_list = []
        for i in tqdm(range(n_epochs)):
            total_loss = 0
            for img_batch, gt_bboxes_batch, gt_classes_batch in train_dataloader:
                img_batch = img_batch.to(device)
                gt_bboxes_batch = gt_bboxes_batch.to(device)
                gt_classes_batch = gt_classes_batch.to(device)

                # forward pass
                loss = model(img_batch, gt_bboxes_batch, gt_classes_batch)

                # backpropagation
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                total_loss += loss.item()

            loss_list.append(total_loss)
        return loss_list

    learning_rate = 1e-3
    n_epochs = 1000
    loss_list = training_loop(detector, learning_rate, od_dataloader, n_epochs)

The relevant model class from the model.py file is as follows:

    class TwoStageDetector(nn.Module):
        def __init__(self, img_size, out_size, out_channels, n_classes, roi_size):
            super().__init__()
            self.rpn = RegionProposalNetwork(img_size, out_size, out_channels)
            self.classifier = ClassificationModule(out_channels, n_classes, roi_size)

        def forward(self, images, gt_bboxes, gt_classes):
            total_rpn_loss, feature_map, proposals, \
            positive_anc_ind_sep, GT_class_pos = self.rpn(images, gt_bboxes, gt_classes)

            # get separate proposals for each sample
            pos_proposals_list = []
            batch_size = images.size(dim=0)
            for idx in range(batch_size):
                proposal_idxs = torch.where(positive_anc_ind_sep == idx)[0]
                proposals_sep = proposals[proposal_idxs].detach().clone()
                pos_proposals_list.append(proposals_sep)

            cls_loss = self.classifier(feature_map, pos_proposals_list, GT_class_pos)
            total_loss = cls_loss + total_rpn_loss

            return total_loss

        def inference(self, images, conf_thresh=0.5, nms_thresh=0.7):
            batch_size = images.size(dim=0)
            proposals_final, conf_scores_final, feature_map = self.rpn.inference(images, conf_thresh, nms_thresh)
            cls_scores = self.classifier(feature_map, proposals_final)

            # convert scores into probability
            cls_probs = F.softmax(cls_scores, dim=-1)
            # get classes with highest probability
            classes_all = torch.argmax(cls_probs, dim=-1)

            classes_final = []
            # slice classes to map to their corresponding image
            c = 0
            for i in range(batch_size):
                n_proposals = len(proposals_final[i])  # get the number of proposals for each image
                classes_final.append(classes_all[c: c + n_proposals])
                c += n_proposals

            return proposals_final, conf_scores_final, classes_final

    class RegionProposalNetwork(nn.Module):
        def __init__(self, img_size, out_size, out_channels):
            super().__init__()

            self.img_height, self.img_width = img_size
            self.out_h, self.out_w = out_size

            # downsampling scale factor
            self.width_scale_factor = self.img_width // self.out_w
            self.height_scale_factor = self.img_height // self.out_h

            # scales and ratios for anchor boxes
            self.anc_scales = [2, 4, 6]
            self.anc_ratios = [0.5, 1, 1.5]
            self.n_anc_boxes = len(self.anc_scales) * len(self.anc_ratios)

            # IoU thresholds for +ve and -ve anchors
            self.pos_thresh = 0.7
            self.neg_thresh = 0.3

            # weights for loss
            self.w_conf = 1
            self.w_reg = 5

            self.feature_extractor = FeatureExtractor()
            self.proposal_module = ProposalModule(out_channels, n_anchors=self.n_anc_boxes)

        def forward(self, images, gt_bboxes, gt_classes):
            batch_size = images.size(dim=0)
            feature_map = self.feature_extractor(images)

            # generate anchors
            anc_pts_x, anc_pts_y = gen_anc_centers(out_size=(self.out_h, self.out_w))
            anc_base = gen_anc_base(anc_pts_x, anc_pts_y, self.anc_scales, self.anc_ratios, (self.out_h, self.out_w))
            anc_boxes_all = anc_base.repeat(batch_size, 1, 1, 1, 1)

            # get positive and negative anchors amongst other things
            gt_bboxes_proj = project_bboxes(gt_bboxes, self.width_scale_factor, self.height_scale_factor, mode='p2a')

            positive_anc_ind, negative_anc_ind, GT_conf_scores, \
            GT_offsets, GT_class_pos, positive_anc_coords, \
            negative_anc_coords, positive_anc_ind_sep = get_req_anchors(anc_boxes_all, gt_bboxes_proj, gt_classes)

            # pass through the proposal module
            conf_scores_pos, conf_scores_neg, offsets_pos, proposals = self.proposal_module(feature_map, positive_anc_ind,
                                                                                            negative_anc_ind,
                                                                                            positive_anc_coords)

            cls_loss = calc_cls_loss(conf_scores_pos, conf_scores_neg, batch_size)
            reg_loss = calc_bbox_reg_loss(GT_offsets, offsets_pos, batch_size)

            total_rpn_loss = self.w_conf * cls_loss + self.w_reg * reg_loss

            return total_rpn_loss, feature_map, proposals, positive_anc_ind_sep, GT_class_pos

The get_iou_mat() function and get_req_anchors() function from the utils.py file are as follows:

    def get_iou_mat(batch_size, anc_boxes_all, gt_bboxes_all):
        # flatten anchor boxes
        anc_boxes_flat = anc_boxes_all.reshape(batch_size, -1, 4)
        # get total anchor boxes for a single image
        tot_anc_boxes = anc_boxes_flat.size(dim=1)

        # create a placeholder to compute IoUs amongst the boxes
        ious_mat = torch.zeros((batch_size, tot_anc_boxes, gt_bboxes_all.size(dim=1)))

        # compute IoU of the anc boxes with the gt boxes for all the images
        for i in range(batch_size):
            gt_bboxes = gt_bboxes_all[i]
            #gt_bboxes = gt_bboxes[None, :]
            anc_boxes = anc_boxes_flat[i]
            ious_mat[i, :] = ops.box_iou(anc_boxes, gt_bboxes)

        return ious_mat

    def get_req_anchors(anc_boxes_all, gt_bboxes_all, gt_classes_all, pos_thresh=0.7, neg_thresh=0.2):
        '''
        Prepare necessary data required for training

        Input
        ------
        anc_boxes_all - torch.Tensor of shape (B, w_amap, h_amap, n_anchor_boxes, 4)
            all anchor boxes for a batch of images
        gt_bboxes_all - torch.Tensor of shape (B, max_objects, 4)
            padded ground truth boxes for a batch of images
        gt_classes_all - torch.Tensor of shape (B, max_objects)
            padded ground truth classes for a batch of images

        Returns
        ---------
        positive_anc_ind - torch.Tensor of shape (n_pos,)
            flattened positive indices for all the images in the batch
        negative_anc_ind - torch.Tensor of shape (n_pos,)
            flattened negative indices for all the images in the batch
        GT_conf_scores - torch.Tensor of shape (n_pos,), IoU scores of +ve anchors
        GT_offsets - torch.Tensor of shape (n_pos, 4),
            offsets between +ve anchors and their corresponding ground truth boxes
        GT_class_pos - torch.Tensor of shape (n_pos,)
            mapped classes of +ve anchors
        positive_anc_coords - (n_pos, 4) coords of +ve anchors (for visualization)
        negative_anc_coords - (n_pos, 4) coords of -ve anchors (for visualization)
        positive_anc_ind_sep - list of indices to keep track of +ve anchors
        '''
        # get the size and shape parameters
        B, w_amap, h_amap, A, _ = anc_boxes_all.shape
        N = gt_bboxes_all.shape[1]  # max number of groundtruth bboxes in a batch

        # get total number of anchor boxes in a single image
        tot_anc_boxes = A * w_amap * h_amap

        # get the iou matrix which contains iou of every anchor box
        # against all the groundtruth bboxes in an image
        iou_mat = get_iou_mat(B, anc_boxes_all, gt_bboxes_all)
        #print(iou_mat.shape)

        # for every groundtruth bbox in an image, find the iou
        # with the anchor box which it overlaps the most
        max_iou_per_gt_box, _ = iou_mat.max(dim=1, keepdim=True)
        #print(max_iou_per_gt_box.shape)
        #print(max_iou_per_gt_box)

        # get positive anchor boxes

        # condition 1: the anchor box with the max iou for every gt bbox
        #print(max_iou_per_gt_box > 0)
        positive_anc_mask = torch.logical_and(iou_mat == max_iou_per_gt_box, max_iou_per_gt_box > 0)
        #print(positive_anc_mask.shape)
        # condition 2: anchor boxes with iou above a threshold with any of the gt bboxes
        positive_anc_mask = torch.logical_or(positive_anc_mask, iou_mat > pos_thresh)
        #print(positive_anc_mask.shape)

        positive_anc_ind_sep = torch.where(positive_anc_mask)[0]  # get separate indices in the batch
        # combine all the batches and get the idxs of the +ve anchor boxes
        positive_anc_mask = positive_anc_mask.flatten(start_dim=0, end_dim=1)
        positive_anc_ind = torch.where(positive_anc_mask)[0]

        # for every anchor box, get the iou and the idx of the
        # gt bbox it overlaps with the most
        max_iou_per_anc, max_iou_per_anc_ind = iou_mat.max(dim=-1)
        max_iou_per_anc = max_iou_per_anc.flatten(start_dim=0, end_dim=1)

        # get iou scores of the +ve anchor boxes
        GT_conf_scores = max_iou_per_anc[positive_anc_ind]

        # get gt classes of the +ve anchor boxes
        # expand gt classes to map against every anchor box
        #print(gt_classes_all.shape)
        gt_classes_expand = gt_classes_all.view(B, 1, N).expand(B, tot_anc_boxes, N)
        # for every anchor box, consider only the class of the gt bbox it overlaps with the most
        GT_class = torch.gather(gt_classes_expand, -1, max_iou_per_anc_ind.unsqueeze(-1)).squeeze(-1)
        # combine all the batches and get the mapped classes of the +ve anchor boxes
        GT_class = GT_class.flatten(start_dim=0, end_dim=1)
        GT_class_pos = GT_class[positive_anc_ind]

        # get gt bbox coordinates of the +ve anchor boxes
        # expand all the gt bboxes to map against every anchor box
        gt_bboxes_expand = gt_bboxes_all.view(B, 1, N, 4).expand(B, tot_anc_boxes, N, 4)
        # for every anchor box, consider only the coordinates of the gt bbox it overlaps with the most
        GT_bboxes = torch.gather(gt_bboxes_expand, -2,
                                 max_iou_per_anc_ind.reshape(B, tot_anc_boxes, 1, 1).repeat(1, 1, 1, 4))
        # combine all the batches and get the mapped gt bbox coordinates of the +ve anchor boxes
        GT_bboxes = GT_bboxes.flatten(start_dim=0, end_dim=2)
        GT_bboxes_pos = GT_bboxes[positive_anc_ind]

        # get coordinates of +ve anc boxes
        anc_boxes_flat = anc_boxes_all.flatten(start_dim=0, end_dim=-2)  # flatten all the anchor boxes
        positive_anc_coords = anc_boxes_flat[positive_anc_ind]

        # calculate gt offsets
        GT_offsets = calc_gt_offsets(positive_anc_coords, GT_bboxes_pos)

        # get -ve anchors
        # condition: select the anchor boxes with max iou less than the threshold
        negative_anc_mask = (max_iou_per_anc < neg_thresh)
        negative_anc_ind = torch.where(negative_anc_mask)[0]
        # sample -ve samples to match the +ve samples
        negative_anc_ind = negative_anc_ind[torch.randint(0, negative_anc_ind.shape[0], (positive_anc_ind.shape[0],))]
        negative_anc_coords = anc_boxes_flat[negative_anc_ind]

        return positive_anc_ind, negative_anc_ind, GT_conf_scores, GT_offsets, GT_class_pos, \
               positive_anc_coords, negative_anc_coords, positive_anc_ind_sep

I have not placed any tensors from the utils.py file in the GPU explicitly.

The error stack trace is as follows:

  1. File "/home/main.py", line 353, in <module>
  2. loss_list = training_loop(detector, learning_rate, od_dataloader, n_epochs)
  3. File "/home/main.py", line 336, in training_loop
  4. loss = model(img_batch, gt_bboxes_batch, gt_classes_batch)
  5. File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
  6. return forward_call(*args, **kwargs)
  7. File "/home/model.py", line 215, in forward
  8. positive_anc_ind_sep, GT_class_pos = self.rpn(images, gt_bboxes, gt_classes)
  9. File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
  10. return forward_call(*args, **kwargs)
  11. File "/home/model.py", line 104, in forward
  12. negative_anc_coords, positive_anc_ind_sep = get_req_anchors(anc_boxes_all, gt_bboxes_proj, gt_classes)
  13. File "/home/utils.py", line 222, in get_req_anchors
  14. iou_mat = get_iou_mat(B, anc_boxes_all, gt_bboxes_all)
  15. File "/home/utils.py", line 181, in get_iou_mat
  16. ious_mat[i, :] = ops.box_iou(anc_boxes, gt_bboxes)
  17. File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torchvision/ops/boxes.py", line 271, in box_iou
  18. inter, union = _box_inter_union(boxes1, boxes2)
  19. File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torchvision/ops/boxes.py", line 244, in _box_inter_union
  20. lt = torch.max(boxes1[:, None, :2], boxes2[:, :2]) # [N,M,2]
  21. RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I have tried sending different tensors to the GPU in an attempt to get rid of this error, but without success. It would be great if someone could indicate what I am doing wrong. Thanks
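
One way to narrow down which operand is off-device is to print the devices of the tensors involved right before the call that fails. The following is only a throwaway diagnostic sketch: the report_devices helper is hypothetical (not part of the model or utils code), and the dummy tensors merely mimic the mismatch. In the real code, the same call would go at the top of get_iou_mat, just before ops.box_iou, with anc_boxes, gt_bboxes, and ious_mat as arguments:

    import torch

    def report_devices(**tensors):
        # Hypothetical throwaway helper: print where each named tensor lives.
        for name, t in tensors.items():
            print(f"{name}: shape={tuple(t.shape)}, device={t.device}")

    # Dummy tensors standing in for the real arguments of get_iou_mat.
    anc_boxes = torch.rand(144, 4)                                            # defaults to the CPU
    gt_bboxes = torch.rand(3, 4).to("cuda:0" if torch.cuda.is_available() else "cpu")
    report_devices(anc_boxes=anc_boxes, gt_bboxes=gt_bboxes)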

UPDATE
I sent the anc_boxes_all variable in RegionProposalNetwork to the GPU, and this fixed the above error. But it is now giving me the same error at a different point:

    Traceback (most recent call last):
      File "/home/main.py", line 352, in <module>
        loss_list = training_loop(detector, learning_rate, od_dataloader, n_epochs)
      File "/home/main.py", line 336, in training_loop
        loss = model(img_batch, gt_bboxes_batch, gt_classes_batch)
      File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/home/model.py", line 207, in forward
        positive_anc_ind_sep, GT_class_pos = self.rpn(images, gt_bboxes, gt_classes)
      File "/home/miniconda3/envs/pytor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/home/model.py", line 104, in forward
        negative_anc_coords, positive_anc_ind_sep = get_req_anchors(anc_boxes_all, gt_bboxes_proj, gt_classes)
      File "/home/utils.py", line 258, in get_req_anchors
        GT_class = torch.gather(gt_classes_expand, -1, max_iou_per_anc_ind.unsqueeze(-1)).squeeze(-1)
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA_gather)

Answer 1

Score: 1


    lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]

boxes1 is on GPU zero and boxes2 is on CPU.

Can you show how you define your model, boxes1 and boxes2?

Can you also show where you define device?
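
For reference, here is a small, self-contained reproduction of what that line sees when its two operands live on different devices (the shapes are arbitrary, and it assumes a CUDA device is available):

    import torch

    if torch.cuda.is_available():
        boxes1 = torch.rand(3, 4, device="cuda:0")   # e.g. boxes already moved to the GPU
        boxes2 = torch.rand(2, 4)                    # created without a device -> stays on the CPU

        try:
            lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]
        except RuntimeError as e:
            print(e)  # Expected all tensors to be on the same device ...

        # Moving either operand onto the other's device makes the same call work:
        lt = torch.max(boxes1[:, None, :2], boxes2.to(boxes1.device)[:, :2])
        print(lt.shape)  # torch.Size([3, 2, 2])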
