How to train YOLO-NAS model in azure-ml using Blob storage account
Question
I want to train a YOLO-NAS model for object detection in azure-ml, where my train and test data are in a Blob storage account.
So in the config details we specify the data directories as Blob storage account paths for train, test and valid.
Now I want to know how to feed the data to the YOLO-NAS model, or whether YOLO-NAS will pull the data itself for data loading, preprocessing and training.
Config example:
DATA_DIR = 'azureml://subscriptions/xxxxxxxx/resourcegroups/Faucet-poc/workspaces/faucet-ob-poc/datastores/faucetpoc/paths/' #parent directory to where data lives
TRAIN_IMAGES_DIR = 'train/images' #child dir of DATA_DIR where train images are
TRAIN_LABELS_DIR = 'train/labels' #child dir of DATA_DIR where train labels are
VAL_IMAGES_DIR = 'valid/images' #child dir of DATA_DIR where validation images are
VAL_LABELS_DIR = 'valid/labels' #child dir of DATA_DIR where validation labels are
# if you have a test set
TEST_IMAGES_DIR = 'test/images/' #child dir of DATA_DIR where test images are
TEST_LABELS_DIR = 'test/labels' #child dir of DATA_DIR where test labels are
CLASSES = ['faucet','sink','toilet'] #what class names do you have
NUM_CLASSES = len(CLASSES)
dataset_params = {
    'data_dir': DATA_DIR,
    'train_images_dir': TRAIN_IMAGES_DIR,
    'train_labels_dir': TRAIN_LABELS_DIR,
    'val_images_dir': VAL_IMAGES_DIR,
    'val_labels_dir': VAL_LABELS_DIR,
    'test_images_dir': TEST_IMAGES_DIR,
    'test_labels_dir': TEST_LABELS_DIR,
    'classes': CLASSES
}
This is the code where we load the data into the YOLO-NAS model:
from super_gradients.training.dataloaders.dataloaders import (
    coco_detection_yolo_format_train, coco_detection_yolo_format_val)

train_data = coco_detection_yolo_format_train(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['train_images_dir'],
        'labels_dir': dataset_params['train_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)
val_data = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['val_images_dir'],
        'labels_dir': dataset_params['val_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)
test_data = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['test_images_dir'],
        'labels_dir': dataset_params['test_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)
When I run the code, I get a not-found error.
Error:
RuntimeError: data_dir=azureml://subscriptions/xxxxxxxx/resourcegroups/Faucet-poc/workspaces/faucet-ob-poc/datastores/faucetpoc/paths/ not found.
Please make sure that data_dir points toward your dataset.
I am adding the snippet for the mount point in azure-ml here.
Answer 1
Score: 1
I have reproduced the issue based on the scenario.
The training data is uploaded to the datastore.
To access the data in the notebook you can mount the datastore.
Follow the steps below to mount the datastore:
Once you create the mount with the above steps, you will have a mount path like:
/home/azureuser/cloudfiles/data/datastore/data/
Then you can use this as your parent directory DATA_DIR.
With the above setup I was able to access the training data.
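For example, a minimal sanity check before training, assuming the mount path above and the directory layout from the question (the exact path will differ if you named the mount differently), could look like this:

import os

# Assumed mount path from the steps above; adjust it to match your own mount name.
DATA_DIR = '/home/azureuser/cloudfiles/data/datastore/data/'

# Check that the sub-directories the YOLO-NAS dataloaders expect are visible through the mount.
for sub_dir in ['train/images', 'train/labels',
                'valid/images', 'valid/labels',
                'test/images', 'test/labels']:
    path = os.path.join(DATA_DIR, sub_dir)
    print(path, 'exists' if os.path.isdir(path) else 'MISSING')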
Update: To mount the datastore with Python, you can use the code below.
from azureml.core import Workspace, Dataset, Datastore

# Connect to the workspace and get the registered datastore
workspace = Workspace.from_config()
datastore = Datastore.get(workspace, "workspaceblobstore")

# Create a file dataset from the 'Data' folder in the datastore
dataset = Dataset.File.from_files(path=(datastore, 'Data'))

# Mount the dataset to a local path
mounted_path = "/tmp/test"
mount_cont = dataset.mount(mounted_path)
mount_cont.start()
Then you can use the mounted_path as your parent directory.
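As a rough sketch (not the exact code from the post), assuming the mount above succeeded and the dataset layout from the question, the mounted path can be passed straight into the dataloader as data_dir:

from super_gradients.training.dataloaders.dataloaders import coco_detection_yolo_format_train

# mounted_path comes from the mount code above; the class names are taken from the question.
train_data = coco_detection_yolo_format_train(
    dataset_params={
        'data_dir': mounted_path,      # local mount point instead of the azureml:// URI
        'images_dir': 'train/images',
        'labels_dir': 'train/labels',
        'classes': ['faucet', 'sink', 'toilet']
    },
    dataloader_params={
        'batch_size': 16,              # example value; use your own BATCH_SIZE
        'num_workers': 2
    }
)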
Note: Please check your permissions and assigned roles. A missing role assignment can be the reason you are not able to see data actions, and a similar issue can also be behind the error in the case above.
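As a hedged example of checking that last point, the snippet below (the storage account URL and container name are placeholders, not values from the post) tries to list the blob container behind the datastore; an authorization error here usually means a data-plane role such as Storage Blob Data Reader is missing:

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# Placeholder values - replace with the storage account and container backing the 'faucetpoc' datastore.
service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential=DefaultAzureCredential())
container = service.get_container_client("<container-name>")

# Raises an authorization error if the identity lacks a data-plane role on the account.
print(next(iter(container.list_blobs()), None))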