How to create a job ID for Vertex AI manually, or how to access the job ID in a custom container?
Question
Is it possible to manually specify the "Custom job ID" when starting a custom job on Vertex AI?
If not, is it possible to access the automatically created job ID from inside the custom container?
Answer 1
Score: 1
This is a late reply, but I'm posting my answer for future readers.
I was porting custom jobs from GCP AI Platform to Vertex AI. Previously, AI Platform let you specify the job ID when submitting a custom job, as in the example below. In Vertex AI, however, the custom job ID cannot be specified.
AI Platform vs. Vertex AI (bash)
# AI Platform: the job ID is passed as the positional argument
gcloud ai-platform jobs submit training ${JOB_ID} \
  --project=${PROJECT_ID} \
  --module-name trainer.task --package-path ./trainer \
  --region us-central1 --python-version 3.7 --runtime-version 2.11 \
  -- \
  --param1=${param1} \
  --param2=${param2} \
  --param3=${param3}
# Vertex AI: only a display name can be set; JOB_ID cannot be specified.
# (A comment placed after a trailing backslash would break the line continuation,
# so the note lives here instead of next to --display-name.)
gcloud ai custom-jobs create \
  --display-name=${DISPLAY_NAME} \
  --region us-central1 \
  --project=${PROJECT_ID} \
  --python-package-uris='gs://[MODULE_PATH]/[MODULE_NAME]-0.1.tar.gz' \
  --worker-pool-spec=machine-type=e2-standard-4,replica-count=1,executor-image-uri='us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-7:latest',python-module=trainer.task \
  --args=param1=${param1} \
  --args=param2=${param2} \
  --args=param3=${param3}
One possible way to retrieve the job_id is to list the jobs with a display_name filter.
Below is a Python 3 example using ListCustomJobsRequest() and JobServiceClient().
Retrieve the Vertex AI job ID by filtering on display_name (Python)
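import google.auth
from google.cloud import aiplatform_v1

# Placeholders: replace with your own values.
PROJECT_ID = "[PROJECT_ID]"
LOCATION_ID = "us-central1"
DISPLAY_NAME = "[DISPLAY_NAME]"

# Assumption: Application Default Credentials are configured.
# Any google.auth credentials object can be passed instead.
credentials, _ = google.auth.default()

client = aiplatform_v1.JobServiceClient(
    credentials=credentials,
    # The regional endpoint must match the location of the job.
    client_options={"api_endpoint": f"{LOCATION_ID}-aiplatform.googleapis.com"},
)
request = aiplatform_v1.ListCustomJobsRequest(
    parent=f"projects/{PROJECT_ID}/locations/{LOCATION_ID}",
    # The display name is a quoted string inside the filter expression.
    filter=f'display_name="{DISPLAY_NAME}"',
)
page_result = client.list_custom_jobs(request=request)
for response in page_result:
    print(response)
    print(response.name)  # → "projects/[PROJECT_ID]/locations/[LOCATION_ID]/customJobs/[JOB_ID]"
    print(response.name.split("/")[5])  # the [JOB_ID] segment of the resource name
    print(response.state)  # e.g. JOB_STATE_SUCCEEDED or JOB_STATE_PENDING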
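One caveat with this approach: a display name is a free-form label and is not required to be unique, so the filter can match more than one job. The following is a minimal sketch, not part of the original answer, of two small helpers that may make the lookup more reliable. The function names and the timestamp-plus-suffix naming scheme are hypothetical choices for illustration; only the resource name layout comes from the example above.

```python
import re
import uuid
from datetime import datetime, timezone

# Resource name layout as printed by response.name in the example above.
_CUSTOM_JOB_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/customJobs/(?P<job_id>[^/]+)$"
)


def make_unique_display_name(prefix: str) -> str:
    """Build a display name that is unlikely to collide (hypothetical scheme).

    Appends a UTC timestamp and a short random suffix to the given prefix.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return f"{prefix}-{stamp}-{uuid.uuid4().hex[:8]}"


def job_id_from_resource_name(name: str) -> str:
    """Extract the job ID from a custom job resource name.

    Raises ValueError instead of silently returning the wrong path segment
    when the name does not have the expected layout.
    """
    match = _CUSTOM_JOB_NAME.match(name)
    if match is None:
        raise ValueError(f"not a custom job resource name: {name!r}")
    return match.group("job_id")


if __name__ == "__main__":
    print(make_unique_display_name("trainer"))
    print(job_id_from_resource_name(
        "projects/my-project/locations/us-central1/customJobs/1234567890"
    ))
```

Passing the generated name as --display-name when creating the job, and then filtering on that exact name, should normally narrow the listing to a single job. The regex check is just a stricter alternative to split('/')[5].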
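References
gcloud ai custom-jobs create
gcloud ai-platform jobs submit training
list_custom_jobs()
ListCustomJobsRequest()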