"RuntimeError: CustomJob resource has not been created" when creating Vertex AI CustomJob

Question

I am trying to create a Vertex AI CustomJob, similar to the example at https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.CustomJob:

import time
from google.cloud import aiplatform

worker_pool_specs = [
    {
        "machine_spec": {
            "machine_type": "n1-standard-4",
        },
        "replica_count": 1,
        "container_spec": {
            "image_uri": "eu.gcr.io/somexistingimage",
            "command": ["python", "myscript.py", "test", "--var"],
            "args": [],
        },
    }
]

job = aiplatform.CustomJob(
    display_name="job_{}".format(round(time.time())),
    worker_pool_specs=worker_pool_specs,
    project="my-project",
    staging_bucket="gs://some-bucket",
)

Now, when I inspect the job, accessing practically any field (create_time, display_name, end_time, ...) raises the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "..../lib/python3.9/site-packages/google/cloud/aiplatform/base.py", line 686, in display_name
    self._assert_gca_resource_is_available()
  File "..../lib/python3.9/site-packages/google/cloud/aiplatform/base.py", line 1332, in _assert_gca_resource_is_available
    raise RuntimeError(
RuntimeError: CustomJob resource has not been created.

Environment:
python 3.9.16
google-cloud-aiplatform 1.28.1

I'm logged in and Application Default Credentials are set up correctly, since I can submit CustomContainerTrainingJobs, just not CustomJobs.

I cannot find anything on this error. How can I fix this?

Answer 1

Score: 0

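OK, the solution is quite simple:

Do not read attributes of the job object (e.g. job.display_name) before the job has been submitted or run. Constructing aiplatform.CustomJob(...) only builds the job definition locally; the resource itself is created when the job is submitted or run.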
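To illustrate why the attribute read fails, here is a small stand-in written for this answer. `FakeJob` and everything inside it are hypothetical and are not the SDK's actual code; the class only mimics the pattern visible in the traceback, where a property first checks that the backing resource exists and raises `RuntimeError` if it does not.

```python
class FakeJob:
    """Hypothetical stand-in for a job object; NOT the real aiplatform.CustomJob."""

    def __init__(self, display_name):
        # The constructor only stores the local definition; no resource exists yet.
        self._pending_display_name = display_name
        self._resource = None

    def _assert_resource_is_available(self):
        # Same idea as the guard seen in the traceback: no resource, no attributes.
        if self._resource is None:
            raise RuntimeError("FakeJob resource has not been created.")

    @property
    def display_name(self):
        self._assert_resource_is_available()
        return self._resource["display_name"]

    def submit(self):
        # Stands in for the API call that actually creates the resource.
        self._resource = {"display_name": self._pending_display_name}


job = FakeJob("job_1")

try:
    job.display_name  # read before submit: the guard raises
except RuntimeError as exc:
    error_before_submit = str(exc)

job.submit()
name_after_submit = job.display_name  # read after submit: works
```

The same ordering applies to the real object: create it, submit or run it, and only then read its properties.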

After job.submit(), or after job.run(sync=True) has returned, you can inspect the job.
With job.run(sync=False) you can hit the same error, because the call returns immediately and the resource may not have been created yet when you read the attribute.

huangapple
  • Published on July 31, 2023, 21:14:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76804004.html