RuntimeError: Cluster failed to start with dask LocalCUDACluster example setup

I am new to Dask and I run into problems when executing the example code:

from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster()  # one worker per visible GPU
client = Client(cluster)

I would get the following error:

AttributeError                            Traceback (most recent call last)
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/deploy/spec.py:319, in SpecCluster._start(self)
318     cls = import_term(cls)
--> 319 self.scheduler = cls(**self.scheduler_spec.get("options", {}))
320 self.scheduler = await self.scheduler
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/scheduler.py:3481, in Scheduler.__init__(self, loop, delete_interval, synchronize_worker_interval, services, service_kwargs, allowed_failures, extensions, validate, scheduler_file, security, worker_ttl, idle_timeout, interface, host, port, protocol, dashboard_address, dashboard, http_prefix, preload, preload_argv, plugins, contact_address, transition_counter_max, jupyter, **kwargs)
3480 if show_dashboard:
-> 3481     distributed.dashboard.scheduler.connect(
3482         self.http_application, self.http_server, self, prefix=http_prefix
3483     )
3484 self.jupyter = jupyter
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/dashboard/scheduler.py:158, in connect(application, http_server, scheduler, prefix)
156 def connect(application, http_server, scheduler, prefix=""):
157     bokeh_app = BokehApplication(
--> 158         applications, scheduler, prefix=prefix, template_variables=template_variables()
159     )
160     application.add_application(bokeh_app)
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/cytoolz/functoolz.pyx:475, in cytoolz.functoolz._memoize.__call__()
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/dashboard/scheduler.py:131, in template_variables()
123 from distributed.diagnostics.nvml import device_get_count
125 template_variables = {
126     "pages": [
127         "status",
128         "workers",
129         "tasks",
130         "system",
--> 131         *(["gpu"] if device_get_count() > 0 else []),
132         "profile",
133         "graph",
134         "groups",
135         "info",
136     ],
137     "plots": [
138         {
139             "url": x.strip("/"),
140             "name": " ".join(x.strip("/").split("-")[1:])
141             .title()
142             .replace("Cpu", "CPU")
143             .replace("Gpu", "GPU"),
144         }
145         for x in applications
146         if "individual" in x
147     ]
148     + [{"url": "hardware", "name": "Hardware"}],
149 }
150 template_variables["plots"] = sorted(
151     template_variables["plots"], key=lambda d: d["name"]
152 )
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/diagnostics/nvml.py:126, in device_get_count()
125 def device_get_count():
--> 126     init_once()
127     if not is_initialized():
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/diagnostics/nvml.py:108, in init_once()
105     return
107 if _in_wsl() and parse_version(
--> 108     pynvml.nvmlSystemGetDriverVersion().decode()
109 ) < parse_version(MINIMUM_WSL_VERSION):
110     NVML_STATE = NVMLState.DISABLED_WSL_INSUFFICIENT_DRIVER
AttributeError: 'str' object has no attribute 'decode'
The above exception was the direct cause of the following exception:
RuntimeError                              Traceback (most recent call last)
Cell In[22], line 3
1 from dask_cuda import LocalCUDACluster
----> 3 cluster = LocalCUDACluster()
4 client = Client(cluster)
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/dask_cuda/local_cuda_cluster.py:336, in LocalCUDACluster.__init__(self, CUDA_VISIBLE_DEVICES, n_workers, threads_per_worker, memory_limit, device_memory_limit, data, local_directory, shared_filesystem, protocol, enable_tcp_over_ucx, enable_infiniband, enable_nvlink, enable_rdmacm, rmm_pool_size, rmm_maximum_pool_size, rmm_managed_memory, rmm_async, rmm_log_directory, rmm_track_allocations, jit_unspill, log_spilling, worker_class, pre_import, **kwargs)
329     worker_class = partial(
330         LoggedNanny if log_spilling is True else Nanny,
331         worker_class=worker_class,
332     )
334 self.pre_import = pre_import
--> 336 super().__init__(
337     n_workers=0,
338     threads_per_worker=threads_per_worker,
339     memory_limit=self.memory_limit,
340     processes=True,
341     data=data,
342     local_directory=local_directory,
343     protocol=protocol,
344     worker_class=worker_class,
345     config={
346         "distributed.comm.ucx": get_ucx_config(
347             enable_tcp_over_ucx=enable_tcp_over_ucx,
348             enable_nvlink=enable_nvlink,
349             enable_infiniband=enable_infiniband,
350             enable_rdmacm=enable_rdmacm,
351         )
352     },
353     **kwargs,
354 )
356 self.new_spec["options"]["preload"] = self.new_spec["options"].get(
357     "preload", []
358 ) + ["dask_cuda.initialize"]
359 self.new_spec["options"]["preload_argv"] = self.new_spec["options"].get(
360     "preload_argv", []
361 ) + ["--create-cuda-context"]
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/deploy/local.py:253, in LocalCluster.__init__(self, name, n_workers, threads_per_worker, processes, loop, start, host, ip, scheduler_port, silence_logs, dashboard_address, worker_dashboard_address, diagnostics_port, services, worker_services, service_kwargs, asynchronous, security, protocol, blocked_handlers, interface, worker_class, scheduler_kwargs, scheduler_sync_interval, **worker_kwargs)
250 worker = {"cls": worker_class, "options": worker_kwargs}
251 workers = {i: worker for i in range(n_workers)}
--> 253 super().__init__(
254     name=name,
255     scheduler=scheduler,
256     workers=workers,
257     worker=worker,
258     loop=loop,
259     asynchronous=asynchronous,
260     silence_logs=silence_logs,
261     security=security,
262     scheduler_sync_interval=scheduler_sync_interval,
263 )
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/deploy/spec.py:286, in SpecCluster.__init__(self, workers, scheduler, worker, asynchronous, loop, security, silence_logs, name, shutdown_on_close, scheduler_sync_interval)
284 if not called_from_running_loop:
285     self._loop_runner.start()
--> 286     self.sync(self._start)
287     try:
288         self.sync(self._correct_state)
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/utils.py:338, in SyncMethodMixin.sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
336     return future
337 else:
--> 338     return sync(
339         self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
340     )
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/utils.py:405, in sync(loop, func, callback_timeout, *args, **kwargs)
403 if error:
404     typ, exc, tb = error
--> 405     raise exc.with_traceback(tb)
406 else:
407     return result
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/utils.py:378, in sync.<locals>.f()
376         future = asyncio.wait_for(future, callback_timeout)
377     future = asyncio.ensure_future(future)
--> 378     result = yield future
379 except Exception:
380     error = sys.exc_info()
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/tornado/gen.py:769, in Runner.run(self)
766 exc_info = None
768 try:
--> 769     value = future.result()
770 except Exception:
771     exc_info = sys.exc_info()
File ~/miniconda3/envs/rapids-23.04/lib/python3.10/site-packages/distributed/deploy/spec.py:330, in SpecCluster._start(self)
328 self.status = Status.failed
329 await self._close()
--> 330 raise RuntimeError(f"Cluster failed to start: {e}") from e
RuntimeError: Cluster failed to start: 'str' object has no attribute 'decode'

My Dask version is 2023.2.0.

I tried reinstalling RAPIDS, downgrading Python from 3.10 to 3.8, and passing different parameters to LocalCUDACluster(), but none of these worked.
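
For reference, the versions of the packages implicated in the traceback can be listed with importlib.metadata (in the standard library since Python 3.8); a minimal check, assuming the packages are installed in the active environment:

from importlib.metadata import version

# Print the installed versions of the packages involved in the failure.
for pkg in ("dask", "distributed", "dask-cuda", "pynvml"):
    print(pkg, version(pkg))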

Answer 1 (score: 1)

There was an unexpected breaking change in pynvml that impacted dask-cuda. Dask-cuda has issued a hotfix release (23.02.01) to address this in the stable release.
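
Concretely, the change was in the return type of the NVML driver-version query that distributed decodes (see the bottom of the traceback above). A minimal sketch of the incompatibility, assuming pynvml and an NVIDIA driver are available:

import pynvml

pynvml.nvmlInit()
driver = pynvml.nvmlSystemGetDriverVersion()

# Older pynvml releases returned bytes here, so callers decoded it;
# newer releases already return str, and an unconditional .decode()
# raises AttributeError: 'str' object has no attribute 'decode'.
if isinstance(driver, bytes):
    driver = driver.decode()
print(driver)

pynvml.nvmlShutdown()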

I see you're using the nightly packages; there, this should already have been resolved by this PR. I'm not able to reproduce your issue in the following environment: mamba create -n rapids-23.04 -c rapidsai-nightly -c nvidia -c conda-forge rapids=23.04 python=3.8 cudatoolkit=11.5 jupyterlab strings_udf.

If you still experience this problem in a fresh environment, please file a dask-cuda GitHub issue.
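
Once the environment is rebuilt, re-running the original reproducer (with an explicit shutdown added) is a quick way to confirm the cluster starts; a sketch, assuming at least one GPU is visible:

from dask.distributed import Client
from dask_cuda import LocalCUDACluster

# Start a single-node CUDA cluster (one worker per visible GPU) and connect.
cluster = LocalCUDACluster()
client = Client(cluster)
print(client)

# Clean up so the scheduler and workers shut down promptly.
client.close()
cluster.close()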
