Question

When setting up a Google Cloud VM instance, the maximum number of vCPUs is tied to the number of GPUs attached to the VM. These are described here. For example, a machine with 1 NVIDIA T4 GPU can have at most 48 vCPUs, whereas a machine with 4 T4 GPUs can have up to 96 vCPUs.

I'm wondering if there's any way to exceed these ratios. For example, I would like to run a machine with 64 vCPUs and 1 T4 GPU. Is it possible to do this by requesting a quota increase? Or is this a physical limitation that can't be modified?

For context, I am running predictions with a deep learning model that has many input layers. I pre-process the inputs (i.e. loading, combining, and normalizing) on the fly just before prediction. This means my job is much more CPU-constrained than GPU-constrained, and I would like to increase the number of vCPUs without paying for more GPUs, since I wouldn't be using the additional GPU capacity. Moreover, since I only need to predict once per data point, it doesn't seem worth the storage and compute time to pre-process all the data and save the pre-processed versions to disk just so I can load them at prediction time (note that this is what I did for training).

Answer 1

Score: 1

> I'm wondering if there's any way to exceed these ratios

No, there isn't, because these are fixed physical limitations. The ratios are determined by the hardware, and they are designed so that one component doesn't bottleneck another.

All in all, you cannot exceed the fixed ratios you mentioned.
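
To make the limits concrete, here is a minimal sketch that checks a requested shape against the two figures quoted in the question (1 T4 GPU allows up to 48 vCPUs, 4 T4 GPUs allow up to 96). The table and the function names `fits_limit` and `smallest_t4_count_for` are hypothetical and only for illustration. The table is not a complete list of Google Cloud's limits and does not query any API, so GPU counts that the question does not mention are left out rather than guessed.

```python
# Maximum vCPUs per T4 GPU count, using only the two figures quoted in the
# question. Other GPU counts are deliberately left out rather than guessed.
MAX_VCPUS_BY_T4_COUNT = {1: 48, 4: 96}


def fits_limit(t4_gpus: int, vcpus: int) -> bool:
    """Return True if the requested shape is within the quoted limit."""
    if t4_gpus not in MAX_VCPUS_BY_T4_COUNT:
        raise ValueError(f"no limit quoted for {t4_gpus} T4 GPU(s)")
    return 1 <= vcpus <= MAX_VCPUS_BY_T4_COUNT[t4_gpus]


def smallest_t4_count_for(vcpus: int) -> int:
    """Return the smallest GPU count in the table whose limit covers `vcpus`."""
    for gpus in sorted(MAX_VCPUS_BY_T4_COUNT):
        if vcpus <= MAX_VCPUS_BY_T4_COUNT[gpus]:
            return gpus
    raise ValueError(f"{vcpus} vCPUs exceeds every quoted limit")


if __name__ == "__main__":
    print(fits_limit(1, 64))          # False
    print(smallest_t4_count_for(64))  # 4
```

With these figures, the 64 vCPU and 1 T4 shape from the question is rejected, and among the GPU counts listed in the table only the 4 GPU configuration covers 64 vCPUs.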
