Increase CPU to GPU ratio quota on Google Cloud VM
Question
When setting up a Google Cloud VM instance, the number of vCPUs the VM can have is capped according to the number of GPUs attached to it. These ratios are described here. For example, a machine with 1 NVIDIA T4 GPU can have at most 48 vCPUs, whereas a machine with 4 T4 GPUs can have up to 96 vCPUs.
I'm wondering if there's any way to exceed these ratios. For example, I would like to run a machine with 64 vCPUs and 1 T4 GPU. Is it possible to do this by requesting a quota increase? Or is this a physical limitation that can't be modified?
For context, I am running predictions with a deep learning model that has many input layers. I pre-process the inputs (i.e. loading, combining, and normalizing) on the fly just before prediction. This means my job is much more CPU-constrained than GPU-constrained, and I would like to increase the number of vCPUs without paying for more GPUs, since I wouldn't use the additional GPU capacity. Moreover, since I only need to predict once per data point, it doesn't seem worth the storage and compute time to pre-process all the data up front and save the pre-processed versions to disk just to load them at prediction time (note that this is what I did for training).
Answer 1
Score: 1
> I'm wondering if there's any way to exceed these ratios
No, there isn't, because these are fixed physical limitations. The ratios are determined by the hardware, and they are designed so that one component doesn't bottleneck another.
All in all, you cannot exceed the fixed ratios you mentioned.
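To make the limits concrete, here is a minimal sketch that checks a requested configuration against the two T4 figures quoted in the question (1 GPU: up to 48 vCPUs, 4 GPUs: up to 96 vCPUs). The table and the function names `is_allowed` and `min_t4_gpus_for` are made up for illustration and are not part of any Google Cloud API; other GPU counts are deliberately left out rather than guessed.

```python
from typing import Optional

# vCPU caps per T4 GPU count, taken only from the figures quoted in the question.
# Assumption: this table is illustrative and incomplete; check the official
# GPU documentation for the full list before relying on it.
MAX_VCPUS_BY_T4_COUNT = {1: 48, 4: 96}


def is_allowed(vcpus: int, t4_gpus: int) -> bool:
    """Return True if `vcpus` fits under the cap listed for `t4_gpus` T4 GPUs."""
    cap = MAX_VCPUS_BY_T4_COUNT.get(t4_gpus)
    if cap is None:
        raise ValueError(f"no cap listed for {t4_gpus} T4 GPU(s)")
    return 1 <= vcpus <= cap


def min_t4_gpus_for(vcpus: int) -> Optional[int]:
    """Return the smallest listed T4 count whose cap covers `vcpus`, else None."""
    for gpus in sorted(MAX_VCPUS_BY_T4_COUNT):
        if vcpus <= MAX_VCPUS_BY_T4_COUNT[gpus]:
            return gpus
    return None


if __name__ == "__main__":
    print(is_allowed(64, 1))    # the configuration asked about in the question
    print(min_t4_gpus_for(64))  # smallest listed GPU count that admits 64 vCPUs
```

With these figures, 64 vCPUs with a single T4 falls outside the cap, and the smallest listed configuration that admits 64 vCPUs is the 4-GPU one.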