2023-05-25 01:53:44

Real-ESRGAN performance on Nvidia Tesla T4

Question

I am running a Google Cloud Platform (GCP) Compute Engine instance with 2x NVIDIA Tesla T4 GPUs, running 2 separate image-upscaling threads, each upscaling by a factor of 4.

I have the following shell script for that:

python ~/Real-ESRGAN/inference_realesrgan.py -i ~/lowres/ -o ~/upscaled/ -dn 0 -t 512 -g 0

My upscaler is built based on this repository.

The performance is good, but I have a long queue of images waiting (they are being uploaded there dynamically).

Is there something that could improve the processing speed?

I added the -t 512 (tile size = 512) parameter because the Tesla T4 cannot upscale a whole 2048x2048 px image at once; it fails with an out-of-memory error.

Can I tweak my script somehow? I cannot stop the whole upscaler to run timing experiments, but in general, would lowering the tile size help? Or is there another parameter that could speed up the upscaling?

I have 2 GPUs, and they work on two different folders through 2 separate shell scripts. My original goal was to use both GPUs to upscale a single image, but that does not work, even though the help text for the arguments says it can be done.
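To make the two-folder, two-script setup concrete, here is a minimal Python sketch of a dispatcher that divides pending files between the GPUs and builds one command per GPU. It is only an illustration: the helper names `split_round_robin` and `build_command` are hypothetical, the folder arguments are placeholders, and the flags simply mirror the command shown above (with `-g` assumed to select the GPU).

```python
from pathlib import Path


def split_round_robin(paths, n_gpus=2):
    """Assign paths to GPU ids 0..n_gpus-1 in turn (hypothetical helper)."""
    buckets = {gpu: [] for gpu in range(n_gpus)}
    for index, path in enumerate(sorted(paths)):
        buckets[index % n_gpus].append(path)
    return buckets


def build_command(input_dir, output_dir, gpu_id, tile=512):
    """Build the argument list for one run, reusing the flags from the question."""
    script = Path("~/Real-ESRGAN/inference_realesrgan.py").expanduser()
    return [
        "python", str(script),
        "-i", str(input_dir),
        "-o", str(output_dir),
        "-dn", "0",
        "-t", str(tile),
        "-g", str(gpu_id),  # assumption: -g picks the GPU, as in the original command
    ]


if __name__ == "__main__":
    pending = ["a.png", "b.png", "c.png", "d.png", "e.png"]
    for gpu, files in split_round_robin(pending).items():
        print(gpu, files)
    print(build_command("~/lowres_gpu1/", "~/upscaled/", gpu_id=1))
```

Each command list could be passed to `subprocess.Popen` so that both GPUs work at the same time. Running one process per folder, rather than one per file, avoids reloading the model for every image.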

So my main question is whether increasing or decreasing the tile size would improve the speed, and my additional question is whether there is some other parameter that could speed up the processing.

Thanks for your answers in advance!

Answer 1

Score: 0

You might find this documentation on optimizing GPUs helpful: https://cloud.google.com/compute/docs/gpus/optimize-gpus. It states that, to optimize performance, you can use higher network bandwidth on VMs that use NVIDIA A100, T4, L4, or V100 GPUs. As for whether increasing or decreasing the tile size would improve the speed, that depends on the image size and the GPU memory. If you have enough GPU memory, you can increase the tile size to reduce the number of tiles and speed up the processing. Otherwise, you can decrease the tile size so that each tile fits into GPU memory and you avoid out-of-memory errors.
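The relationship between tile size and tile count can be worked out with simple arithmetic. The Python sketch below is an estimate, not the library's own code: the function names are hypothetical, and the padding model (each tile extended by `tile_pad` pixels on every side and clamped at the image border, with a value of 10) is an assumption about how tiled inference typically works.

```python
import math


def tile_count(width, height, tile):
    """Number of tiles needed to cover the image; tile <= 0 means no tiling."""
    if tile <= 0:
        return 1
    return math.ceil(width / tile) * math.ceil(height / tile)


def _padded_span(length, tile, pad):
    """Total padded extent along one axis, summed over all tiles on that axis."""
    total = 0
    for start in range(0, length, tile):
        end = min(start + tile, length)
        # Extend by the padding, but never past the image border.
        total += min(end + pad, length) - max(start - pad, 0)
    return total


def padded_pixels(width, height, tile, tile_pad=10):
    """Input pixels processed across all tiles, padding overlap included."""
    if tile <= 0:
        return width * height
    return _padded_span(width, tile, tile_pad) * _padded_span(height, tile, tile_pad)


if __name__ == "__main__":
    for tile in (256, 512, 1024):
        count = tile_count(2048, 2048, tile)
        overhead = padded_pixels(2048, 2048, tile) / (2048 * 2048) - 1
        print(f"tile={tile}: {count} tiles, {overhead:.1%} extra pixels from padding")
```

For a 2048x2048 image, going from a tile size of 512 to 1024 cuts the tile count from 16 to 4. Under this padding model the extra pixels are only a few percent of the image either way, so the largest tile that fits in GPU memory is the natural choice, and the actual speedup is best confirmed by timing a few images.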
