production - What is the best way to load a file for fast computation?
Question
I'm deploying a deep learning model and saved the Keras model as a .h5 file. I think a complex model will make the file large and hence slow down interaction with the server, but is there anything I can do other than reducing the number of layers in the model? Is there some way of compressing the .h5 file so that the server can load it faster?
Thank you.
Answer 1
Score: 1
There is a way to do that. What you are looking for is called quantization. Unlike reducing the number of layers, which amounts to model pruning, quantization reduces both the size and the latency of the model by lowering the precision of its weights (and, in some cases, its activations).
For more detailed information, read this page in the official TensorFlow documentation: https://www.tensorflow.org/lite/performance/post_training_quantization
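As a minimal sketch of this idea, post-training dynamic-range quantization can be applied while converting a Keras model to TensorFlow Lite. The snippet below builds a small throwaway model (an assumption, standing in for the asker's .h5 file so the example is self-contained) and compares the quantized output size against a plain float32 conversion:

```python
import tensorflow as tf

# Small throwaway model standing in for the real .h5 model,
# so the example is self-contained.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Baseline: plain float32 conversion to TensorFlow Lite.
baseline = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Dynamic-range quantization: weights are stored as 8-bit integers,
# which shrinks the serialized model and reduces load latency.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized = converter.convert()

print(f"float32: {len(baseline)} bytes, quantized: {len(quantized)} bytes")
```

A real model would instead be loaded with `tf.keras.models.load_model("model.h5")` before conversion. Full integer quantization, which also quantizes activations but requires a representative dataset, is covered on the linked documentation page.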