In deep learning, can prediction speed increase as the batch size decreases?

Question

We are developing a prediction model using deepchem's GCNModel.

Training and performance validation went smoothly, but we found that prediction takes a lot of time.

We are trying to run predictions on a total of 1 million data points, with the following parameters:

model = GCNModel(n_tasks=1, mode='regression', number_atom_features=32, learning_rate=0.0001, dropout=0.2, batch_size=32, device=device, model_dir=model_path)

I changed the batch size to improve performance and found that prediction was faster with a smaller batch size than with a larger one.

GPU memory usage was the same in every case.

My understanding is that a larger batch size should generally be faster. Can you tell me why the opposite happens here?

We would also be grateful for any suggestions on how to further reduce the prediction time.

Answer 1

Score: 1

Let's clarify some definitions first.

Epoch
The number of complete passes your model and learning algorithm make through the dataset.

BatchSize
The number of samples (individual rows of your training data) processed before the internal model is updated.

So your batch size is somewhere between 1 and len(training_data).
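
To make the two definitions concrete, here is a minimal sketch in plain Python. The helper names iter_batches and updates_per_epoch are hypothetical and only illustrate the arithmetic; they are not part of deepchem or any other library.

```python
import math

def iter_batches(samples, batch_size):
    """Yield consecutive slices holding at most batch_size samples each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

def updates_per_epoch(n_samples, batch_size):
    """Number of batches (model updates) in one complete pass over the data."""
    return math.ceil(n_samples / batch_size)

# Toy stand-in for a dataset: 10 samples split into batches of 4.
training_data = list(range(10))
batch_lengths = [len(batch) for batch in iter_batches(training_data, 4)]
print(batch_lengths)                                 # [4, 4, 2]
print(updates_per_epoch(len(training_data), 4))      # 3
```

With the numbers from the question (1 million samples, batch_size=32), the same arithmetic gives 31250 batches per pass.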

Generally, a larger batch size gives better accuracy on the training data.

Epoch ↑ Batch Size ↑ Accuracy ↑ Speed ↓

So the short answer to the question is that a larger batch size takes more memory and more processing, and therefore takes longer to learn.

More details are available here: https://stats.stackexchange.com/questions/153531/what-is-batch-size-in-neural-network

Answer 2

Score: 0

Two things determine the speed:

  • Your batch size and model size
  • Your CPU/GPU power for generating and processing batches

These two need to be balanced. For example, if your model finishes predicting the current batch but the next batch has not been generated yet, you will see GPU utilization drop for a brief moment. Unfortunately, no built-in metric reports this balance directly; try using time.time() to benchmark both your model's prediction and the data loader's speed.
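
One way to take that measurement is sketched below. It is only an illustration: prepare and predict are hypothetical stand-ins for your own batch-generation and model-prediction calls, and the toy lambdas exist only so the snippet runs on its own. It uses time.perf_counter() instead of time.time() because it is monotonic and has higher resolution; either works for a rough comparison.

```python
import time

def time_stages(raw_batches, prepare, predict):
    """Return total seconds spent preparing batches and predicting on them."""
    prepare_seconds = 0.0
    predict_seconds = 0.0
    for raw in raw_batches:
        t0 = time.perf_counter()
        batch = prepare(raw)      # e.g. featurizing / collating one batch
        t1 = time.perf_counter()
        predict(batch)            # e.g. one forward pass of the model
        t2 = time.perf_counter()
        prepare_seconds += t1 - t0
        predict_seconds += t2 - t1
    return prepare_seconds, predict_seconds

# Toy stand-ins (assumptions, not real model code).
raw_batches = [list(range(1000)) for _ in range(20)]
prepare_s, predict_s = time_stages(
    raw_batches,
    prepare=lambda raw: [x * 2 for x in raw],
    predict=lambda batch: sum(batch),
)
print(f"prepare: {prepare_s:.6f} s, predict: {predict_s:.6f} s")
```

If the preparation time dominates, the GPU is probably waiting on the CPU, and changing the batch size mostly changes how that waiting is distributed. On a GPU, the predict timing is only meaningful if the call blocks until the result is actually available on the host.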

However, I don't think that is worth the effort, so you can simply keep decreasing the batch size until there is no further improvement, and stop there.
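
The "keep decreasing until it stops helping" procedure could be sketched roughly as follows. All names here (timed, pick_batch_size, min_gain) are hypothetical, and the 5% improvement threshold is an arbitrary assumption. The measure callback is kept separate from the search so that the example can run on made-up timings.

```python
import time

def timed(run, batch_size):
    """Wall-clock seconds taken by run(batch_size)."""
    start = time.perf_counter()
    run(batch_size)
    return time.perf_counter() - start

def pick_batch_size(measure, candidates, min_gain=0.05):
    """Walk candidates in order and stop at the first one that fails to
    improve on the best time so far by at least the min_gain fraction."""
    best_size, best_time = None, float("inf")
    for size in candidates:
        elapsed = measure(size)
        if elapsed >= best_time * (1.0 - min_gain):
            break  # no meaningful improvement: stop here
        best_size, best_time = size, elapsed
    return best_size, best_time

# Made-up timings in seconds, used only to show the stopping rule.
fake_timings = {128: 10.0, 64: 8.0, 32: 7.0, 16: 6.9, 8: 5.0}
best = pick_batch_size(fake_timings.__getitem__, [128, 64, 32, 16, 8])
print(best)  # (32, 7.0)
```

For a real measurement you would pass something like lambda size: timed(run_prediction, size), where run_prediction is your own function that runs prediction on a fixed sample of the data with the given batch size. In the question's setup the batch size is passed to the GCNModel constructor, so that function would presumably need to rebuild or reload the model for each candidate. Note that the search stops at the first candidate without a gain, so it never tries the later ones (batch size 8 in the made-up data).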

huangapple
  • Posted on February 8, 2023, 11:21:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75381096.html