ONNX performance compared to sklearn

Question
I have converted a sklearn logistic regression model object to an ONNX model object and noticed that ONNX scoring takes significantly longer than the sklearn.predict() method. I feel like I must be doing something wrong, because ONNX is billed as an optimized prediction solution. I noticed that the difference is more pronounced on larger data sets, so I created X_large_dataset as a proxy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import datetime
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import numpy as np
import onnxruntime as rt

# create training data
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

# fit a logistic regression model
clr = LogisticRegression()
clr.fit(X_train, y_train)

# convert to ONNX format
initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type)
with open("logreg_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())

# create an inference session from the ONNX model
sess = rt.InferenceSession(
    "logreg_iris.onnx", providers=rt.get_available_providers())
input_name = sess.get_inputs()[0].name

# create a larger dataset as a proxy for large batch processing
X_large_dataset = np.array([[1, 2, 3, 4]] * 10_000_000)
start = datetime.datetime.now()
pred_onx = sess.run(None, {input_name: X_large_dataset.astype(np.float32)})[0]
end = datetime.datetime.now()
print("onnx scoring time:", end - start)

# compare to scoring directly with the model object
start = datetime.datetime.now()
pred_sk = clr.predict(X_large_dataset)
end = datetime.datetime.now()
print("sklearn scoring time:", end - start)
On my machine, this snippet shows sklearn predict finishing in less than a second while ONNX takes 18 seconds.
Answer 1
Score: 2
Simply converting a model to ONNX does not mean that it will automatically perform better. During conversion, ONNX tries to optimize the computational graph, for example by removing calculations which do not contribute to the output, or by fusing separate layers into a single operator. For a generic neural network consisting of convolution, normalization, and nonlinearity layers, these optimizations often result in higher throughput and better performance.
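As a side note, onnxruntime exposes these graph optimizations through its session options and can dump the optimized graph for inspection. A minimal sketch using the standard onnxruntime Python API (the output file name is a placeholder):

import onnxruntime as rt

# ask onnxruntime for all graph optimizations (this level is the default)
sess_options = rt.SessionOptions()
sess_options.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_ALL
# write the optimized graph to disk so fused or eliminated nodes can be inspected
sess_options.optimized_model_filepath = "logreg_iris_optimized.onnx"

sess = rt.InferenceSession("logreg_iris.onnx", sess_options,
                           providers=rt.get_available_providers())

Opening the dumped file shows exactly which nodes survived optimization.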
So considering you are exporting just LogisticRegression, most likely both sklearn and the corresponding onnx implementations are already very optimized, and the conversion will not lead to any performance gain.

As to why InferenceSession.run is 20x slower than sklearn.predict:

- X_large_dataset is an np.int64 array over 300 MB in size. Casting it with astype when creating the input dictionary inside of run creates a new 150 MB array to which everything is copied. This obviously shouldn't be counted towards the model execution time.
- onnxruntime has quite a bit of memory management overhead when executing models with dynamic inputs for the first time. Subsequent calls to run with inputs of the same shape should finish a lot faster. A sketch of a timing that accounts for both points follows this list.
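Here is a minimal sketch of that fairer measurement, reusing sess, input_name, and X_large_dataset from the question; the idea is simply to hoist the cast out of the timed region and to time a second run call:

import datetime
import numpy as np

# cast once, outside the timed region, so the 150 MB float32 copy is not measured
X_float = X_large_dataset.astype(np.float32)

# the first call pays onnxruntime's one-time memory-management cost for this shape
sess.run(None, {input_name: X_float})

# time a subsequent call with an input of the same shape
start = datetime.datetime.now()
pred_onx = sess.run(None, {input_name: X_float})[0]
end = datetime.datetime.now()
print("onnx scoring time (pre-cast, warmed up):", end - start)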
Answer 2
Score: 1
onnxruntime is faster in most cases. Two explanations for this particular case:

- you need to remove the zipmap operator; it is useless and takes time for no reason (see http://onnx.ai/sklearn-onnx/auto_tutorial/plot_dbegin_options_zipmap.html)
- in your case, sklearn does not return the probabilities: it uses the raw scores to return the label and does not need to compute the logit function, whereas onnxruntime always computes the probabilities. You should compare against predict_proba (see also http://onnx.ai/sklearn-onnx/auto_tutorial/plot_dbegin_options.html). A sketch combining both suggestions follows this list.
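A minimal sketch of both suggestions combined, reusing clr and X_large_dataset from the question; the zipmap option follows the skl2onnx tutorials linked above:

import datetime
import numpy as np
import onnxruntime as rt
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# convert again, this time dropping the ZipMap operator so the
# probability output is a plain tensor instead of a list of dicts
initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type,
                      options={id(clr): {'zipmap': False}})
sess = rt.InferenceSession(onx.SerializeToString(),
                           providers=rt.get_available_providers())
input_name = sess.get_inputs()[0].name

X_float = X_large_dataset.astype(np.float32)
start = datetime.datetime.now()
pred_onx = sess.run(None, {input_name: X_float})
end = datetime.datetime.now()
print("onnx scoring time (no zipmap):", end - start)

# compare against the sklearn call that also computes probabilities
start = datetime.datetime.now()
pred_sk = clr.predict_proba(X_large_dataset)
end = datetime.datetime.now()
print("sklearn predict_proba time:", end - start)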