Finding the optimal combination of inputs which return maximal output for a black box model
Question
One of the challenges I have been facing when applying ANNs to regression tasks in my job is that, in order to find the optimal output for a given range of inputs, I have to feed a multidimensional meshgrid to my model and then simply pick the highest value. However, this is overall a very computationally costly solution. The length of the text below might look scary, but it is just my attempt to explain the problem better.
Let me explain in other words. Suppose I have 9 inputs to my ANN, and I want to check which combination of feature values returns the highest outcome. I am currently getting around the problem by creating a 9-D mesh, predicting the value for every sample, and then identifying the optimal row. Nevertheless, this takes an exhaustive amount of time. Therefore, I am looking for a way to reach this optimal output value more efficiently, if that is possible at all.
In code, it would look something like this (just a simple, made-up example that is not really realistic):
import numpy as np
import pandas as pd
from itertools import product
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# Build a small 2-D training grid and a synthetic target to learn
x1 = np.linspace(0, 20, 6)
x2 = np.linspace(0, 20, 6)
X = pd.DataFrame(product(*[x1, x2]))
y1 = 5 * np.cos(np.deg2rad(X[0]))
y2 = 5 - 1 * np.exp((-X[0]**2 / np.deg2rad(10)**2) * np.cos(np.deg2rad(X[1])))
y = np.array([y1 + y2]).T
Set up a black-box model, in this case a neural network:
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
X_scaled = x_scaler.fit_transform(X)
y_scaled = y_scaler.fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_scaled, test_size=0.2, random_state=0)
model = Sequential()
model.add(Dense(100, input_dim=2, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='relu'))
model.compile(optimizer='Adam', loss='mean_squared_error')
epochs_hist = model.fit(X_train, y_train, epochs=100, batch_size=50, validation_split=0.2)
Now that I have fitted my model, I use a meshgrid over several intervals in order to find the optimum in the specified range:
x1_pred = np.linspace(0, 20, 21)
x2_pred = np.linspace(0, 20, 21)
X_pred = pd.DataFrame(product(*[x1_pred, x2_pred]))
X_pred_test = x_scaler.transform(X_pred)  # reuse the scaler fitted on the training data
y_pred = model.predict(X_pred_test)
y_pred = y_scaler.inverse_transform(y_pred)
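The "pick the highest value" step described above is not shown in the snippet; a minimal sketch of it (assuming the `X_pred` and `y_pred` objects from the code above) could look like this:

best_idx = int(np.argmax(y_pred))        # row with the highest predicted output
best_inputs = X_pred.iloc[best_idx]      # the corresponding (x1, x2) combination
best_output = y_pred[best_idx, 0]
print(best_inputs.values, best_output)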
So, suppose I do something similar to reach the optimum, but this time with 9 inputs: with 21 points per dimension the grid already has 21^9 ≈ 7.9 × 10^11 rows, so it is clear how computationally unfeasible that calculation becomes. Hence my question of how to find the optimal combination of inputs which returns the maximal output of a black-box model such as an ANN.
Answer 1
Score: 2
Here's an example of how you could get the 'best result' from a model. The important parts are `optimize`, `_get_simplex` and `_call_model`. By doing it this way you reduce the number of calls necessary to your model.
from sklearn.ensemble import GradientBoostingRegressor
import numpy as np
from scipy.optimize import minimize
from copy import copy
class Example:
    def __init__(self):
        # 10000 random samples with 9 features each, used to train the surrogate model
        self.X = np.random.random((10000, 9))
        self.y = self.get_y()
        self.clf = GradientBoostingRegressor()
        self.fit()

    def get_y(self):
        # sum of squares, is minimum at x = [0, 0, 0, 0, 0 ... ]
        return np.array([[self._func(i)] for i in self.X])

    def _func(self, i):
        return sum(i * i)

    def fit(self):
        self.clf.fit(self.X, self.y)

    def optimize(self):
        # start in the middle of the [0, 1] range and search with Nelder-Mead,
        # which only needs function evaluations (i.e. model predictions)
        x0 = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
        initial_simplex = self._get_simplex(x0, 0.1)
        result = minimize(fun=self._call_model,
                          x0=np.array(x0),
                          method='Nelder-Mead',
                          options={'xatol': 0.1,
                                   'initial_simplex': np.array(initial_simplex)})
        return result

    def _get_simplex(self, x0, step):
        # build the n + 1 starting points of the simplex around x0
        simplex = []
        for i in range(len(x0)):
            point = copy(x0)
            point[i] -= step
            simplex.append(point)
        point2 = copy(x0)
        point2[-1] += step
        simplex.append(point2)
        return simplex

    def _call_model(self, x):
        # one evaluation of the objective = one call to the fitted model
        prediction = self.clf.predict([x])
        return prediction[0]

example = Example()
result = example.optimize()
print(result)
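The returned `result` is a standard `scipy.optimize.OptimizeResult`, so `result.x` holds the input combination found and `result.fun` the corresponding model prediction; the `xatol` tolerance and the initial simplex step control the trade-off between the number of model calls and how precisely the optimum is located.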
Of course, if you want to maximize instead of minimize, you can return `-prediction[0]` instead of `prediction[0]` to trick scipy.
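Applied to the Keras network from the question, the same idea might look roughly like this (a sketch on my part, assuming the `model`, `x_scaler` and `y_scaler` objects defined above are still in scope; the starting point is arbitrary, and passing `bounds` to Nelder-Mead requires a reasonably recent SciPy):

import numpy as np
from scipy.optimize import minimize

def negative_prediction(x):
    # scale the candidate point, predict, and negate so that minimizing
    # this function maximizes the network's output
    x_scaled = x_scaler.transform(np.asarray(x).reshape(1, -1))
    return -model.predict(x_scaled, verbose=0)[0, 0]

x0 = np.array([10.0, 10.0])  # arbitrary starting guess inside the input range
result = minimize(negative_prediction, x0, method='Nelder-Mead',
                  bounds=[(0, 20), (0, 20)], options={'xatol': 0.1})

best_inputs = result.x
best_output = y_scaler.inverse_transform([[-result.fun]])[0, 0]
print(best_inputs, best_output)

This replaces the exhaustive grid evaluation with a number of model calls that grows far more slowly with the number of inputs, at the cost of possibly converging to a local rather than the global optimum; restarting from several different starting points is a common way to mitigate that.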