Informer: loss is always NaN
I am trying to use the Informer model to make predictions on my own dataset. When I swap the training data for my dataset, the program runs, but the loss is always NaN and no predicted values are produced after training.
I printed train_loss, vali_loss and test_loss. All of them are NaN.
Epoch: 1, Steps: 739 | Train Loss: nan Vali Loss: nan Test Loss: nan
I inspected test_loss in the debugger:
test_loss = {float32} nan
time_now = {float} 1678378741.1124253
strides = {tuple: 0} ()
size = {int} 1
shape = {tuple: 0} ()
ndim = {int} 0
real = {float32} nan
nbytes = {int} 4
itemsize = {int} 4
imag = {float32} 0.0
flat = {flatiter: 1} <numpy.flatiter object at 0x0000021FA9C06040>
flags = {flagsobj} C_CONTIGUOUS: True F_CONTIGUOUS: True OWNDATA: True WRITEABLE: False ALIGNED: True WRITEBACKIFCOPY: False
dtype = {dtype[float32]: 0} float32
data = {memoryview: 1} <memory at 0x0000021ECC6AFB80>
base = {NoneType} None
T = {float32} nan
As you can see, the value is NaN, and the MSE and MAE reported at the end of the run are NaN as well.
This is my loss calculation code; 'pred' is all NaN:
epoch_time = time.time()
for i, (batch_x, batch_y, batch_x_mark, batch_y_mark) in enumerate(train_loader):
    iter_count += 1
    model_optim.zero_grad()
    pred, true = self._process_one_batch(
        train_data, batch_x, batch_y, batch_x_mark, batch_y_mark)
    loss = criterion(pred, true)
    train_loss.append(loss.item())
    ...
print("Epoch: {} cost time: {}".format(epoch+1, time.time()-epoch_time))
train_loss = np.average(train_loss)
vali_loss = self.vali(vali_data, vali_loader, criterion)
test_loss = self.vali(test_data, test_loader, criterion)
What confuses me is that with the dataset originally provided with the model, the program works fine: all values are normal, there is no NaN, and it produces predictions.
Here are the columns of the original dataset:
date Visibility DryBulbFarenheit DryBulbCelsius WetBulbFarenheit DewPointFarenheit DewPointCelsius DewPointCelsius RelativeHumidity WindSpeed WindDirection StationPressure Altimeter WetBulbCelsius(target)
And here are the columns of my dataset:
date hight wind_speed wind_direction temperature humidity atmospheric_pressure(target)
I would like to know why the original dataset runs without any problem while my own dataset does not. Where is the problem? Why is my loss always NaN, and why can't the model predict anything?
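A quick way to narrow down a problem like this is to scan the input columns before training. The following is only a sketch: the helper name find_suspect_columns and the toy values are hypothetical, and only the column names are taken from the dataset above. It flags columns that contain NaN or inf, or that hold a single repeated value, since each of these can turn into NaN once the data is scaled.

```python
import numpy as np


def find_suspect_columns(values, names):
    """Map column name -> list of problems that can produce NaN after scaling.

    `values` is a 2-D numeric array (rows x columns); `names` labels the columns.
    This helper is hypothetical and not part of the Informer code base.
    """
    values = np.asarray(values, dtype=float)
    report = {}
    for j, name in enumerate(names):
        col = values[:, j]
        problems = []
        if np.isnan(col).any():
            problems.append("contains NaN")
        if np.isinf(col).any():
            problems.append("contains inf")
        finite = col[np.isfinite(col)]
        # A column whose finite values are all identical has zero spread,
        # so any scaler that divides by a spread measure divides by zero.
        if finite.size > 0 and finite.max() == finite.min():
            problems.append("constant")
        if problems:
            report[name] = problems
    return report


# Toy data: the values are made up for illustration.
names = ["hight", "wind_speed", "atmospheric_pressure"]
values = [
    [10.0, 1.2, 1012.0],
    [10.0, np.nan, 1011.5],
    [10.0, 2.2, 1013.2],
]
report = find_suspect_columns(values, names)
print(report)
```

Any column that shows up in the report is worth cleaning or dropping before the data reaches the model.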
Answer 1
Score: 0
One column in my dataset holds exactly the same value in every row. During normalization I subtract the mean from the maximum and use the result as the denominator; for a constant column that denominator is zero, so every scaled value in that column becomes NaN, and the NaN carries through to the predictions and the loss.
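The effect can be reproduced in a few lines. This is a minimal sketch, assuming the normalization is (x - mean) / (max - mean) as described in this answer; the function names naive_scale and safe_scale, the eps threshold and the toy data are all hypothetical. The same division by zero occurs with a standard-deviation denominator, because a constant column has zero standard deviation.

```python
import numpy as np


def naive_scale(data):
    """Scale each column by (x - mean) / (max - mean), with no zero guard."""
    mean = data.mean(axis=0)
    denom = data.max(axis=0) - mean
    # Silence the runtime warnings so the NaN result can be inspected.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (data - mean) / denom


def safe_scale(data, eps=1e-8):
    """Same scaling, but a (near-)zero denominator is replaced by 1.0."""
    mean = data.mean(axis=0)
    denom = data.max(axis=0) - mean
    denom = np.where(np.abs(denom) < eps, 1.0, denom)
    return (data - mean) / denom


# Column 0 is constant, column 1 varies (made-up values).
data = np.array([
    [5.0, 1.0],
    [5.0, 2.0],
    [5.0, 3.0],
])

naive = naive_scale(data)   # column 0 is 0 / 0, which is NaN in every row
safe = safe_scale(data)     # column 0 becomes 0.0 in every row

print(np.isnan(naive).any(axis=0))
print(safe)
```

Replacing the zero denominator keeps the pipeline running, but a constant column carries no information for the model, so dropping it from the dataset is arguably the cleaner fix.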