How to use my pretrained LSTM saved model to make new classifications
Question
I have a simple pretrained LSTM model built with Keras and TensorFlow. I compiled, trained, and fitted it, and a test prediction on a simple sentence works. I then saved the model with model.save('sentanalysis.h5') and everything was OK. After that I loaded the model with load_model(), which runs without error, but when I call model.predict() I get an array of floats that doesn't show anything related to the classes.
How can I use my pretrained model to make new classifications?
The dataset I used to train it is very simple: a CSV with text and sentiment columns, nothing else.
Can you help me?
This is the code of the model:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import nlp
import random
from keras.preprocessing.text import Tokenizer
from keras_preprocessing.sequence import pad_sequences
dataset = nlp.load_dataset('csv', data_files={
    'train': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_train_final.csv',
    'test': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_test_final.csv',
    'validation': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_val_final.csv'})
train = dataset['train']
val = dataset['validation']
test = dataset['test']
def get_tweet(data):
    tweets = [x['Text'] for x in data]
    labels = [x['behavior'] for x in data]
    return tweets, labels
tweets, labels = get_tweet(train)
tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
tokenizer.fit_on_texts(tweets)
maxlen = 140
def get_sequences(tokenizer, tweets):
    sequences = tokenizer.texts_to_sequences(tweets)
    padded = pad_sequences(sequences, truncating='post', padding='post', maxlen=maxlen)
    return padded
padded_train_seq = get_sequences(tokenizer, tweets)
classes = set(labels)
class_to_index = dict((c, i) for i, c in enumerate(classes))
index_to_class = dict((v, k) for k, v in class_to_index.items())
names_to_ids = lambda labels: np.array([class_to_index.get(x) for x in labels])
train_labels = names_to_ids(labels)
model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(10000, 16, input_length=maxlen),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20)),
    tf.keras.layers.Dense(6, activation='softmax')
])
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)
val_tweets, val_labels = get_tweet(val)
val_seq = get_sequences(tokenizer, val_tweets)
val_labels= names_to_ids(val_labels)
h = model.fit(
    padded_train_seq, train_labels,
    validation_data=(val_seq, val_labels),
    epochs=8
)
test_tweets, test_labels=get_tweet(test)
test_seq = get_sequences(tokenizer, test_tweets)
test_labels=names_to_ids(test_labels)
model.evaluate(test_seq, test_labels)
# This prediction works when run in the same session as the code above
sentence = 'I am very happy now'
sequence = tokenizer.texts_to_sequences([sentence])
paddedSequence = pad_sequences(sequence, truncating='post', padding='post', maxlen=maxlen)
p = model.predict(np.expand_dims(paddedSequence[0], axis=0))[0]
pred_class=index_to_class[np.argmax(p).astype('uint8')]
print('Sentence: ', sentence)
print('Sentiment: ', pred_class)
And this is how I save and load my model without loading the previous code:
model.save('/content/drive/MyDrive/Proyect/BehaviorClassifier/twitterBehaviorClassifier.h5')
model = tf.keras.models.load_model('/content/drive/MyDrive/Proyect/BehaviorClassifier/twitterBehaviorClassifier.h5')
#### ISSUE HERE
new = ["I am very happy"]
tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
tokenizer.fit_on_texts(new)
seq = tokenizer.texts_to_sequences(new)
padded = pad_sequences(seq, maxlen=140)
pred = model.predict(padded)
And I get this:
1/1 [==============================] - 0s 29ms/step
[[7.0648360e-01 1.1568426e-01 1.7581969e-01 7.2872970e-04 4.2903548e-04
8.5460022e-04]]
I've been reading the documentation, but nothing has helped me.
Answer 1
Score: 0
From your model code, the final layer is:
tf.keras.layers.Dense(6, activation='softmax')
so presumably you have 6 different sentiment classes. The output you are seeing from model.predict() contains the probabilities that the input belongs to each class: about 70.6% for class 0, 11.6% for class 1, 17.6% for class 2, and so on.
These results are typically post-processed by taking the class with the largest probability as the prediction, using np.argmax(pred), which for the output you posted gives 0. In other words, your model believes your tweet is 70.6% likely to belong to class 0.
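The post-processing step could be sketched as follows. The class names in `index_to_class` are placeholders (the question does not list its six labels) and the JSON file name is hypothetical. The mapping is written to disk because the question builds it from a `set`, whose iteration order is not guaranteed to be the same in a new Python session, so it is safer to store it next to the model than to rebuild it.

```python
import json
import os
import tempfile

import numpy as np

# Hypothetical label names: the question does not list its six classes.
index_to_class = {0: "class_a", 1: "class_b", 2: "class_c",
                  3: "class_d", 4: "class_e", 5: "class_f"}

# At training time: save the mapping (the file name is an assumption).
mapping_path = os.path.join(tempfile.gettempdir(), "index_to_class.json")
with open(mapping_path, "w", encoding="utf-8") as f:
    json.dump(index_to_class, f)

# In the prediction session: reload it. JSON keys come back as strings,
# so convert them to int again.
with open(mapping_path, encoding="utf-8") as f:
    loaded_mapping = {int(k): v for k, v in json.load(f).items()}

# Output of model.predict(padded) copied from the question, shape (1, 6).
pred = np.array([[7.0648360e-01, 1.1568426e-01, 1.7581969e-01,
                  7.2872970e-04, 4.2903548e-04, 8.5460022e-04]])

# argmax over the last axis gives one class index per input row.
pred_indices = np.argmax(pred, axis=-1)
pred_labels = [loaded_mapping[int(i)] for i in pred_indices]
confidences = pred.max(axis=-1)

for label, conf in zip(pred_labels, confidences):
    print(f"{label}: {conf:.1%}")
```

Passing `axis=-1` keeps this working when several sentences are predicted at once. Separately, note that the loading snippet in the question fits a new `Tokenizer` on the new sentence and pads with the default `padding='pre'`, so the word indices and padding differ from what the model saw during training. The tokenizer fitted on the training tweets would need to be saved and reused in the same way as the mapping.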