How to use my pretrained LSTM saved model to make new classifications

Question

I have a simple pretrained LSTM model built with Keras and TensorFlow. I compiled it, trained it, and made a test prediction with a simple sentence, and it works. Then I saved the model using model.save('sentanalysis.h5'), and everything was OK. I then loaded the model with load_model(), and it loads without error, but when I call model.predict() I get an array of floats that doesn't show anything related to the classes.

How can I use my pretrained model to make new classifications?
The dataset I used to train it is very simple: a CSV with text and sentiment columns, nothing else.
Can you help me?
This is the code of the model:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import nlp
import random
from keras.preprocessing.text import Tokenizer
from keras_preprocessing.sequence import pad_sequences

dataset = nlp.load_dataset('csv', data_files={'train':'/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_train_final.csv',
                                              'test': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_test_final.csv',
                                              'validation': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_val_final.csv'})
train = dataset['train']
val = dataset['validation']
test = dataset['test']

def get_tweet(data):
    tweets = [x['Text'] for x in data]
    labels = [x['behavior'] for x in data]
    return tweets, labels

tweets, labels = get_tweet(train)

tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
tokenizer.fit_on_texts(tweets)

maxlen = 140

def get_sequences(tokenizer, tweets):
    sequences = tokenizer.texts_to_sequences(tweets)
    padded = pad_sequences(sequences, truncating='post', padding='post', maxlen=maxlen)
    return padded

padded_train_seq = get_sequences(tokenizer, tweets)

classes = set(labels)
class_to_index = dict((c, i) for i, c in enumerate(classes))
index_to_class = dict((v, k) for k, v in class_to_index.items())
names_to_ids = lambda labels: np.array([class_to_index.get(x) for x in labels])
train_labels = names_to_ids(labels)

model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(10000, 16, input_length=maxlen),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20)),
    tf.keras.layers.Dense(6, activation='softmax')
])
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

val_tweets, val_labels = get_tweet(val)
val_seq = get_sequences(tokenizer, val_tweets)
val_labels= names_to_ids(val_labels)
h = model.fit(
     padded_train_seq, train_labels,
     validation_data=(val_seq, val_labels),
     epochs=8
)

test_tweets, test_labels=get_tweet(test)
test_seq = get_sequences(tokenizer, test_tweets)
test_labels=names_to_ids(test_labels)
model.evaluate(test_seq, test_labels)

# This prediction works when run in the same session as the code above
sentence = 'I am very happy now'
sequence = tokenizer.texts_to_sequences([sentence])
paddedSequence = pad_sequences(sequence, truncating='post', padding='post', maxlen=maxlen)
p = model.predict(np.expand_dims(paddedSequence[0], axis=0))[0]
pred_class=index_to_class[np.argmax(p).astype('uint8')]
print('Sentence: ', sentence)
print('Sentiment: ', pred_class)

And this is how I save and load my model without loading the previous code:

model.save('/content/drive/MyDrive/Proyect/BehaviorClassifier/twitterBehaviorClassifier.h5')
model = keras.models.load_model('/content/drive/MyDrive/Proyect/BehaviorClassifier/twitterBehaviorClassifier.h5')

#### ISSUE HERE
new = ["I am very happy"]
tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
tokenizer.fit_on_texts(new)
seq = tokenizer.texts_to_sequences(new)
padded = pad_sequences(seq, maxlen=140)
pred = model.predict(padded)

And I get this:

1/1 [==============================] - 0s 29ms/step
[[7.0648360e-01 1.1568426e-01 1.7581969e-01 7.2872970e-04 4.2903548e-04
  8.5460022e-04]]

I've been reading the docs, but nothing has helped me.

Answer 1

Score: 0

So, from your model code, you have the following:

tf.keras.layers.Dense(6, activation='softmax')

Presumably, you have 6 different sentiment classes. The output you are seeing from model.predict() is the set of probabilities that the input belongs to each class, i.e. a 70.6% chance that the sentiment is class 0, 11.6% that it is class 1, 17.6% that it is class 2, etc.
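To make the "probabilities" reading concrete, here is a small sketch that takes the exact array posted in the question, checks that the row sums to roughly 1 (as a softmax output should), and prints each entry as a percentage. Only NumPy is assumed; the variable names are illustrative.

```python
import numpy as np

# Probabilities copied from the model.predict() output in the question.
pred = np.array([[7.0648360e-01, 1.1568426e-01, 1.7581969e-01,
                  7.2872970e-04, 4.2903548e-04, 8.5460022e-04]])

probs = pred[0]              # one row per input sentence; take the first
total = float(probs.sum())   # a softmax row should sum to about 1.0

# Format each class probability as a percentage with one decimal place.
percentages = [f"{p:.1%}" for p in probs]

print(f"sum of probabilities: {total:.4f}")
for i, pct in enumerate(percentages):
    print(f"class {i}: {pct}")
```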

What is typically done to postprocess these results is to take the class with the largest probability as the prediction, using np.argmax(pred). In the case you posted, that gives 0, which can be interpreted as your model believing your tweet is 70.6% likely to belong to class zero.
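As a minimal sketch of that post-processing step, the snippet below applies np.argmax to the probabilities posted in the question and maps the winning index back to a label. The label names in `index_to_class` are hypothetical placeholders; the real mapping is whatever `index_to_class` held at training time. Since the question builds it from a set, whose iteration order is not guaranteed to be the same in a new session, it would probably need to be saved next to the model (JSON is used here as one option), because the `.h5` file does not contain it.

```python
import json
import numpy as np

# Probabilities copied from the model.predict() output in the question.
pred = np.array([[7.0648360e-01, 1.1568426e-01, 1.7581969e-01,
                  7.2872970e-04, 4.2903548e-04, 8.5460022e-04]])

# Hypothetical label names: replace with the mapping built during training.
index_to_class = {0: "class_a", 1: "class_b", 2: "class_c",
                  3: "class_d", 4: "class_e", 5: "class_f"}

# Round-trip through JSON, as you would when saving the mapping to disk.
# JSON object keys are always strings, so convert them back to int on load.
saved = json.dumps(index_to_class)
restored = {int(k): v for k, v in json.loads(saved).items()}

best = int(np.argmax(pred, axis=1)[0])   # index of the largest probability
confidence = float(pred[0, best])        # probability of that class

print(f"Predicted: {restored[best]} ({confidence:.1%})")
```

Note also that the loading snippet in the question fits a brand-new Tokenizer on the new sentence, so its word indexes will not match the ones the model was trained with, and it calls pad_sequences with the default 'pre' padding rather than padding='post'. The predicted class is only meaningful if the tokenizer fitted on the training tweets and the same pad_sequences arguments are reused.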

huangapple
  • Posted on February 24, 2023, 11:16:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75552310.html