Set validation data in SpaCy NER training
Question
Is it possible to train spaCy NER with validation data?
Or to split off part of the data as a validation set, as Keras does with validation_split in model.fit? Thanks
    import random
    from spacy.util import minibatch, compounding
    from tqdm import tqdm

    # nlp, other_pipes, optimizer, n_iter and train_data_list are defined earlier
    with nlp.disable_pipes(*other_pipes):  # only train NER
        for itn in tqdm(range(n_iter)):
            random.shuffle(train_data_list)
            losses = {}
            # batch up the examples using spaCy's minibatch
            batches = minibatch(train_data_list, size=compounding(8., 64., 1.001))
            for batch in batches:
                texts, annotations = zip(*batch)
                nlp.update(texts, annotations, sgd=optimizer, drop=0.35,
                           losses=losses)
Answer 1
Score: 2
Use the spacy train CLI instead of the demo script:
spacy train lang /path/to/output train.json dev.json
The validation data is used to choose the best model from the training iterations and optionally for early stopping.
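The selection logic described here can be illustrated without spaCy. This is only a sketch, assuming you already have one dev-set score (for example an NER F-score) per training iteration; the function name pick_best_iteration and the patience parameter are hypothetical and are not part of spaCy's API or CLI.

```python
def pick_best_iteration(dev_scores, patience=None):
    """Return (best_index, best_score) over per-iteration dev scores.

    Hypothetical helper, not a spaCy function. If `patience` is given,
    scanning stops once the score has failed to improve for `patience`
    consecutive iterations, which mimics early stopping.
    """
    if not dev_scores:
        raise ValueError("dev_scores must not be empty")
    best_index, best_score = 0, dev_scores[0]
    since_improvement = 0
    for i, score in enumerate(dev_scores[1:], start=1):
        if score > best_score:
            # New best model: remember it and reset the counter.
            best_index, best_score = i, score
            since_improvement = 0
        else:
            since_improvement += 1
            if patience is not None and since_improvement >= patience:
                break  # early stopping
    return best_index, best_score


# Illustrative scores only, not real training results.
scores = [0.50, 0.60, 0.58, 0.59, 0.70]
best_overall = pick_best_iteration(scores)
best_with_patience = pick_best_iteration(scores, patience=2)
```

Without a patience limit every iteration is considered; with patience=2 the scan stops after two iterations without improvement, so a later, better model may be missed. That trade-off is why early stopping is optional.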
The main task is converting your data to spaCy's JSON training format, see: https://stackoverflow.com/a/59209377/461847
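Before converting, the data has to be divided into a training part and a dev part, since the CLI has no equivalent of Keras's validation_split. A minimal sketch of a hold-out split, assuming the data is a list of (text, annotations) tuples like train_data_list in the question; the helper name split_train_dev and its parameters are hypothetical, and each resulting list would still need to be converted to the JSON format separately.

```python
import random


def split_train_dev(examples, dev_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and split it into (train, dev).

    Hypothetical helper, not part of spaCy. A fixed seed keeps the
    split reproducible between runs.
    """
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be between 0 and 1")
    if len(examples) < 2:
        raise ValueError("need at least two examples to split")
    shuffled = list(examples)  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


# Toy data for illustration only.
train_data_list = [
    ("Example sentence number %d" % i, {"entities": []}) for i in range(10)
]
train_part, dev_part = split_train_dev(train_data_list, dev_fraction=0.2)
```

With ten examples and dev_fraction=0.2, two examples go to the dev set and eight stay in the training set.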