F1-Score and Accuracy for Text Similarity

Question

I am trying to understand how to calculate F1-Score and accuracy between texts while fine-tuning a QA model.

Let's assume we have this:

labels = [I am fine, He was born in 1995, The Eiffel tower, dogs]

preds = [I am fine, born in 1995, Eiffel, dog]

In this case the predictions are clearly close to the labels, but how can I compute an F1-score here? "dog" and "dogs" are not an exact match, but they are very similar.

Answer 1

Score: 0

One popular metric for text similarity is the Levenshtein distance or edit distance, which measures the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into another.
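To make the definition concrete, here is a minimal pure-Python sketch of the edit distance, using the usual row-by-row dynamic programme. The names `levenshtein_distance` and `normalized_similarity` are made up for this illustration and are not part of any library; the normalization (one minus distance over the longer length) is the same one the function further down uses.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    # previous[j] holds the distance between a[:i-1] and b[:j] for the current row i.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]: 1.0 means identical strings."""
    longest = max(len(a), len(b))
    # Two empty strings are identical; this also avoids dividing by zero.
    return 1.0 if longest == 0 else 1 - levenshtein_distance(a, b) / longest


print(levenshtein_distance("dogs", "dog"))   # 1
print(normalized_similarity("dogs", "dog"))  # 0.75
```

One deletion turns "dogs" into "dog", so the pair scores 0.75, which is why the choice of threshold matters for short answers.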

Try the code below, and adjust the threshold to suit your needs.

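import Levenshtein  # third-party package: pip install Levenshtein

def text_similarity_evaluation(labels, preds, threshold=0.8):
    """Count a pair as correct when its normalized edit similarity reaches the threshold."""
    tp, fp = 0, 0

    for label, pred in zip(labels, preds):
        longest = max(len(label), len(pred))
        # Two empty strings are identical; this also avoids dividing by zero.
        similarity_score = 1.0 if longest == 0 else 1 - Levenshtein.distance(label, pred) / longest
        if similarity_score >= threshold:
            tp += 1
        else:
            fp += 1

    # Each miss counts as both a false positive and a false negative, so with
    # equal-length lists precision, recall, and F1 all equal the share of matched pairs.
    fn = len(labels) - tp

    # Guard every ratio so empty input or zero matches returns 0.0 instead of raising.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1_score = 2 * precision * recall / total if total else 0.0

    return precision, recall, f1_score

# Example usage
labels = ["I am fine", "He was born in 1995", "The Eiffel tower", "dogs"]
preds = ["I am fine", "born in 1995", "Eiffel", "dog"]

# At threshold=0.8 only the exact match passes ("dogs" vs "dog" scores 0.75),
# so all three values come out to 0.25; lower the threshold to accept near matches.
precision, recall, f1_score = text_similarity_evaluation(labels, preds, threshold=0.8)
print("Precision:", precision)
print("Recall:", recall)
print("F1-Score:", f1_score)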
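Since the question is about a QA model, it may also help to compare against a token-overlap F1 alongside exact match. This is a different technique from the thresholded edit distance above: it is a simplified sketch of the per-answer F1 commonly reported for extractive QA benchmarks such as SQuAD (the official evaluation script additionally strips punctuation and articles, which this sketch does not). The function name `token_f1` is an assumption made for this example.

```python
from collections import Counter


def token_f1(label: str, pred: str) -> float:
    """F1 over the whitespace tokens shared by one reference and one prediction."""
    label_tokens = label.lower().split()
    pred_tokens = pred.lower().split()
    if not label_tokens or not pred_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(label_tokens == pred_tokens)

    # Multiset intersection: a token is matched at most as often as it occurs in both.
    overlap = sum((Counter(label_tokens) & Counter(pred_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)   # share of predicted tokens that are correct
    recall = overlap / len(label_tokens)     # share of reference tokens that were found
    return 2 * precision * recall / (precision + recall)


labels = ["I am fine", "He was born in 1995", "The Eiffel tower", "dogs"]
preds = ["I am fine", "born in 1995", "Eiffel", "dog"]

scores = [token_f1(label, pred) for label, pred in zip(labels, preds)]
exact_match = sum(label == pred for label, pred in zip(labels, preds)) / len(labels)

print("Per-pair F1:", [round(score, 2) for score in scores])
print("Mean F1:", round(sum(scores) / len(scores), 2))
print("Exact match:", exact_match)
```

Here "born in 1995" and "Eiffel" earn partial credit (0.75 and 0.5), while "dog" versus "dogs" scores 0 because the tokens must match exactly. Stemming the tokens first, or falling back to the edit-distance similarity for single-word answers, would be one way to handle that case.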

huangapple
  • Published on July 31, 2023 at 18:20:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76802665.html