2023-02-16 19:36:33

Finding the precision, recall and the f1 in R

Question

I want to train several models in a loop and store their performance metrics in a table. I do not want to use the confusionMatrix function in caret; instead I want to compute the precision, recall and F1 myself and store those in the table. Please assist; edits to the code are welcome.
My attempt is below.

library(MASS) #will load our biopsy data
library(caret)
data("biopsy")
biopsy$ID <- NULL
names(biopsy) <- c('clump thickness','uniformity cell size','uniformity cell shape',
                 'marginal adhesion','single epithelial cell size','bare nuclei',
                 'bland chromatin','normal nuclei','mitosis','class')
sum(is.na(biopsy))
biopsy <- na.omit(biopsy)
sum(is.na(biopsy))
head(biopsy, 5)
set.seed(123)
inTraining <- createDataPartition(biopsy$class, p = .75, list = FALSE)
training <- biopsy[inTraining,]
testing  <- biopsy[-inTraining,]
# Run algorithms using 10-fold cross validation
control <- trainControl(method="repeatedcv", number=10, repeats = 5, verboseIter = FALSE, classProbs = TRUE)
# CHANGING THE CHARACTERS INTO FACTORS VARIABLES
training <- as.data.frame(unclass(training),                     
                         stringsAsFactors = TRUE)
# CHANGING THE CHARACTERS INTO FACTORS VARIABLES
testing <- as.data.frame(unclass(testing),                     
                         stringsAsFactors = TRUE)
models <- c("svmRadial", "rf")
results_table <- data.frame(models = models, stringsAsFactors = FALSE)
for (i in models){
  model_train <- train(class ~ ., data = training, method = i,
                     trControl = control, metric = "Accuracy")
  predictions <- predict(model_train, newdata = testing)
  precision_ <- posPredValue(predictions, testing)
  recall_ <- sensitivity(predictions, testing)
  f1 <- (2 * precision_ * recall_) / (precision_ + recall_)
  # put that in the results table
  results_table[i, "Precision"] <- precision_
  results_table[i, "Recall"] <- recall_
  results_table[i, "F1score"] <- f1
}

However, I get an error which says Error in posPredValue.default(predictions, testing) : inputs must be factors. I do not know where I went wrong, and any edits to my code are welcome.

I know that I could get precision, recall, f1 by just using the code below (B), however, this is a tutorial question where I am required not to use the code example below (B):

(B)
for (i in models){
  model_train <- train(class ~ ., data = training, method = i,
                     trControl = control, metric = "Accuracy")
  predictions <- predict(model_train, newdata = testing)
  print(confusionMatrix(predictions, testing$class, mode = "prec_recall"))
}

Answer 1

Score: 1

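A few things need to happen.

1. You have to change the function calls for posPredValue and sensitivity. For both, change testing to testing$class. testing is the whole data frame, while both functions expect a factor of reference labels, which is why you get "inputs must be factors".
2. For the results_table, i is a word (the model name), not a row index, so you are assigning results_table["rf", "Precision"] <- precision_. This makes a new row whose row name is "rf" instead of filling in the existing row.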
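
To see the second point in isolation, here is a minimal sketch in base R. The table mirrors the one in the question, but the value 0.9 is made up purely for illustration.

```r
# Toy results table; model names mirror the question, the value 0.9 is made up.
results_table <- data.frame(models = c("svmRadial", "rf"), stringsAsFactors = FALSE)

# Indexing with a character string refers to a row *name*. The row names here
# are "1" and "2", so "rf" matches nothing and R appends a new row instead.
by_name <- results_table
by_name["rf", "Precision"] <- 0.9
nrow(by_name)    # 3: the two original rows plus a new row named "rf"

# Matching on the `models` column updates the existing row in place.
by_match <- results_table
by_match[by_match$models %in% "rf", "Precision"] <- 0.9
nrow(by_match)   # 2: no extra row was created
```

Printing by_name shows the extra row with NA in the models column, which is the symptom to look for in the original loop.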

Here is your for statement, with the changes to the function calls mentioned in 1) and a modification to address the issue in 2).

for (i in models){
  model_train <- train(class~., data = training, method = i,
                     trControl= control, metric = "Accuracy")
  assign("fit", model_train)
  predictions <- predict(model_train, newdata = testing)
  precision_ <- posPredValue(predictions, testing$class)
  recall_ <- sensitivity(predictions, testing$class)
  f1 <- (2*precision_ * recall_) / (precision_ + recall_)
  
  # put that in the results table
  results_table[results_table$models %in% i, "Precision"] <- precision_
  results_table[results_table$models %in% i, "Recall"] <- recall_
  results_table[results_table$models %in% i, "F1score"] <- f1
}

This is what it looks like for me.

results_table
#      models Precision    Recall   F1score
# 1 svmRadial 0.9722222 0.9459459 0.9589041
# 2        rf 0.9732143 0.9819820 0.9775785
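
If you want to check these numbers by hand, or avoid the caret helpers altogether, the three metrics can be computed from counts of true positives, false positives and false negatives. The sketch below uses made-up toy labels rather than the biopsy predictions, and it assumes the first factor level ("benign") is the positive class, which as far as I know is also what posPredValue and sensitivity use by default.

```r
# Hypothetical toy labels; in the question these would be `predictions`
# and `testing$class`, both factors with the same levels.
lvls      <- c("benign", "malignant")
actual    <- factor(c("benign", "benign", "benign", "malignant", "malignant", "benign"),
                    levels = lvls)
predicted <- factor(c("benign", "benign", "malignant", "malignant", "benign", "benign"),
                    levels = lvls)

positive <- lvls[1]  # assumption: the first level is the positive class

tp <- sum(predicted == positive & actual == positive)  # true positives
fp <- sum(predicted == positive & actual != positive)  # false positives
fn <- sum(predicted != positive & actual == positive)  # false negatives

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

print(c(Precision = precision, Recall = recall, F1score = f1))
```

Swapping positive to "malignant" gives the metrics for the other class, which is worth doing if malignant cases are the ones you care about detecting.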
