May 29, 2023 21:53:23

English:

How to make a learning curve

Question

The error message "object 'ID' not found" means that ggplot cannot find a column named 'ID' in the data frame supplied to the failing layer. In the code below, that layer is geom_point(), which draws from 'prediction_data', a data frame that contains only 'Measurement' and 'fit'.

Make sure that the data frame passed to each layer actually contains every column referenced in aes(). The 'ID' column exists in 'individual_df', but it is never copied into 'prediction_data'.

To diagnose the issue, you can try the following:

Check the structure of both data frames with str(individual_df) and str(prediction_data) to see which of them contains the 'ID' column.
Verify that the 'ID' column in your data frame is named exactly 'ID' (case-sensitive).
Ensure that there are no typos or extra spaces in the column names or variable names.
Double-check your data file to make sure it's correctly formatted and that the 'ID' column is present.
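
A quick way to run these checks is to compare the columns each data frame actually has with the columns referenced in aes(). The sketch below uses base R only; toy_df and toy_pred are hypothetical stand-ins for individual_df and prediction_data, not the asker's data.

```r
# Hypothetical stand-ins for individual_df and prediction_data
toy_df <- data.frame(
  ID          = rep(c(32L, 33L), each = 4),
  Measurement = rep(1:4, times = 2),
  Value       = c(4L, 2L, 0L, 0L, 2L, 1L, 2L, 0L)
)
toy_pred <- data.frame(Measurement = seq(1, 4, length.out = 10))

# Columns that the plotting layer refers to inside aes()
needed <- c("Measurement", "ID")

# setdiff() returns the referenced columns that are missing from each data frame
missing_in_df   <- setdiff(needed, names(toy_df))
missing_in_pred <- setdiff(needed, names(toy_pred))

print(missing_in_df)    # nothing is missing from the original data
print(missing_in_pred)  # ID is missing here, which is what triggers the error
```

If a column turns up as missing for the data frame a layer uses, either add it to that data frame or remove it from that layer's aes().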


English:

Getting an error while making a learning curve on binary data

library(dplyr)
library(tidyr)
library(tidyverse)
library(tidytext)
library(ggplot2)

#data frame

structure(list(ID = c(32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L,
33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 34L, 34L, 34L, 34L, 34L,
34L, 34L, 34L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 43L, 43L,
43L, 43L, 43L, 43L, 43L, 43L, 47L, 47L, 47L, 47L, 47L, 47L, 47L,
47L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 56L, 56L, 56L, 56L,
56L, 56L, 56L, 56L, 57L, 57L, 57L, 57L, 57L, 57L, 57L, 57L, 59L,
59L, 59L, 59L, 59L, 59L, 59L, 59L, 69L, 69L, 69L, 69L, 69L, 69L,
69L, 69L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 72L, 72L, 72L,
72L, 72L, 72L, 72L, 72L, 79L, 79L, 79L, 79L, 79L, 79L, 79L, 79L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 81L, 81L, 81L, 81L, 81L,
81L, 81L, 81L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 83L, 83L,
83L, 83L, 83L, 83L, 83L, 83L, 84L, 84L, 84L, 84L, 84L, 84L, 84L,
84L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 92L, 92L, 92L, 92L,
92L, 92L, 92L, 92L, 123L, 123L, 123L, 123L, 123L, 123L, 123L,
123L, 124L, 124L, 124L, 124L, 124L, 124L, 124L, 124L, 125L, 125L,
125L, 125L, 125L, 125L, 125L, 125L, 126L, 126L, 126L, 126L, 126L,
126L, 126L, 126L, 127L, 127L, 127L, 127L, 127L, 127L, 127L, 127L,
128L, 128L, 128L, 128L, 128L, 128L, 128L, 128L, 137L, 137L, 137L,
137L, 137L, 137L, 137L, 137L, 138L, 138L, 138L, 138L, 138L, 138L,
138L, 138L, 139L, 139L, 139L, 139L, 139L, 139L, 139L, 139L, 140L,
140L, 140L, 140L, 140L, 140L, 140L, 140L, 147L, 147L, 147L, 147L,
147L, 147L, 147L, 147L, 148L, 148L, 148L, 148L, 148L, 148L, 148L,
148L, 149L, 149L, 149L, 149L, 149L, 149L, 149L, 149L, 150L, 150L,
150L, 150L, 150L, 150L, 150L, 150L, 151L, 151L, 151L, 151L, 151L,
151L, 151L, 151L, 152L, 152L, 152L, 152L, 152L, 152L, 152L, 152L,
159L, 159L, 159L, 159L, 159L, 159L, 159L, 159L, 160L, 160L, 160L,
160L, 160L, 160L, 160L, 160L), Measurement = c(1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L), Value = c(4L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 2L,
1L, 2L, 0L, 0L, 0L, 0L, 0L, 3L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 3L,
1L, 1L, 0L, 0L, 0L, 0L, 0L, 3L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L,
1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 9L, 14L, 14L, 0L, 0L, 0L, 0L,
1L, 5L, 5L, 7L, 5L, 8L, 0L, 0L, 1L, 1L, 1L, 10L, 4L, 4L, 6L,
0L, 1L, 1L, 5L, 3L, 5L, 0L, 0L, 0L, 1L, 3L, 2L, 3L, 1L, 2L, 0L,
0L, 2L, 1L, 1L, 5L, 2L, 9L, 8L, 8L, 4L, 3L, 2L, 5L, 3L, 2L, 4L,
0L, 4L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 13L, 2L, 4L, 1L, 9L, 3L,
5L, 9L, 5L, 1L, 6L, 1L, 7L, 18L, 14L, 15L, 9L, 3L, 3L, 9L, 2L,
11L, 9L, 13L, 1L, 4L, 1L, 1L, 6L, 2L, 6L, 8L, 1L, 1L, 6L, 1L,
3L, 4L, 3L, 10L, 5L, 2L, 3L, 5L, 6L, 3L, 3L, 0L, 7L, 1L, 5L,
2L, 7L, 9L, 13L, 14L, 4L, 3L, 4L, 2L, 0L, 0L, 0L, 0L, 7L, 1L,
5L, 0L, 0L, 0L, 0L, 0L, 3L, 3L, 6L, 7L, 7L, 4L, 6L, 4L, 2L, 1L,
5L, 4L, 0L, 0L, 0L, 0L, 3L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L,
0L, 0L, 0L, 0L, 0L, 0L, 6L, 6L, 5L, 3L, 9L, 20L, 8L, 10L, 4L,
3L, 2L, 2L, 4L, 5L, 0L, 0L, 11L, 5L, 3L, 4L, 7L, 1L, 0L, 0L,
10L, 1L, 2L, 5L, 0L, 0L, 0L, 0L, 1L, 5L, 4L, 2L, 8L, 8L, 6L,
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L,
0L, 3L, 1L, 2L, 2L, 0L, 0L, 0L, 0L, 7L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 3L, 1L, 1L, 4L, 2L, 3L, 0L,
0L, 1L, 6L, 2L, 0L, 0L, 0L, 0L, 0L), Success = c(1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0,
0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1,
1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1,
1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0,
0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1,
1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-312L), class = "data.frame")
Then

# Read the file
individual_df <- read.csv("C:/Users/ASUS/OneDrive/Desktop/Working_R_Sheets/output_list - Second_copy2.csv")

Then I plotted the data:

ggplot(individual_df, aes(x = as.factor(Measurement), y = Value, group = ID, color = factor(ID))) +
  geom_point(shape = 21, size = 5, fill = "white") +
  geom_line(size = 1) +
  labs(x = "Measurement", y = "Value", color = "ID") +
  scale_color_discrete(name = "ID")

Then I got this nice plot.

Then

# Add a new column "Success": a Value of 3 or more counts as a success
individual_df$Success <- ifelse(individual_df$Value >= 3, 1, 0)
# Fit logistic regression model
model <- glm(Success ~ Measurement, data = individual_df, family = binomial)
summary(model)
# Create a new data frame for prediction
prediction_data <- data.frame(
  Measurement = seq(min(individual_df$Measurement), max(individual_df$Measurement), length.out = 100)
)
# Predict probabilities
prediction_data$fit <- predict(model, newdata = prediction_data, type = "response")
# Create the plot
ggplot() +
  geom_point(data = prediction_data, aes(x = Measurement, y = fit, fill = as.factor(ID)), shape = 21, size = 4) +
  geom_line(data = prediction_data, aes(x = Measurement, y = fit), color = "blue", size = 1) +
  labs(x = "Measurement", y = "Success", title = "Logistic Regression") +
  theme_minimal()

I am getting an error here:

Error in `geom_point()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 1st layer.
Caused by error:
! object 'ID' not found
Run `rlang::last_trace()` to see where the error occurred

I want a plot like this:

Here is the data frame that I imported into R:

text

Thank you.

Answer 1

Score: 0

English:

The model you are using does not take ID into account, but instead assumes that the log odds of success varies linearly according to Measurement. If you want predictions for each ID, you will need to include ID in your model, presumably as a random effect.
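
As a minimal sketch of the first point: because the pooled model has no ID term, its prediction grid has no ID column either, so the fitted curve can only be drawn without an ID aesthetic. The data here are simulated (toy_df, the seed, and the coefficients are assumptions for illustration, not the asker's data), and linewidth assumes ggplot2 3.4.0 or later. The raw observations can still be filled by ID because that layer draws from the original data frame.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only to make the toy data reproducible

# Simulated stand-in for individual_df: 6 IDs measured 8 times each
toy_df <- data.frame(
  ID          = factor(rep(1:6, each = 8)),
  Measurement = rep(1:8, times = 6)
)
# Success probability rises with Measurement (made-up coefficients)
toy_df$Success <- rbinom(nrow(toy_df), size = 1,
                         prob = plogis(-2 + 0.4 * toy_df$Measurement))

# Pooled logistic regression: no ID term in the model
model <- glm(Success ~ Measurement, data = toy_df, family = binomial)

# Prediction grid: contains Measurement and fit only, no ID
prediction_data <- data.frame(Measurement = seq(1, 8, length.out = 100))
prediction_data$fit <- predict(model, newdata = prediction_data,
                               type = "response")

p <- ggplot() +
  # Raw 0/1 outcomes come from toy_df, so ID is available for this layer
  geom_point(data = toy_df,
             aes(x = Measurement, y = Success, fill = ID),
             shape = 21, size = 3,
             position = position_jitter(width = 0.1, height = 0.03)) +
  # The fitted curve comes from prediction_data, so no ID aesthetic is used
  geom_line(data = prediction_data,
            aes(x = Measurement, y = fit),
            color = "blue", linewidth = 1) +
  labs(x = "Measurement", y = "Probability of success",
       title = "Pooled logistic regression") +
  theme_minimal()

print(p)
```

This gives one curve for the whole sample. Getting a separate curve per ID requires a model that includes ID.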

The following code downloads your data directly from source, carries out a mixed-effects logistic regression, then plots predictions across the range for each ID:

library(tidyverse)
library(lme4)
df <- 'https://raw.githubusercontent.com/conda-suman07/' %>%
  paste0('Learn_Python/7aae4cbb522dacb854bb0b0adc2779eb62c3d1e9') %>%
  paste0('/output_list%20-%20Second_copy2.csv') %>%
  read.csv() %>%
  mutate(ID = factor(ID)) %>%
  # Use the Value column of the piped data, not the external individual_df
  mutate(Success = ifelse(Value >= 3, 1, 0))
# Random intercept and random Measurement slope for each ID
mod <- glmer(Success ~ (Measurement | ID), family = binomial, data = df)
# Predict for every ID at every Measurement, then plot one line per ID
expand.grid(ID = unique(df$ID), Measurement = 0:7) %>%
  mutate(Probability = predict(mod, ., type = 'response')) %>%
  ggplot(aes(x = Measurement, y = Probability, color = ID, group = ID)) +
  geom_line() +
  geom_point(shape = 21, fill = 'white', size = 2.5)

It is important that you check the assumptions of your model here, and what it actually means. This model states that the log odds of success is linearly related to Measurement, but that the effect can be positive or negative, depending on the individual. I'm not sure your raw data supports the use of such a model. To me, it looks as though there is a large drop in values between Measurement 0 and 1, then a gradual increase as Measurement increases above 1. Getting a model to represent your data is very context specific, and might need the input of a statistician if you are unsure how to proceed.
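
One rough way to check the linearity assumption is to compare the observed proportion of successes at each Measurement with the probabilities fitted by a model that is linear on the log-odds scale. The sketch below does this with a pooled glm() on simulated data; toy_df, the seed, and the deliberately non-monotonic success probabilities are assumptions for illustration. It is a simplified stand-in for the mixed model, not a replacement for it.

```r
set.seed(42)  # arbitrary seed for reproducibility of the toy data

# Simulated stand-in: 40 IDs, Measurement 0 to 7, with a drop after 0
# followed by a gradual rise (made-up probabilities)
true_prob <- c(0.8, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7)
toy_df <- data.frame(
  ID          = factor(rep(1:40, each = 8)),
  Measurement = rep(0:7, times = 40)
)
toy_df$Success <- rbinom(nrow(toy_df), size = 1,
                         prob = true_prob[toy_df$Measurement + 1])

# Observed proportion of successes at each Measurement
observed <- aggregate(Success ~ Measurement, data = toy_df, FUN = mean)
names(observed)[2] <- "Observed"

# Fitted probabilities from a model that is linear in the log odds
model <- glm(Success ~ Measurement, data = toy_df, family = binomial)
observed$Fitted <- predict(model, newdata = observed, type = "response")

# Large gaps between Observed and Fitted suggest that a straight line
# on the log-odds scale does not describe the data well
observed$Gap <- observed$Observed - observed$Fitted
print(round(observed, 2))
```

If the gaps are large and systematic, treating Measurement as a factor or adding a non-linear term may be worth considering, although the right choice depends on the context of the experiment.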
