July 23, 2023 21:41:59

How to conditionally impute a constant row-wise

Question


I am somewhat of an R newbie, am struggling with writing code for what seems like simple logic, and would appreciate any help! I am trying to impute a constant value of 1 for NA cells in each row of my data set, but only for rows that have 2 or fewer NA cells. Ultimately, I will also be computing a new column with row-wise means after imputation. If one line of code could automagically achieve all of these things, that would be great!

Here is an example data set to work with.

tData <- data.frame(subID = c(1001, 1002, 1003, 1004),
                    b1 = c(1, 1, 2, NA),
                    b2 = c(NA, 1, 1, NA),
                    b3 = c(NA, 2, 2, NA),
                    b4 = c(2, NA, 1, NA))

I have been looking at various base and dplyr code examples but am riding the struggle bus.

Answer 1

Score: 2


You can do this in these two lines.

tData[is.na(tData) & rowSums(is.na(tData)) <= 2] <- 1
tData |> cbind(row_means = rowMeans(tData[-1]))
#   subID b1 b2 b3 b4 row_means
# 1  1001  1  1  1  2      1.25
# 2  1002  1  1  2  1      1.25
# 3  1003  2  1  2  1      1.50
# 4  1004 NA NA NA NA        NA
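
The first line works because R recycles the length-4 vector from rowSums() down the columns of the logical matrix returned by is.na(), so every cell is compared against the NA count of its own row. Here is a minimal sketch of the same logic split into named steps, in case the one-liner is hard to read. The helper name impute_sparse_rows and its arguments (value, max_na, id_cols) are hypothetical and not part of the answer above.

```r
# Hypothetical helper: fill NA with `value` only in rows that have
# at most `max_na` NA cells, then append the row means.
impute_sparse_rows <- function(df, value = 1, max_na = 2, id_cols = "subID") {
  cols <- setdiff(names(df), id_cols)    # columns to impute (everything but the ID)
  na_mat <- is.na(df[cols])              # logical matrix: TRUE where a cell is NA
  eligible <- rowSums(na_mat) <= max_na  # one TRUE/FALSE per row
  # `eligible` has length nrow(df), so it recycles column by column
  # and lines up with the rows of `na_mat`.
  df[cols][na_mat & eligible] <- value
  df$row_means <- rowMeans(df[cols])     # rows left untouched keep NA as their mean
  df
}

tData <- data.frame(subID = c(1001, 1002, 1003, 1004),
                    b1 = c(1, 1, 2, NA),
                    b2 = c(NA, 1, 1, NA),
                    b3 = c(NA, 2, 2, NA),
                    b4 = c(2, NA, 1, NA))

result <- impute_sparse_rows(tData)
print(result)
```

Excluding the ID column by name rather than by position keeps the sketch working if the columns are reordered.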

Data:

tData <- structure(list(subID = c(1001, 1002, 1003, 1004), b1 = c(1, 1, 
2, NA), b2 = c(NA, 1, 1, NA), b3 = c(NA, 2, 2, NA), b4 = c(2, 
NA, 1, NA)), class = "data.frame", row.names = c(NA, -4L))

Answer 2

Score: 0


One way to do this with dplyr:

library(dplyr)
tData %>%
  mutate(across(-subID, ~ifelse(rowSums(is.na(tData[2:5])) <= 2 & is.na(.), 1, .))) %>%
  rowwise() %>%
  mutate(mean_value = mean(c_across(-subID), na.rm = TRUE))

  subID    b1    b2    b3    b4 mean_value
  <dbl> <dbl> <dbl> <dbl> <dbl>      <dbl>
1  1001     1     1     1     2       1.25
2  1002     1     1     2     1       1.25
3  1003     2     1     2     1       1.5 
4  1004    NA    NA    NA    NA     NaN
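
Note that this output shows NaN for subject 1004, whereas a mean computed without na.rm = TRUE would give NA. With na.rm = TRUE, a row that is entirely NA has nothing left to average. The base R sketch below illustrates the difference; the matrix m is made up for illustration.

```r
# Two rows: one complete, one entirely NA (illustrative data only).
m <- rbind(c(1, 1, 1, 2),
           c(NA, NA, NA, NA))

strict  <- rowMeans(m)                # an NA in a row propagates to its mean
lenient <- rowMeans(m, na.rm = TRUE)  # NAs are dropped first; an all-NA row has no values left

print(strict)
print(lenient)

# is.na() is TRUE for both NA and NaN, so it is the safer check downstream.
print(is.na(strict[2]))
print(is.na(lenient[2]))
```

If the distinction matters, choose na.rm deliberately; otherwise test the result with is.na(), which covers both cases.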
