2023-08-04 06:56:09

In R: How to apply a function to a column given the value of another column

Question

I have a dataframe with 1s and NAs. I would like to replace the NAs with zeros but not if the entire row is NA as this would indicate a true NA.

For example, here is a simplified data frame:

A <- c(NA, NA, 1, 1)
B <- c(NA, NA, NA, 1)
C <- c(1, NA, NA, NA)
df <- data.frame(A, B, C)
df$D <- ifelse(!is.na(df$A) | !is.na(df$B) | !is.na(df$C), 1, 0)
A   B   C  
NA  NA  1  
NA  NA  NA  
1   NA  NA  
1   1   NA

Columns A, B, and C contain either 1s or blanks (NA). I would like to replace the blanks with zeros (0), but NOT when A, B, and C are all blank. I have created column D as an indicator of whether there is any data in A, B, or C. Now I need code that replaces NA with zero in the rows where D is 1. I hope this makes sense.

I am hoping the output will look like this:

I used the following code to produce column D:

df$D <- ifelse(!is.na(df$A) | !is.na(df$B) | !is.na(df$C), 1, 0)

Answer 1

Score: 2

An approach based on the result of `rowSums` (with a hint from @Ritchie Sacramento):

replace(df, rowSums(df, na.rm = TRUE) > 0 & is.na(df), 0)
   A  B  C
1  0  0  1
2 NA NA NA
3  1  0  0
4  1  1  0
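
The one-liner above packs two steps into a single expression. Below is a minimal sketch that separates them, assuming `df` holds only columns A, B, and C (no indicator column D). The names `has_data`, `mask`, and `result` are illustrative and not part of the original answer. It also swaps in `rowSums(!is.na(df))`, which counts non-missing cells instead of summing values, so it should still behave if a row's values happen to sum to zero.

```r
# Sample data from the question (columns A, B, C only)
df <- data.frame(A = c(NA, NA, 1, 1),
                 B = c(NA, NA, NA, 1),
                 C = c(1, NA, NA, NA))

# TRUE for rows that contain at least one non-missing value
has_data <- rowSums(!is.na(df)) > 0

# is.na(df) is a 4 x 3 logical matrix; has_data (length 4) is recycled
# down each column, so cell [i, j] is paired with has_data[i]
mask <- is.na(df) & has_data

# Set the masked cells to zero; rows that are entirely NA stay untouched
result <- replace(df, mask, 0)
print(result)
```

Keeping the row test (`has_data`) separate from the cell test (`is.na(df)`) makes it easier to check each piece on its own before combining them.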

Answer 2

Score: 1

<details>
<summary>English:</summary>
A simple solution would be to record which rows are entirely `NA`, replace every `NA` with zero, and then go back and restore `NA` in the recorded rows:

```r
all_na <- apply(is.na(df), 1, all)
df[is.na(df)] <- 0
df[all_na, ] <- NA
```

Otherwise, you can try it like this:

```r
data.frame(t(apply(df, 1, \(x) if (all(is.na(x))) x else replace(x, is.na(x), 0))))
#    A  B  C
# 1  0  0  1
# 2 NA NA NA
# 3  1  0  0
# 4  1  1  0
```

Answer 3

Score: 0

# Sample data frame
df <- data.frame(A = c(NA, NA, 1, 1),
                 B = c(NA, NA, NA, 1),
                 C = c(1, NA, NA, NA))
# Create column D
df$D <- ifelse(!is.na(df$A) | !is.na(df$B) | !is.na(df$C), 1, 0)
# Replace NAs with zeros in columns A, B, and C, based on column D
df[df$D == 1, c("A", "B", "C")] <- lapply(df[df$D == 1, c("A", "B", "C")], function(x) ifelse(is.na(x), 0, x))
# Print the modified data frame
print(df)

Answer 4

Score: 0

This is a good case for a rowwise() operation with dplyr. We can include the complex logical statement inside replace().
This is a nice approach because dplyr allows good flexibility for applying the method to different subsets of data.

Answer if the index column D already exists:

library(dplyr)
df |>
    rowwise() |>
    mutate(across(A:C, \(x) replace(x, is.na(x) & D, 0))) |>
    ungroup()
# A tibble: 4 × 4
      A     B     C     D
  <dbl> <dbl> <dbl> <dbl>
1     0     0     1     1
2    NA    NA    NA     0
3     1     0     0     1
4     1     1     0     1

We can also create the index on-the-fly, with `dplyr::if_all`:

df |>
    rowwise() |>
    mutate(across(A:C, \(x) replace(x, is.na(x) & !if_all(A:C, is.na), 0))) |>
    ungroup()
</details>
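
Since `replace()` and `is.na()` are already vectorized, the dplyr approach can probably be written without `rowwise()`. The following is only a sketch under that assumption, not the answerer's method. It rebuilds the sample data so it runs on its own, and the helper column `has_data` is a hypothetical name that plays the role of D.

```r
library(dplyr)

# Sample data from the question (columns A, B, C only)
df <- data.frame(A = c(NA, NA, 1, 1),
                 B = c(NA, NA, NA, 1),
                 C = c(1, NA, NA, NA))

result <- df |>
  # Flag rows where at least one of A, B, C is not missing
  mutate(has_data = if_any(A:C, \(x) !is.na(x))) |>
  # Zero out NAs only in the flagged rows
  mutate(across(A:C, \(x) replace(x, is.na(x) & has_data, 0))) |>
  # Drop the helper column again
  select(-has_data)

print(result)
```

Computing the flag in its own `mutate()` step, before any column is modified, keeps the row test independent of the replacements that follow.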
