Fill in text in adjacent blank variables till another text appears in R
Question
I'm trying to fill empty variables with the text to their left, until another text value appears. I only want to do this for a specific row.
Current table:
var1 | var2 | var3 | var3 | var4 |
---|---|---|---|---|
A | textA | | textB | |
B | 1 | 2 | 3 | 4 |
c | 3 | 4 | 5 | 6 |
Desired output:
var1 | var2 | var3 | var3 | var4 |
---|---|---|---|---|
A | textA | textA | textB | textB |
B | 1 | 2 | 3 | 4 |
c | 3 | 4 | 5 | 6 |
What's an elegant way to do this? My current solution looks something like the code below, but I'd like to use general logic instead of specifying each variable name:
mutate(var3=case_when(var1=="A" & is.na(var3) ~ var2))
Answer 1
Score: 4
We may extract the row where 'var1' is "A", unlist it, and apply na.locf0 from zoo to replace the NA values with the previous non-NA value:
library(zoo)
i1 <- df1$var1 == "A"
df1[i1,-1] <- na.locf0(unlist(df1[i1,-1]))
Output:
df1
var1 var2 var3 var3.1 var4
1 A textA textA textB textB
2 B 1 2 3 4
3 c 3 4 5 6
Or with base R, create a numeric index based on the non-NA elements (cumsum) and use the index to replicate the non-NA values from the extracted row:
v1 <- unlist(df1[i1, -1])
df1[i1, -1] <- na.omit(v1)[cumsum(!is.na(v1))]
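To make the cumsum index concrete, here is a minimal sketch on a plain vector. The helper name `fill_forward` is hypothetical (it is not part of the answer above), and it adds one assumption-driven tweak: an NA placeholder is prepended so that leading NAs, where the index would be 0, stay NA instead of being dropped.

```r
# Hypothetical helper (not from the answer): forward-fill NAs in a vector
# with the cumsum index trick, leaving any leading NAs as NA.
fill_forward <- function(x) {
  keep <- !is.na(x)    # TRUE where a value is present
  idx <- cumsum(keep)  # 0 before the first value, then 1, 2, ...
  # Prepend NA so that index 0 (shifted to position 1) maps to NA
  c(NA, x[keep])[idx + 1]
}

v1 <- c("textA", NA, "textB", NA)
cumsum(!is.na(v1))            # 1 1 2 2
fill_forward(v1)              # "textA" "textA" "textB" "textB"
fill_forward(c(NA, "x", NA))  # NA "x" "x"
```

Each NA gets the same index as the last non-NA value before it, so indexing the non-NA values by that index repeats them over the gaps.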
Or use tidyverse to reshape to 'long' format (pivot_longer), apply fill to replace NA with the previous non-NA value, and reshape back to wide with pivot_wider:
library(dplyr)
library(tidyr)
df1 %>%
  pivot_longer(cols = -var1, values_transform = as.character) %>%
  fill(value) %>%
  pivot_wider(names_from = name, values_from = value)
# A tibble: 3 × 5
var1 var2 var3 var3.1 var4
<chr> <chr> <chr> <chr> <chr>
1 A textA textA textB textB
2 B 1 2 3 4
3 c 3 4 5 6
If the NAs occur only in alternating columns, another option is:
library(dplyover)
df1 %>%
  mutate(across2(c(3, 5), c(2, 4),
                 ~ case_match(.x, NA ~ .y, .default = as.character(.x)),
                 .names = "{xcol}"))
Output:
var1 var2 var3 var3.1 var4
1 A textA textA textB textB
2 B 1 2 3 4
3 c 3 4 5 6
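For readers without dplyover, the same pairwise idea can be sketched in base R. This is an illustrative alternative rather than the answer's method; the names `targets` and `sources` are hypothetical, and it assumes each NA column should be filled from the column immediately to its left, as in the example data.

```r
# Example data, rebuilt from the answer's data section
df1 <- data.frame(
  var1 = c("A", "B", "c"),
  var2 = c("textA", "1", "3"),
  var3 = c(NA, 2L, 4L),
  var3.1 = c("textB", "3", "5"),
  var4 = c(NA, 4L, 6L),
  stringsAsFactors = FALSE
)

targets <- c(3, 5)  # columns that may contain NA (hypothetical names)
sources <- c(2, 4)  # columns to copy from, paired by position

for (k in seq_along(targets)) {
  x <- df1[[targets[k]]]
  y <- df1[[sources[k]]]
  # Take the source value where the target is NA, else keep the target as text
  df1[[targets[k]]] <- ifelse(is.na(x), y, as.character(x))
}

df1
```

Like the dplyover version, this fills every row where the target is NA, not only the row where var1 is "A".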
Data:
df1 <- structure(list(var1 = c("A", "B", "c"), var2 = c("textA", "1",
"3"), var3 = c(NA, 2L, 4L), var3.1 = c("textB", "3", "5"), var4 = c(NA,
4L, 6L)), class = "data.frame", row.names = c(NA, -3L))
Answer 2
Score: 1
Here is an option, but only for a few columns:
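library(dplyr)
# Assumes a data frame named df in which blanks are stored as empty strings ("");
# if they are NA instead, test with is.na(var3) and is.na(var4)
df %>%
  mutate(var3 = ifelse(var3 == "", var2, var3),
         var4 = ifelse(var4 == "", var3.1, var4))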
var1 var2 var3 var3.1 var4
<chr> <chr> <chr> <chr> <chr>
1 A textA textA textB textB
2 B 1 2 3 4
3 c 3 4 5 6