Replace NA values in a data frame with the values from the next column (R)

Question

I am new to R and still learning, and I could not find the answer I am looking for in any other thread.

I have a dataset with (for simplicity) 5 columns. Columns 1, 2, and 4 always have values, but in some rows column 3 doesn't. Below is an example:

Current dataset:

A  B  C  D  E
1  1  2  3 
1  2  NA 4  5
1  2  3  4 
1  3  NA 9  7
1  2  NA 5  6

I want the NAs in column C to be replaced by the value in column D, with the value in column E then shifted into D, and so on.

Desired output:

A  B  C  D  E
1  1  2  3  NA
1  2  4  5  NA
1  2  3  4  NA
1  3  9  7  NA
1  2  5  6  NA

I tried code from several Stack Overflow threads, but none of it achieved what I wanted.

na.omit removes the entire row. Any help is greatly appreciated.
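
As a minimal illustration of why na.omit does not help here: it drops every row that contains at least one NA. The data frame below is typed out from the example table, on the assumption that the blank cells in column E are read as NA.

```r
# Example data from the question; blank E cells are assumed to be NA
dat <- data.frame(
  A = c(1L, 1L, 1L, 1L, 1L),
  B = c(1L, 2L, 2L, 3L, 2L),
  C = c(2L, NA, 3L, NA, NA),
  D = c(3L, 4L, 4L, 9L, 5L),
  E = c(NA, 5L, NA, 7L, 6L)
)

# na.omit() keeps only complete rows; here every row has an NA in C or E
complete_rows <- na.omit(dat)
nrow(complete_rows)  # 0
```

So the values have to be shifted within each row rather than filtered out.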

Answer 1

Score: 1

Data

data <- structure(list(A = c(1L, 1L, 1L, 1L, 1L), B = c(1L, 2L, 2L, 3L, 
2L), C = c(2L, NA, 3L, NA, NA), D = c(3L, 4L, 4L, 9L, 5L), E = c(NA, 
5L, NA, 7L, 6L)), class = "data.frame", row.names = c(NA, -5L
))

Code

library(dplyr)

data %>% 
  mutate(
    aux = C,                        # keep the original C to test against
    C = if_else(is.na(aux), D, C),  # fill missing C from D
    D = if_else(is.na(aux), E, D),  # then move E into D on those rows
    E = NA                          # E is empty in every row of this data after the shift
  ) %>% 
  select(-aux)

Output

  A B C D  E
1 1 1 2 3 NA
2 1 2 4 5 NA
3 1 2 3 4 NA
4 1 3 9 7 NA
5 1 2 5 6 NA
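
One caveat: E = NA blanks column E in every row, which matches the desired output only because E is already empty wherever C is present. A hedged variant that clears E only on the shifted rows might look like the sketch below; data2 and needs_shift are illustrative names, and the data is made up so that row 1 has a real value in E.

```r
library(dplyr)

# Hypothetical data: row 1 has no NA in C and a real value in E
data2 <- data.frame(
  A = c(1L, 1L, 1L),
  B = c(1L, 2L, 3L),
  C = c(2L, NA, NA),
  D = c(3L, 4L, 9L),
  E = c(8L, 5L, 7L)
)

shifted <- data2 %>%
  mutate(
    needs_shift = is.na(C),                    # rows where C is missing
    C = if_else(needs_shift, D, C),            # pull D into C
    D = if_else(needs_shift, E, D),            # pull E into D
    E = if_else(needs_shift, NA_integer_, E)   # clear E only on shifted rows
  ) %>%
  select(-needs_shift)

shifted
```

Row 1 keeps its E value of 8, while rows 2 and 3 are shifted left and end with NA in E.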

Answer 2

Score: 1

Replacement operation all in one go:

dat[is.na(dat$C), c("C","D","E")] <- c(dat[is.na(dat$C), c("D","E")], NA)
dat
#  A B C D  E
#1 1 1 2 3 NA
#2 1 2 4 5 NA
#3 1 2 3 4 NA
#4 1 3 9 7 NA
#5 1 2 5 6 NA

Where dat was:

dat <- read.table(text="A  B  C  D  E
1  1  2  3 
1  2  NA 4  5
1  2  3  4 
1  3  NA 9  7
1  2  NA 5  6", fill=TRUE, header=TRUE)
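
To see why the one-liner works, it may help to split it into steps. This is only a sketch of the same assignment; rows and value are names introduced here for illustration, and dat is rebuilt with data.frame() so the snippet runs on its own.

```r
dat <- data.frame(
  A = c(1L, 1L, 1L, 1L, 1L),
  B = c(1L, 2L, 2L, 3L, 2L),
  C = c(2L, NA, 3L, NA, NA),
  D = c(3L, 4L, 4L, 9L, 5L),
  E = c(NA, 5L, NA, 7L, 6L)
)

rows <- is.na(dat$C)  # TRUE for rows 2, 4 and 5

# c() on a data frame plus a scalar gives a plain list of three elements:
# the D values, the E values, and a single NA
value <- c(dat[rows, c("D", "E")], NA)
str(value)

# Each list element fills one target column; the lone NA is recycled down E
dat[rows, c("C", "D", "E")] <- value
dat
```

Only the rows with a missing C are touched, so any row that already has a value in C keeps its D and E as they were.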

Answer 3

Score: 1

Using shift_row_values

library(hacksaw)
shift_row_values(df1)
  A B C D  E
1 1 1 2 3 NA
2 1 2 4 5 NA
3 1 2 3 4 NA
4 1 3 9 7 NA
5 1 2 5 6 NA

Data

df1 <- structure(list(A = c(1L, 1L, 1L, 1L, 1L), B = c(1L, 2L, 2L, 3L, 
2L), C = c(2L, NA, 3L, NA, NA), D = c(3L, 4L, 4L, 9L, 5L), E = c(NA, 
5L, NA, 7L, 6L)), class = "data.frame", row.names = c(NA, -5L
))

Answer 4

Score: 0

A general base R approach using order, which needs no prior knowledge of where the NAs are.

setNames(data.frame(t(apply(data, 1, function(x) 
  x[order(is.na(x))]))), colnames(data))
  A B C D  E
1 1 1 2 3 NA
2 1 2 4 5 NA
3 1 2 3 4 NA
4 1 3 9 7 NA
5 1 2 5 6 NA

Using dplyr

library(dplyr)

t(data) %>% 
  data.frame() %>% 
  mutate(across(everything(), ~ .x[order(is.na(.x))])) %>% 
  t() %>% 
  as_tibble()
# A tibble: 5 × 5
      A     B     C     D     E
  <int> <int> <int> <int> <int>
1     1     1     2     3    NA
2     1     2     4     5    NA
3     1     2     3     4    NA
4     1     3     9     7    NA
5     1     2     5     6    NA

Data

data <- structure(list(A = c(1L, 1L, 1L, 1L, 1L), B = c(1L, 2L, 2L, 3L, 
2L), C = c(2L, NA, 3L, NA, NA), D = c(3L, 4L, 4L, 9L, 5L), E = c(NA, 
5L, NA, 7L, 6L)), class = "data.frame", row.names = c(NA, -5L
))
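
The order-based idea can be wrapped in a small function to show that it handles NAs in any column. This is a sketch only: shift_na_right is a hypothetical helper name and df2 is made-up data. Because apply() converts the data frame to a matrix, the approach is best suited to columns that all share one type.

```r
# Hypothetical helper wrapping the apply/order approach
shift_na_right <- function(df) {
  # order(is.na(x)) puts non-NA values first and keeps their original order
  m <- t(apply(df, 1, function(x) x[order(is.na(x))]))
  # apply() carries over reordered names, so restore the original ones
  setNames(data.frame(m), colnames(df))
}

# Illustrative data with NAs in different columns per row
df2 <- data.frame(
  A = c(1L, NA, 1L),
  B = c(NA, 2L, 2L),
  C = c(3L, NA, 3L),
  D = c(4L, 4L, NA)
)

res <- shift_na_right(df2)
res
```

Each row is compacted to the left independently, so row 2 (NA, 2, NA, 4) becomes 2, 4, NA, NA.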

huangapple
  • Posted on January 9, 2023 at 07:39:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75052066.html