Fill across a data frame based on a value in a column in R

Question

I have a data frame as below:

    y m0 m1 m2 mn
    1  5 NA NA NA
    2 15 NA NA NA
    3 25 NA NA NA

I would like to fill the value in m0 across the number of columns specified in y. Desired result is:

    y m0 m1 m2 mn
    1  5  5 NA NA
    2 15 15 15 NA
    3 25 25 25 25

Appreciate any guidance!
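
For anyone who wants to run the answers, the sample data can be rebuilt roughly as follows. This is only a sketch: the integer column types are an assumption (one answer's tibble output shows `<int>`), `df` is the name used by the first two answers, and `quux` is the name used by the third.

```r
# Sample data from the question; integer columns are an assumption.
df <- data.frame(
  y  = 1:3,
  m0 = c(5L, 15L, 25L),
  m1 = NA_integer_,
  m2 = NA_integer_,
  mn = NA_integer_
)

# The third answer refers to the same data as `quux`.
quux <- df
```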

Answer 1

Score: 1

You can first transform your data to a "long" format, then replace NA with the first value when the count of NA is within y.

    library(tidyverse)

    df %>%
      pivot_longer(-y) %>%
      # within each y, replace an NA with the first value (m0)
      # while the running count of NAs is still <= y
      mutate(value = ifelse(is.na(value) & cumsum(is.na(value)) <= y, first(value), value),
             .by = y) %>%
      pivot_wider()
    #> # A tibble: 3 × 5
    #>       y    m0    m1    m2    mn
    #>   <int> <int> <int> <int> <int>
    #> 1     1     5     5    NA    NA
    #> 2     2    15    15    15    NA
    #> 3     3    25    25    25    25
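
The condition inside `mutate()` is easier to follow on a single row. Here is a minimal sketch in base R, using hypothetical standalone vectors `value` and `y` for the row where `y` is 2 (`value[1]` stands in for dplyr's `first(value)`):

```r
# One row of the long data: m0, m1, m2, mn for y = 2 (hypothetical standalone vectors)
value <- c(15L, NA, NA, NA)
y <- 2L

# Running count of NAs seen so far: 0 for m0, then 1, 2, 3
na_count <- cumsum(is.na(value))

# Fill an NA only while the running count is still within y
filled <- ifelse(is.na(value) & na_count <= y, value[1], value)
filled
#> [1] 15 15 15 NA
```

The last NA has a running count of 3, which exceeds `y`, so it is left alone.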

Answer 2

Score: 1

A base R approach using `sapply`. `colbeg` defines the first column where NA has to be replaced. It can be hard-coded, but I suspect it should be adjustable for the real data.

    colbeg <- 3  # index of the first column to fill (m1)

    # for each row x, copy m0 into y[x] columns starting at colbeg,
    # then return that (locally modified) row
    data.frame(t(sapply(seq_along(df$y), \(x) {
      df[x, colbeg:(colbeg - 1 + df$y[x])] <- df$m0[x]
      df[x, ]
    })))
    #>   y m0 m1 m2 mn
    #> 1 1  5  5 NA NA
    #> 2 2 15 15 15 NA
    #> 3 3 25 25 25 25
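
The same fill can also be written without a per-row loop, as a logical mask over the target columns. This is an alternative sketch, not the answerer's method. It assumes the sample data from the question (rebuilt here with assumed integer columns) and that the columns to fill are contiguous from `colbeg` to the last column. The names `target`, `mask`, and `vals` are made up for illustration.

```r
df <- data.frame(y = 1:3, m0 = c(5L, 15L, 25L),
                 m1 = NA_integer_, m2 = NA_integer_, mn = NA_integer_)

colbeg <- 3                  # first column to fill
target <- colbeg:ncol(df)    # positions of m1, m2, mn

# mask[i, j] is TRUE when the j-th target column lies within y[i] columns
mask <- outer(df$y, seq_along(target), ">=")

# Repeat m0 down every target column, then blank out cells outside the mask
vals <- matrix(df$m0, nrow = nrow(df), ncol = length(target))
vals[!mask] <- NA

df[target] <- vals
df
```

Because the mask is built from a comparison rather than a `from:to` range, a row with `y` equal to 0 would simply get no fills.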

Answer 3

Score: 0

Base R:

```r
# build one column per offset z: keep m0 as is for z = 0,
# otherwise blank out the first z rows
lapply(setNames(0:2, paste0("m", 0:2 + 1)), function(z) if (z < 1) quux$m0 else c(rep(NA, z), tail(quux$m0, n = -z))) |>
  as.data.frame()
#   m1 m2 m3
# 1  5 NA NA
# 2 15 15 NA
# 3 25 25 25
```

You can either replace the columns or `cbind` them.

```r
# option 1: overwrite columns 3 to 5 in place
quux[,0:2 + 3] <- lapply(setNames(0:2, paste0("m", 0:2 + 1)), function(z) if (z < 1) quux$m0 else c(rep(NA, z), tail(quux$m0, n = -z))) |>
  as.data.frame()

# option 2: keep the first two columns and bind the new ones
quux <- cbind(
  quux[,1:2],
  lapply(setNames(0:2, paste0("m", 0:2 + 1)), function(z) if (z < 1) quux$m0 else c(rep(NA, z), tail(quux$m0, n = -z)))
)
```

(I'm consistently using `0:2` and offsets to show how things would be changed for an arbitrary number of columns.)

Posted by huangapple on June 22, 2023 at 20:41:14
Permalink: https://go.coder-hub.com/76532031.html