Split a dataframe and paste it back together

Question

I have a dataframe (snippet shown below) whose cells I want to split so that I can assign other information to them and then paste them back together. The issue I'm having is splitting the cells while keeping the values from each row together, if that makes sense. Here is an example of what I have and what I'm trying to do:

Current df:

    newcolumn  code     name         place
    NA/NA      121/102  John/James   GBR/GBR
    NA/NA      100/103  Harry/Peter  GBR/GBR
    NA/NA      113/111  Will/Jamie   GBR/GBR
    NA/NA      109/112  Brian/Steve  GBR/GBR

I now want to separate this df into something like this:

    newcolumn  code  name   place
    NA         121   John   GBR
    NA         102   James  GBR
    NA         100   Harry  GBR
    NA         103   Peter  GBR
    NA         113   Will   GBR
    NA         111   Jamie  GBR
    NA         109   Brian  GBR
    NA         112   Steve  GBR

Then, after I have filled in my newcolumn, I want to be able to paste the rows back together (maybe using a loop?), combining rows 1 and 2, rows 3 and 4, and so on.

Answer 1

Score: 1

A combination of `strsplit()` and `lapply()` gives the desired result for the first part:

    # Example data: every cell holds two values separated by "/"
    df <- data.frame(
      newcolumn = rep("NA/NA", 4),
      code = c("121/102", "100/103", "113/111", "109/112"),
      name = c("John/James", "Harry/Peter", "Will/Jamie",
               "Brian/Steve"),
      place = c("GBR/GBR", "GBR/GBR", "GBR/GBR", "GBR/GBR")
    )
    df
    #>   newcolumn    code        name   place
    #> 1     NA/NA 121/102  John/James GBR/GBR
    #> 2     NA/NA 100/103 Harry/Peter GBR/GBR
    #> 3     NA/NA 113/111  Will/Jamie GBR/GBR
    #> 4     NA/NA 109/112 Brian/Steve GBR/GBR

    # Split each column on "/" and flatten, so the two halves of a cell
    # land on consecutive rows
    df_split <-
      as.data.frame(lapply(df, function(x) {
        unlist(strsplit(x, split = "/", fixed = TRUE))
      }))
    df_split
    #>   newcolumn code  name place
    #> 1        NA  121  John   GBR
    #> 2        NA  102 James   GBR
    #> 3        NA  100 Harry   GBR
    #> 4        NA  103 Peter   GBR
    #> 5        NA  113  Will   GBR
    #> 6        NA  111 Jamie   GBR
    #> 7        NA  109 Brian   GBR
    #> 8        NA  112 Steve   GBR
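
The split above lines up only because `unlist()` flattens the pieces in row order and every cell in a row yields the same number of pieces. A minimal sketch of how that assumption could be checked, with the source row recorded in a helper column; `n_pieces` and `pair` are names introduced here for illustration and are not part of the original answer:

```r
# Same example data as in the answer.
df <- data.frame(
  newcolumn = rep("NA/NA", 4),
  code = c("121/102", "100/103", "113/111", "109/112"),
  name = c("John/James", "Harry/Peter", "Will/Jamie", "Brian/Steve"),
  place = rep("GBR/GBR", 4),
  stringsAsFactors = FALSE  # strsplit() needs character columns
)

# Number of pieces per row, taken from one column ...
n_pieces <- lengths(strsplit(df$code, "/", fixed = TRUE))

# ... and a check that every other column agrees with it.
same_counts <- sapply(df, function(x) {
  identical(lengths(strsplit(x, "/", fixed = TRUE)), n_pieces)
})
stopifnot(all(same_counts))

# Column-wise split, as in the answer.
df_split <- as.data.frame(lapply(df, function(x) {
  unlist(strsplit(x, split = "/", fixed = TRUE))
}), stringsAsFactors = FALSE)

# Hypothetical helper column: the original row each piece came from.
df_split$pair <- rep(seq_len(nrow(df)), times = n_pieces)
df_split
```

Keeping such an index around means the later recombination does not have to rely on row positions alone. Note also that the `NA` values in this example are the literal string "NA", not missing values.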

For the second part, a combination of `mapply()`, `paste()` and selecting alternating rows with `seq()` is one option:

    # Fill in the new column (letters serve as placeholder values here)
    df_split$newcolumn <- letters[seq_len(nrow(df_split))]

    # Paste the odd rows and the even rows together, column by column
    df_new <- mapply(paste,
                     df_split[seq(from = 1, to = nrow(df_split), by = 2), ],
                     df_split[seq(from = 2, to = nrow(df_split), by = 2), ],
                     SIMPLIFY = FALSE,
                     MoreArgs = list(sep = "/"))
    df_new <- as.data.frame(df_new)
    df_new
    #>   newcolumn    code        name   place
    #> 1       a/b 121/102  John/James GBR/GBR
    #> 2       c/d 100/103 Harry/Peter GBR/GBR
    #> 3       e/f 113/111  Will/Jamie GBR/GBR
    #> 4       g/h 109/112 Brian/Steve GBR/GBR

<sup>Created on 2023-06-06 with reprex v2.0.2</sup>
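
As an alternative sketch for the second part, the rows could be grouped by an explicit pair index and collapsed with base R's `aggregate()`. This swaps the `mapply()` approach for a different technique and is not from the original answer; the `pair` vector is a hypothetical helper, and the sketch assumes an even number of rows with each pair on consecutive rows:

```r
# The split data frame with the new column already filled in.
df_split <- data.frame(
  newcolumn = letters[1:8],
  code = c("121", "102", "100", "103", "113", "111", "109", "112"),
  name = c("John", "James", "Harry", "Peter",
           "Will", "Jamie", "Brian", "Steve"),
  place = rep("GBR", 8),
  stringsAsFactors = FALSE
)

# Assumption: rows 1-2 form pair 1, rows 3-4 form pair 2, and so on.
stopifnot(nrow(df_split) %% 2 == 0)
pair <- rep(seq_len(nrow(df_split) / 2), each = 2)

# Collapse every column within each pair, joining the values with "/".
df_new <- aggregate(df_split, by = list(pair = pair),
                    FUN = paste, collapse = "/")

# aggregate() puts the grouping column first; drop it again.
df_new$pair <- NULL
df_new
```

Because the grouping is carried by `pair` rather than by odd and even row positions, the same call would also work for groups of three or more rows if the index were built accordingly.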

huangapple
  • Posted on June 6, 2023, 17:28:58
  • Link to this article: https://go.coder-hub.com/76413235.html