Split a dataframe and paste it back together

Question

I have a dataframe (snippet shown below) whose cells I want to split so that I can assign other information to them and then paste them back together. The issue I'm having is splitting the cells while keeping each row together. Here is an example of what I have and what I'm trying to do:

Current df:
newcolumn  code     name         place
NA/NA      121/102  John/James   GBR/GBR
NA/NA      100/103  Harry/Peter  GBR/GBR
NA/NA      113/111  Will/Jamie   GBR/GBR
NA/NA      109/112  Brian/Steve  GBR/GBR

I now wish to separate this df into something like this:

newcolumn  code  name   place
NA         121   John   GBR
NA         102   James  GBR
NA         100   Harry  GBR
NA         103   Peter  GBR
NA         113   Will   GBR
NA         111   Jamie  GBR
NA         109   Brian  GBR
NA         112   Steve  GBR

Then, after I have filled in newcolumn, I want to be able to paste the rows back together (maybe using a loop?), pairing rows 1 and 2, rows 3 and 4, and so on.

Answer 1

Score: 1

A combination of `strsplit()` and `lapply()` gives the desired result for the 1st part:
df <- data.frame(
  newcolumn = rep("NA/NA", 4),
  code = c("121/102", "100/103", "113/111", "109/112"),
  name = c("John/James", "Harry/Peter", "Will/Jamie",
           "Brian/Steve"),
  place = c("GBR/GBR", "GBR/GBR", "GBR/GBR", "GBR/GBR")
)

df
#>   newcolumn    code        name   place
#> 1     NA/NA 121/102  John/James GBR/GBR
#> 2     NA/NA 100/103 Harry/Peter GBR/GBR
#> 3     NA/NA 113/111  Will/Jamie GBR/GBR
#> 4     NA/NA 109/112 Brian/Steve GBR/GBR

# Split each column on "/" and flatten the pieces into one long vector per column
df_split <-
  as.data.frame(lapply(df, function(x) {
    unlist(strsplit(x, split = "/", fixed = TRUE))
  }))

df_split
#>   newcolumn code  name place
#> 1        NA  121  John   GBR
#> 2        NA  102 James   GBR
#> 3        NA  100 Harry   GBR
#> 4        NA  103 Peter   GBR
#> 5        NA  113  Will   GBR
#> 6        NA  111 Jamie   GBR
#> 7        NA  109 Brian   GBR
#> 8        NA  112 Steve   GBR
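
The split above relies on every cell in a row containing the same number of `/`-separated pieces. To make it easier to see which split rows belong together, one option is to record the original row number next to the split values. The following is only a minimal sketch under that assumption and is not part of the original answer; the helper name `split_rows()` and the column name `pair_id` are made up for illustration.

```r
# Hypothetical helper: split every column on `sep` and record the source row.
split_rows <- function(df, sep = "/") {
  pieces <- lapply(df, function(x) {
    strsplit(as.character(x), split = sep, fixed = TRUE)
  })
  # Number of pieces per cell, for each column; columns must agree row by row.
  counts <- lapply(pieces, lengths)
  stopifnot(all(vapply(counts, identical, logical(1), counts[[1]])))
  out <- as.data.frame(lapply(pieces, unlist), stringsAsFactors = FALSE)
  # pair_id (assumed name) marks which original row each new row came from.
  out$pair_id <- rep(seq_len(nrow(df)), times = counts[[1]])
  out
}

df <- data.frame(
  newcolumn = rep("NA/NA", 4),
  code = c("121/102", "100/103", "113/111", "109/112"),
  name = c("John/James", "Harry/Peter", "Will/Jamie", "Brian/Steve"),
  place = rep("GBR/GBR", 4)
)

df_split <- split_rows(df)
```

With the example data, rows 1 and 2 of `df_split` share `pair_id` 1, rows 3 and 4 share `pair_id` 2, and so on, so the pairing survives even if the rows are later filtered or reordered.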

For the second part, a combination of `mapply()`, `paste()` and selecting alternating rows with `seq()` is one option:

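# Fill in newcolumn (placeholder letters, for illustration only)
df_split$newcolumn <- letters[seq_len(nrow(df_split))]

# Paste odd rows (1, 3, 5, ...) and even rows (2, 4, 6, ...) together, column by column
df_new <- mapply(paste,
                 df_split[seq(from = 1, to = nrow(df_split), by = 2), ],
                 df_split[seq(from = 2, to = nrow(df_split), by = 2), ],
                 SIMPLIFY = FALSE,
                 MoreArgs = list(sep = "/"))
df_new <- as.data.frame(df_new)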

df_new
#>   newcolumn    code        name   place
#> 1       a/b 121/102  John/James GBR/GBR
#> 2       c/d 100/103 Harry/Peter GBR/GBR
#> 3       e/f 113/111  Will/Jamie GBR/GBR
#> 4       g/h 109/112 Brian/Steve GBR/GBR

<sup>Created on 2023-06-06 with reprex v2.0.2</sup>
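
The `seq()`-based approach assumes the rows still alternate strictly (1 and 2, 3 and 4, ...). If the halves might no longer sit in adjacent rows, or a cell was split into more than two pieces, grouping on an explicit id is one possible alternative. This is a minimal sketch, not part of the original answer, and it assumes the split data frame carries a hypothetical `pair_id` column identifying the original row.

```r
df_split <- data.frame(
  newcolumn = letters[1:8],
  code = c("121", "102", "100", "103", "113", "111", "109", "112"),
  name = c("John", "James", "Harry", "Peter",
           "Will", "Jamie", "Brian", "Steve"),
  place = rep("GBR", 8),
  pair_id = rep(1:4, each = 2)  # assumed column: source row of each split row
)

# Columns to collapse: everything except the grouping id.
cols <- setdiff(names(df_split), "pair_id")

# Within each pair_id group, join the values of every column with "/".
df_new <- aggregate(
  df_split[cols],
  by = list(pair_id = df_split$pair_id),
  FUN = paste, collapse = "/"
)
```

Within each group the values are joined in the order in which the rows appear in `df_split`, so any reordering inside a pair would also change the order of the pasted pieces.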

huangapple
  • Published on June 6, 2023, 17:28:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76413235.html