2023-02-19 05:12:48

R | How to arrange in custom order character vectors of a df column?

Question


I have a dataframe that looks like this:

Fruit	X	Y	Z
apple, banana, orange, papaya	a	f	k
banana, orange, grape	b	g	l
orange, banana	c	h	m
grape	d	i	n
banana, grape, orange, apple, papaya	e	j	o

And I want to set a custom order of appearance in each row. Like:

Apple
Orange
Papaya
Banana
Grape

So the column would look like:

Fruit	X	Y	Z
apple, orange, papaya, banana	a	f	k
orange, banana, grape	b	g	l
orange, banana	c	h	m
grape	d	i	n
apple, orange, papaya, banana, grape	e	j	o

How can I do this??? I've tried suggestions from other posts, but they're all about arranging dataframe rows, which isn't what I need...

P.S.: is there any way to do this inside a pipe?

Answer 1

Score: 4


We could do

library(dplyr)
library(stringr)
library(purrr)
# Split each string, reorder the pieces by their position in the custom order, then rejoin them
df1 <- df1 %>%
  mutate(Fruit = map_chr(strsplit(Fruit, ",\\s*"),
    ~ toString(.x[order(match(.x,
      c("apple", "orange", "papaya", "banana", "grape")))])))

-output

df1
                                  Fruit X Y Z
1        apple, orange, papaya, banana a f k
2                orange, banana, grape b g l
3                       orange, banana c h m
4                                grape d i n
5 apple, orange, papaya, banana, grape e j o
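
To see why the order(match(...)) idiom works, it may help to trace it on a single cell. The following is a minimal base R sketch, not part of the original answer; the variable names fruit_order, items, pos, sorted and result are made up for illustration.

```r
# Custom order, as given in the question
fruit_order <- c("apple", "orange", "papaya", "banana", "grape")

# One cell of the Fruit column, split into its individual items
items <- strsplit("banana, grape, orange, apple, papaya", ",\\s*")[[1]]

# match() gives the position of each item within fruit_order
pos <- match(items, fruit_order)

# order() turns those positions into an index that sorts the items
sorted <- items[order(pos)]

# toString() joins the sorted items back together with ", "
result <- toString(sorted)
print(pos)
print(result)
```

Here pos is 4 5 2 1 3 and result is "apple, orange, papaya, banana, grape", which matches row 5 of the output above.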

Or using separate_longer_delim

library(tidyr)
# regex() comes from stringr, which was loaded above
df1 <- df1 %>%
  mutate(rn = row_number()) %>%                               # remember the original row
  separate_longer_delim(Fruit, delim = regex(",\\s*")) %>%    # one fruit per row
  arrange(rn, factor(Fruit,
    levels = c("apple", "orange", "papaya", "banana", "grape"))) %>%
  reframe(Fruit = str_c(Fruit, collapse = ", "),
    .by = c("rn", "X", "Y", "Z")) %>%                         # collapse back to one row per rn
  select(-rn) %>%
  relocate(Fruit, .before = 1)

-output

df1
                                  Fruit X Y Z
1        apple, orange, papaya, banana a f k
2                orange, banana, grape b g l
3                       orange, banana c h m
4                                grape d i n
5 apple, orange, papaya, banana, grape e j o

If the column is a list column, we don't need strsplit; instead:

df1 <- df1 %>%
  mutate(Fruit = map(Fruit,
    ~ .x[order(match(.x, c("apple", "orange", "papaya", "banana", "grape")))]))

Or with unnest

df1 <- df1 %>%
  mutate(rn = row_number()) %>%
  unnest(Fruit) %>%
  arrange(rn, factor(Fruit,
    levels = c("apple", "orange", "papaya", "banana", "grape"))) %>%
  reframe(Fruit = list(Fruit),
    .by = c("rn", "X", "Y", "Z")) %>%
  select(-rn) %>%
  relocate(Fruit, .before = 1)

-output

df1
# A tibble: 5 × 4
  Fruit     X     Y     Z    
  <list>    <chr> <chr> <chr>
1 <chr [4]> a     f     k    
2 <chr [3]> b     g     l    
3 <chr [2]> c     h     m    
4 <chr [1]> d     i     n    
5 <chr [5]> e     j     o

data

df1 <- structure(list(Fruit = c("apple, banana, orange, papaya", "banana, orange, grape", 
"orange, banana", "grape", "banana, grape, orange, apple, papaya"
), X = c("a", "b", "c", "d", "e"), Y = c("f", "g", "h", "i", 
"j"), Z = c("k", "l", "m", "n", "o")), class = "data.frame", row.names = c(NA, 
-5L))
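
On the question of doing this inside a pipe: one option is to wrap the reordering in a small function so the pipeline step stays short. The sketch below uses only base R and assumes R 4.1 or later for the native |> pipe. The helper name reorder_items and the variable fruit_order are hypothetical and do not appear in the original answer.

```r
# Hypothetical helper (not from the original answer): reorders the
# comma-separated items of each string according to `lvls`.
reorder_items <- function(x, lvls) {
  vapply(
    strsplit(x, ",\\s*"),
    function(items) toString(items[order(match(items, lvls))]),
    character(1)
  )
}

fruit_order <- c("apple", "orange", "papaya", "banana", "grape")

# Same data as in the question
df1 <- data.frame(
  Fruit = c("apple, banana, orange, papaya", "banana, orange, grape",
            "orange, banana", "grape",
            "banana, grape, orange, apple, papaya"),
  X = c("a", "b", "c", "d", "e"),
  Y = c("f", "g", "h", "i", "j"),
  Z = c("k", "l", "m", "n", "o")
)

# Base R native pipe; transform() rewrites the Fruit column in place
res <- df1 |> transform(Fruit = reorder_items(Fruit, fruit_order))
print(res)
```

With dplyr loaded, the same helper should drop into the existing pipeline as mutate(Fruit = reorder_items(Fruit, fruit_order)).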

Answer 2

Score: 3


Here is one more (a tidyverse solution):

The main idea is to use separate_rows and then convert Fruit to a factor with the desired levels (the data frame is called df here):

library(dplyr)
library(tidyr)
df %>%
  group_by(group = row_number()) %>%
  separate_rows(Fruit) %>%
  mutate(Fruit = factor(Fruit, levels = c("apple", "orange", "papaya", "banana", "grape"))) %>%
  arrange(Fruit, .by_group = TRUE) %>%
  summarise(Fruit = toString(Fruit)) %>%
  bind_cols(df[2:4]) %>%
  select(-group)

  Fruit                                X     Y     Z    
  <chr>                                <chr> <chr> <chr>
1 apple, orange, papaya, banana        a     f     k    
2 orange, banana, grape                b     g     l    
3 orange, banana                       c     h     m    
4 grape                                d     i     n    
5 apple, orange, papaya, banana, grape e     j     o
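
One thing to watch with either approach is a value that is not in the custom order. The base R sketch below uses a made-up item "kiwi" and hypothetical variable names to compare the two idioms. With match(), the unknown item is kept and sorted last. With factor(), it becomes NA and its label is lost.

```r
fruit_order <- c("apple", "orange", "papaya", "banana", "grape")

# "kiwi" is a made-up item that is absent from fruit_order
items <- c("kiwi", "banana", "apple")

# match() returns NA for "kiwi"; order() puts NA last by default,
# so the original label survives at the end
via_match <- toString(items[order(match(items, fruit_order))])

# factor() converts "kiwi" to NA, so only the string "NA" remains
f <- factor(items, levels = fruit_order)
via_factor <- toString(f[order(f)])

print(via_match)
print(via_factor)
```

If the data may contain items outside the listed levels, the match() variant is the safer choice, or the extra items can be appended to the levels first.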
