Create a column of comma-separated strings from one column, grouped by another column, in R
Question
This is my dataset:
id text
1 "red"
1 "blue"
2 "light blue"
2 "red"
2 "yellow"
3 "dark green"
This is the result I want to obtain:
id text2
1 "red, blue"
2 "light blue, red, yellow"
3 "dark green"
Basically, I need to combine the values of the 'text' column within each id into a single string, with commas separating the elements.
Answer 1
Score: 2
Using aggregate and toString:
aggregate(. ~ id, d, toString)
# id text
# 1 1 red, blue
# 2 2 light blue, red, yellow
# 3 3 dark green
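For readers unfamiliar with toString(): in base R it collapses a vector into a single string with ", " between the elements, which for a character vector is the same as paste(x, collapse = ", "). The following is only a minimal sketch; the vector name x is illustrative and does not come from the answer.

```r
# Illustrative vector (hypothetical name), holding the values for id 2
x <- c("light blue", "red", "yellow")

# toString() joins the elements with ", "
joined <- toString(x)

# paste() with collapse gives the same result for a character vector
same <- identical(joined, paste(x, collapse = ", "))

# paste() is the more flexible choice if another separator is wanted
semi <- paste(x, collapse = "; ")

print(joined)
print(same)
print(semi)
```

aggregate() calls this function once per id, so each group ends up as one comma-separated string.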
Note: This won't work with factor columns, i.e. if is.factor(d$text) yields TRUE, you need a slightly different approach. Demonstration:
d$text <- as.factor(d$text) # convert the text column to a factor
is.factor(d$text)
# [1] TRUE
In that case, convert the column to character before aggregating:
aggregate(. ~ id, transform(d, text=as.character(text)), toString)
Data:
d <- structure(list(id = c(1L, 1L, 2L, 2L, 2L, 3L), text = c("red",
"blue", "light blue", "red", "yellow", "dark green")), row.names = c(NA,
-6L), class = "data.frame")
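The question asks for the result column to be named text2, whereas the aggregate call above keeps the name text. One possible way to put the pieces together is sketched below. It is not part of the original answer: it assumes the same sample data, names the column explicitly in the formula instead of using the dot, and the object name res is made up for illustration.

```r
# Sample data, same values as in the answer
d <- data.frame(
  id = c(1L, 1L, 2L, 2L, 2L, 3L),
  text = c("red", "blue", "light blue", "red", "yellow", "dark green"),
  stringsAsFactors = FALSE
)

# Guard against factor input; a no-op when text is already character
d$text <- as.character(d$text)

# Collapse the text values within each id into one string
res <- aggregate(text ~ id, data = d, FUN = toString)

# Rename the aggregated column to match the desired output
names(res)[names(res) == "text"] <- "text2"

print(res)
```

Converting to character up front keeps the sketch independent of how the data was read in, at the cost of one extra line.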
Answer 2
Score: 1
We can use dplyr:
library(dplyr)
df1 %>%
group_by(id) %>%
summarise(text2 = toString(text))
Data:
df1 <- structure(list(id = c(1L, 1L, 2L, 2L, 2L, 3L), text = c("red",
"blue", "light blue", "red", "yellow", "dark green")), row.names = c(NA,
-6L), class = "data.frame")