Create column identifying speaker roles
# Question
A seemingly simple question: I have a character vector of `Speaker` IDs. I know that the `Speaker` value that occurs most often is the "Storyteller", whereas all the others are "Recipient". How can I create a column `Role` that identifies both "Storyteller" and "Recipient"?

```R
df <- data.frame(
  Speaker = c("A","B","A","A","C","A")
)
```

All I know so far is how to count the character values:

```R
library(dplyr)

df %>%
  group_by(Speaker) %>%
  count()
```

Desired result:

```
  Speaker        Role
1       A Storyteller
2       B   Recipient
3       A Storyteller
4       A Storyteller
5       C   Recipient
6       A Storyteller
```
# Answer 1

**Score**: 2
I'm sure there are more elegant ways, but a base R approach would be:

```R
# The most frequent speaker is the Storyteller; everyone else is a Recipient
df$Role <- ifelse(df$Speaker %in% names(which.max(table(df$Speaker))),
                  "Storyteller", "Recipient")
```

Output:

```
  Speaker        Role
1       A Storyteller
2       B   Recipient
3       A Storyteller
4       A Storyteller
5       C   Recipient
6       A Storyteller
```
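
One caveat worth illustrating: `which.max()` returns only the first maximum, so if two speakers are tied for the most turns, only one of them gets labeled. Below is a minimal sketch of a tie-aware variant, assuming that all tied speakers should count as "Storyteller". The data frame `df_tie` and the variable names are made up for illustration.

```r
# Hypothetical data in which "A" and "B" are tied for the most turns
df_tie <- data.frame(Speaker = c("A", "B", "A", "B", "C"),
                     stringsAsFactors = FALSE)

# Count turns per speaker
counts <- table(df_tie$Speaker)

# Keep every speaker whose count equals the maximum, not just the first one
top_speakers <- names(counts)[counts == max(counts)]

df_tie$Role <- ifelse(df_tie$Speaker %in% top_speakers,
                      "Storyteller", "Recipient")
df_tie
```

Whether tied speakers should all be treated as storytellers depends on the data; if exactly one storyteller is expected, a tie is probably worth flagging rather than resolving silently.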
# Answer 2

**Score**: 2
We could use `add_count` combined with an `ifelse` statement:

```R
library(dplyr)

df %>%
  add_count(Speaker) %>%
  mutate(Role = ifelse(n == max(n), "Storyteller", "Recipient"), .keep = "unused")
```

```
  Speaker        Role
1       A Storyteller
2       B   Recipient
3       A Storyteller
4       A Storyteller
5       C   Recipient
6       A Storyteller
```
# Answer 3

**Score**: 0
A simple way of doing it:

```R
library(dplyr)
s <- slice_max(count(df, Speaker), n)$Speaker  # top speaker(s); ties are kept
# %in% instead of == so that tied speakers are all matched
df$Role <- ifelse(df$Speaker %in% s, "Storyteller", "Recipient")
```
# Answer 4

**Score**: 0
I think you want `min`, as `max` will guide you to `C`.

```R
df$role = ifelse(df$Speaker == min(df$Speaker), 'Storyteller', 'Recipient')
df
```

```
  Speaker        role
1       A Storyteller
2       B   Recipient
3       A Storyteller
4       A Storyteller
5       C   Recipient
6       A Storyteller
```

Note that `min()` on a character vector returns the alphabetically first value, so this works here only because "A" happens to be both first alphabetically and the most frequent speaker.

BTW what FLIR were you using for your pupil size data?
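
A small counter-example may help show when the `min()` approach stops working. The sketch below uses a made-up data frame, `df_alt`, in which the most frequent speaker is not the alphabetically first one, and falls back on counting with `table()` and `which.max()`.

```r
# Hypothetical data: "C" speaks most often, but "A" is alphabetically first
df_alt <- data.frame(Speaker = c("C", "A", "C", "C", "B", "C"),
                     stringsAsFactors = FALSE)

# min() on a character vector compares strings, not frequencies
alphabetical_first <- min(df_alt$Speaker)

# The most frequent value has to come from the counts instead
most_frequent <- names(which.max(table(df_alt$Speaker)))

df_alt$Role <- ifelse(df_alt$Speaker == most_frequent,
                      "Storyteller", "Recipient")
df_alt
```

Here `alphabetical_first` and `most_frequent` differ, so labeling by `min()` would mark the wrong speaker as the storyteller.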
# Answer 5

**Score**: 0
Without using `min`/`max`; in case of a tie, all tied speakers are labeled "Storyteller".

```R
library(dplyr)

df %>% 
  add_count(Speaker) %>% 
  # order(n)[n()] is the row index of the largest count
  mutate(Role = c("Recipient", "Storyteller")[(n == n[order(n)[n()]]) + 1],
         n = NULL)
```

```
  Speaker        Role
1       A Storyteller
2       B   Recipient
3       A Storyteller
4       A Storyteller
5       C   Recipient
6       A Storyteller
```
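
The idea of using `order()` in place of `max()` may be easier to follow on a plain vector. Below is a small base R sketch using a made-up vector of per-row counts, `n_demo`. It is an illustration of the indexing, not a replacement for the pipeline above.

```r
# Hypothetical per-row counts, as add_count() would give for speakers B, A, B, C
n_demo <- c(2, 1, 2, 1)

# order() returns the row indices sorted by ascending count
idx <- order(n_demo)

# The last of those indices points at a row holding the largest count
largest <- n_demo[idx[length(n_demo)]]

# Every row whose count equals the largest one is a Storyteller, ties included
roles <- c("Recipient", "Storyteller")[(n_demo == largest) + 1]
roles
```

Adding 1 to the logical comparison turns `FALSE`/`TRUE` into the positions 1 and 2 of the two-element lookup vector.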
# Answer 6

**Score**: 0
One example using `case_match()`:

```R
library(dplyr)

# The first element of the reverse-sorted table is the most frequent speaker
df$Role <- case_match(df$Speaker,
                      names(rev(sort(table(df$Speaker)))[1]) ~ "Storyteller",
                      .default = "Recipient")
df
```

```
  Speaker        Role
1       A Storyteller
2       B   Recipient
3       A Storyteller
4       A Storyteller
5       C   Recipient
6       A Storyteller
```
# Answer 7

**Score**: 0
You could do:

```R
with(df, c('Recipient', 'Storyteller')[(Speaker == names(which.max(table(Speaker)))) + 1L])
# [1] "Storyteller" "Recipient"   "Storyteller" "Storyteller" "Recipient"   "Storyteller"
```

which is faster than `ifelse`.
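
The speed claim can be checked with a rough timing sketch. The input below is synthetic: one million speaker IDs sampled so that "A" is the most frequent, which is an assumption made purely for illustration. Timings from `system.time()` vary by machine, so no numbers are given here.

```r
# Hypothetical larger input: 1e6 speaker IDs, with "A" sampled most often
set.seed(1)
speaker <- sample(c("A", "B", "C"), 1e6, replace = TRUE,
                  prob = c(0.6, 0.2, 0.2))
top <- names(which.max(table(speaker)))

# Approach 1: ifelse()
t_ifelse <- system.time(
  r1 <- ifelse(speaker == top, "Storyteller", "Recipient")
)

# Approach 2: index a two-element lookup vector with the logical test + 1
t_index <- system.time(
  r2 <- c("Recipient", "Storyteller")[(speaker == top) + 1L]
)

# Both approaches should produce exactly the same labels
same_result <- identical(r1, r2)
same_result

# Elapsed seconds for each approach
c(ifelse = t_ifelse[["elapsed"]], index = t_index[["elapsed"]])
```

For a six-row data frame the difference is negligible, so readability is probably the better criterion unless the vector is large.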