Create column identifying speaker roles

Question

A seemingly simple question: I have a character vector of speaker IDs. I know that the Speaker value that occurs most often is the "Storyteller", whereas all others are "Recipient". How can I create a column Role that labels each row as either "Storyteller" or "Recipient"?

    df <- data.frame(
      Speaker = c("A", "B", "A", "A", "C", "A")
    )

All I know so far is how to count the character values:

    library(dplyr)

    df %>%
      group_by(Speaker) %>%
      count()

Desired result:

      Speaker        Role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller

Answer 1

Score: 2

I'm sure there are more elegant ways, but a base R approach would be:

    df$Role <- ifelse(df$Speaker %in% names(which.max(table(df$Speaker))),
                      "Storyteller", "Recipient")

Output:

      Speaker        Role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller
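One thing to keep in mind with `which.max()` is that it returns only the first maximum, so if two speakers are tied for the highest count, only one of them ends up labelled "Storyteller". The following is a minimal sketch of a tie-aware variant, assuming you would want every tied speaker labelled; the data frame `df_tie` and the helper variable `top` are made up for illustration.

```r
# Hypothetical data in which "A" and "B" are tied with two turns each
df_tie <- data.frame(Speaker = c("A", "B", "A", "B", "C"))

# Count turns per speaker
counts <- table(df_tie$Speaker)

# Keep every speaker whose count equals the maximum, not just the first one
top <- names(counts)[counts == max(counts)]

df_tie$Role <- ifelse(df_tie$Speaker %in% top, "Storyteller", "Recipient")
df_tie
```

Whether a tie should yield several storytellers or none at all depends on the data, so treat this as one possible convention rather than the intended behaviour.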



Answer 2

Score: 2

We could use `add_count` combined with an `ifelse` statement:

```R
library(dplyr)

df %>%
  add_count(Speaker) %>%   # adds a column n with the number of rows per speaker
  mutate(Role = ifelse(n == max(n), "Storyteller", "Recipient"),
         .keep = "unused") # drops n, since it was used to compute Role

#>   Speaker        Role
#> 1       A Storyteller
#> 2       B   Recipient
#> 3       A Storyteller
#> 4       A Storyteller
#> 5       C   Recipient
#> 6       A Storyteller
```



Answer 3

Score: 0

A simple way of doing it:

```R
library(dplyr)

# Speaker(s) with the highest count; slice_max() keeps ties by default
s <- slice_max(count(df, Speaker), n)$Speaker

# %in% rather than == so that a tie (length(s) > 1) is handled correctly
df$Role <- ifelse(df$Speaker %in% s, "Storyteller", "Recipient")
```
Answer 4

Score: 0

I think you want min, as max will guide you to C.

    df$role = ifelse(df$Speaker == min(df$Speaker), 'Storyteller', 'Recipient')
    df
      Speaker        role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller

Note that min(df$Speaker) returns the alphabetically first value ("A" here) rather than the most frequent one, so this matches the desired result only when the storyteller's ID happens to sort first.

BTW what FLIR were you using for your pupil size data?
Answer 5

Score: 0

Without using min/max; in case of a draw, all tied speakers are labelled.

    library(dplyr)

    df %>%
      add_count(Speaker) %>%
      # order(n)[n()] is the row index of the largest count
      mutate(Role = c("Recipient", "Storyteller")[(n == n[order(n)[n()]]) + 1],
             n = NULL)
      Speaker        Role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller

Answer 6

Score: 0

One example using `case_match()`:

```R
library(dplyr)

# rev(sort(table(...)))[1] is the most frequent speaker; counting df$Speaker
# directly keeps this working even after df gains further columns
df$Role <- case_match(df$Speaker,
                      names(rev(sort(table(df$Speaker)))[1]) ~ "Storyteller",
                      .default = "Recipient")

df
#>   Speaker        Role
#> 1       A Storyteller
#> 2       B   Recipient
#> 3       A Storyteller
#> 4       A Storyteller
#> 5       C   Recipient
#> 6       A Storyteller
```

Answer 7

Score: 0

You could do:

    with(df, c('Recipient', 'Storyteller')[(Speaker == names(which.max(table(Speaker)))) + 1L])
    # [1] "Storyteller" "Recipient"   "Storyteller" "Storyteller" "Recipient"   "Storyteller"

which is faster than ifelse.
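To check the speed claim on your own machine, something like the following rough sketch could be used. It assumes a made-up vector of one million speaker IDs (`speaker`) and simply times both approaches with `system.time()`; actual timings will vary with hardware and R version, so no numbers are given here.

```r
# Larger made-up sample so that timing differences become visible
set.seed(1)
speaker <- sample(c("A", "B", "C"), 1e6, replace = TRUE, prob = c(0.6, 0.2, 0.2))

# Most frequent speaker
top <- names(which.max(table(speaker)))

# Approach 1: ifelse()
t_ifelse <- system.time(
  role_ifelse <- ifelse(speaker == top, "Storyteller", "Recipient")
)

# Approach 2: index into a two-element lookup vector (FALSE -> 1, TRUE -> 2)
t_index <- system.time(
  role_index <- c("Recipient", "Storyteller")[(speaker == top) + 1L]
)

# Both approaches should produce exactly the same labels
stopifnot(identical(role_ifelse, role_index))

# Elapsed seconds for each approach
c(ifelse = t_ifelse[["elapsed"]], index = t_index[["elapsed"]])
```

For more reliable measurements, a dedicated benchmarking package would be a better fit than a single `system.time()` call.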

huangapple
  • Posted on July 24, 2023, 00:36:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76749308.html