Create column identifying speaker roles

Question


A seemingly simple question: I have a character vector of Speaker IDs. I know that the Speaker value which occurs most often is the "Storyteller", whereas the others are "Recipient". How can I create a column Role that identifies both "Storyteller" and "Recipient"?

    df <- data.frame(
      Speaker = c("A","B","A","A","C","A")
    )

All I know so far is how to count the character values:

    library(dplyr)
    df %>%
      group_by(Speaker) %>%
      count()

Desired result:

      Speaker        Role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller

Answer 1

Score: 2

I'm sure there are more elegant ways, but a base R approach would be:

    df$Role <- ifelse(df$Speaker %in% names(which.max(table(df$Speaker))),
                      "Storyteller", "Recipient")

Output:

      Speaker        Role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller
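One thing to keep in mind with the `which.max()` approach: it returns only the first maximum, so if two speakers are tied for the most turns, only one of them is picked. Below is a minimal sketch of a tie-aware variant, assuming that all tied speakers should count as Storyteller. The vector `speaker_tied` is made-up illustration data, not from the question.

```R
# Hypothetical data with a tie: A and B both speak twice
speaker_tied <- c("A", "B", "A", "B", "C")

tab <- table(speaker_tied)

# which.max() keeps only the first of the tied maxima
first_only <- names(which.max(tab))

# Comparing against max(tab) keeps every tied speaker
all_top <- names(tab[tab == max(tab)])

role <- ifelse(speaker_tied %in% all_top, "Storyteller", "Recipient")
print(data.frame(Speaker = speaker_tied, Role = role))
```

Whether a tie should produce several Storytellers or none is a decision about the data, so this is only one possible choice.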
Answer 2

Score: 2

We could use `add_count` combined with an `ifelse` statement:

```R
library(dplyr)

df %>%
  add_count(Speaker) %>%
  mutate(Role = ifelse(n == max(n), "Storyteller", "Recipient"), .keep = "unused")

#   Speaker        Role
# 1       A Storyteller
# 2       B   Recipient
# 3       A Storyteller
# 4       A Storyteller
# 5       C   Recipient
# 6       A Storyteller
```
Answer 3

Score: 0

A simple way of doing it:

```R
library(dplyr)
# slice_max() keeps every tied speaker, so match with %in% rather than ==
s <- slice_max(count(df, Speaker), n)$Speaker
df$Role <- ifelse(df$Speaker %in% s, "Storyteller", "Recipient")
```

Answer 4

Score: 0

I think you want min, as max will guide you to C. Note that `min()` on a character vector returns the alphabetically first value, not the most frequent one, so this works here only because A happens to be both.

    df$role = ifelse(df$Speaker == min(df$Speaker), 'Storyteller', 'Recipient')
    df
      Speaker        role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller

BTW what FLIR were you using for your pupil size data?
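A quick check of what `min()` does on a character vector may help here. The sketch below uses made-up data (`speaker_alt` is an assumption for illustration, not from the question) in which the most frequent speaker is not the alphabetically first one.

```R
# Hypothetical data: C speaks most often, A only once
speaker_alt <- c("C", "B", "C", "C", "A", "C")

# min() on a character vector returns the alphabetically first value
by_min <- min(speaker_alt)

# The most frequent value comes from the frequency table instead
by_count <- names(which.max(table(speaker_alt)))

role <- ifelse(speaker_alt == by_count, "Storyteller", "Recipient")
print(role)
```

With this data `by_min` is "A" while `by_count` is "C", so the two approaches disagree as soon as the storyteller does not sort first.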

Answer 5

Score: 0

Without using min/max; in case of a draw, all tied speakers are labelled.

    library(dplyr)
    df %>%
      add_count(Speaker) %>%
      # order(n)[n()] is the index of the row with the largest count
      mutate(Role = c("Recipient", "Storyteller")[(n == n[order(n)[n()]]) + 1],
             n = NULL)
      Speaker        Role
    1       A Storyteller
    2       B   Recipient
    3       A Storyteller
    4       A Storyteller
    5       C   Recipient
    6       A Storyteller
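A sketch of the draw case this answer has in mind, using made-up data (`df_tied` is an assumption, not from the question) where A and B both speak twice. It reads the largest count from the position that `order()` puts last, so `min()` and `max()` are still avoided.

```R
library(dplyr)

# Hypothetical data with a draw between A and B
df_tied <- data.frame(Speaker = c("A", "B", "A", "B", "C"))

res <- df_tied %>%
  add_count(Speaker) %>%
  # order(n)[n()] is the index of the row holding the largest count
  mutate(Role = c("Recipient", "Storyteller")[(n == n[order(n)[n()]]) + 1],
         n = NULL)

print(res)
```

Both A and B end up as Storyteller here, and the helper column `n` is dropped again by `n = NULL`.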

Answer 6

Score: 0

One example using `case_match()`:

```R
library(dplyr)

# Tabulate the Speaker column only, so an existing Role column does not interfere
df$Role = case_match(df$Speaker,
                     names(rev(sort(table(df$Speaker)))[1]) ~ "Storyteller",
                     .default = "Recipient")
df
#   Speaker        Role
# 1       A Storyteller
# 2       B   Recipient
# 3       A Storyteller
# 4       A Storyteller
# 5       C   Recipient
# 6       A Storyteller
```

Answer 7

Score: 0

You could do:

    with(df, c('Recipient', 'Storyteller')[(Speaker == names(which.max(table(Speaker)))) + 1L])
    # [1] "Storyteller" "Recipient"   "Storyteller" "Storyteller" "Recipient"   "Storyteller"

which is faster than ifelse.
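The speed claim can be checked with a rough timing sketch. The vector size and the `big` sample below are assumptions for illustration; actual timings depend on the machine and R version, so none are shown.

```R
set.seed(1)

# Hypothetical large input in which A is clearly the most frequent speaker
big <- sample(c("A", "B", "C"), 1e6, replace = TRUE, prob = c(0.6, 0.2, 0.2))
top <- names(which.max(table(big)))

# Logical indexing into a two-element lookup vector
t_index <- system.time(
  r1 <- c("Recipient", "Storyteller")[(big == top) + 1L]
)

# The equivalent ifelse() call
t_ifelse <- system.time(
  r2 <- ifelse(big == top, "Storyteller", "Recipient")
)

# Elapsed seconds for each approach
print(c(index = t_index[["elapsed"]], ifelse = t_ifelse[["elapsed"]]))
```

Both calls produce the same character vector, so the choice between them is about speed and readability rather than results.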

huangapple
  • Posted on July 24, 2023, 00:36:44
  • When reposting, please keep this link: https://go.coder-hub.com/76749308.html