Create count table for specific condition and then add column that creates count by group as a whole in R

Question

I have a table like this:

    data1 <- data.frame("State" = c("NJ", "NJ", "PA", "NJ", "TX"),
                        "Filter" = c("Filter", "Filter", "No Filter", "Filter", "Filter"),
                        "Threshold" = c("Exceeds", "Exceeds", NA, "NL", "Exceeds"))

I'd like to create a count table that groups by State and Filter, counts the number of times Threshold is "Exceeds" within each group, and puts that value in a new column. I'd then like to count the number of times each State and Filter combination occurs and put that in another column.
An example of what I'm looking for is below.

    final_data <- data.frame("State" = c("NJ", "NJ", "PA", "NJ", "TX"),
                             "Filter" = c("Filter", "Filter", "No Filter", "Filter", "Filter"),
                             "Threshold" = c("Exceeds", "Exceeds", NA, "NL", "Exceeds"),
                             Count_Exceeds_Threshold = c(2, 2, 0, 0, 1),
                             Count_Total = c(3, 3, 1, 3, 1))

I've tried figuring this out with group_by and tally() in dplyr, but I can't get it to work the way I want.

Thank you!!

Answer 1

Score: 2

You can use add_count() for both purposes:

    library(dplyr)

    data1 %>%
      group_by(State, Filter) %>%
      add_count(wt = (Threshold == "Exceeds"), name = "Count_Exceeds_Threshold") %>%
      add_count(name = "Count_Total") %>%
      ungroup()
    # # A tibble: 5 × 5
    #   State Filter    Threshold Count_Exceeds_Threshold Count_Total
    #   <chr> <chr>     <chr>                       <int>       <int>
    # 1 NJ    Filter    Exceeds                         2           3
    # 2 NJ    Filter    Exceeds                         2           3
    # 3 PA    No Filter NA                              0           1
    # 4 NJ    Filter    NL                              2           3
    # 5 TX    Filter    Exceeds                         1           1
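
If dplyr is not available, the same two columns can be sketched in base R with ave(). This is a different technique from the add_count() call in this answer, shown only for comparison; the helper name is_exceeds is made up for this example.

```r
# Same sample data as in the question
data1 <- data.frame(
  State = c("NJ", "NJ", "PA", "NJ", "TX"),
  Filter = c("Filter", "Filter", "No Filter", "Filter", "Filter"),
  Threshold = c("Exceeds", "Exceeds", NA, "NL", "Exceeds")
)

# TRUE where Threshold is "Exceeds"; %in% returns FALSE (not NA) for missing values
is_exceeds <- data1$Threshold %in% "Exceeds"   # hypothetical helper name

# ave() applies FUN within each State/Filter group and returns one value per row
data1$Count_Exceeds_Threshold <- ave(as.integer(is_exceeds),
                                     data1$State, data1$Filter, FUN = sum)

# Group size: length() of the row indices that fall in each group
data1$Count_Total <- ave(seq_len(nrow(data1)),
                         data1$State, data1$Filter, FUN = length)

print(data1)
```

Because ave() returns a vector aligned with the original rows, no join back onto data1 is needed.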

Answer 2

Score: 1

You can use mutate() with .by for inline grouping, and count the number of times Threshold == "Exceeds". n() is used to get the number of rows in each group.

    library(dplyr)

    data1 %>%
      mutate(Count_Exceeds_Threshold = sum(Threshold == "Exceeds", na.rm = TRUE),
             Count_Total = n(), .by = c(State, Filter))
    #   State    Filter Threshold Count_Exceeds_Threshold Count_Total
    # 1    NJ    Filter   Exceeds                       2           3
    # 2    NJ    Filter   Exceeds                       2           3
    # 3    PA No Filter      <NA>                       0           1
    # 4    NJ    Filter        NL                       2           3
    # 5    TX    Filter   Exceeds                       1           1
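
One thing to note: the final_data in the question lists 0 in Count_Exceeds_Threshold for the fourth row (NJ, Filter, NL), while the group-level count returns 2 there. If that 0 was intended, meaning the count should only appear on rows that themselves exceed, a possible variant is sketched below. It assumes dplyr 1.1.0 or later (needed for .by), and the helper column exceeds is a name introduced here for illustration only.

```r
library(dplyr)

# Same sample data as in the question
data1 <- data.frame(
  State = c("NJ", "NJ", "PA", "NJ", "TX"),
  Filter = c("Filter", "Filter", "No Filter", "Filter", "Filter"),
  Threshold = c("Exceeds", "Exceeds", NA, "NL", "Exceeds")
)

result <- data1 %>%
  mutate(
    # TRUE only where Threshold is "Exceeds"; %in% treats NA as FALSE
    exceeds = Threshold %in% "Exceeds",
    # group-level count, kept only on rows that themselves exceed (assumption
    # about what the asker wanted; other rows get 0)
    Count_Exceeds_Threshold = if_else(exceeds, sum(exceeds), 0L),
    Count_Total = n(),
    .by = c(State, Filter)
  ) %>%
  select(-exceeds)  # drop the temporary helper column

print(result)
```

With the sample data this should give the Count_Exceeds_Threshold and Count_Total values listed in final_data, although as integers rather than doubles.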

huangapple
  • Posted on July 27, 2023 at 19:35:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76779334.html