Create count table for specific condition and then add column that creates count by group as a whole in R
Question
I have a table like this:
data1 <- data.frame("State" = c("NJ", "NJ", "PA", "NJ", "TX"), "Filter" = c("Filter", "Filter",
"No Filter", "Filter", "Filter"), "Threshold" = c("Exceeds","Exceeds", NA, "NL", "Exceeds"))
I'd like to group by State and Filter, count the number of times Threshold is "Exceeds" within each group, and put that value in a new column. I'd also like to count how many times each State and Filter combination occurs and put that in another column.
An example of what I'm looking for is below.
final_data <- data.frame("State" = c("NJ", "NJ", "PA", "NJ", "TX"), "Filter" = c("Filter",
"Filter", "No Filter", "Filter", "Filter"), "Threshold" = c("Exceeds", "Exceeds", NA, "NL",
"Exceeds"), Count_Exceeds_Threshold = c(2, 2, 0, 0, 1), Count_Total = c(3, 3, 1, 3, 1))
I've tried figuring this out with group_by and tally() in dplyr, but I can't get it to work the way I want.
Thank you!!
Answer 1
Score: 2
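You can use add_count() for both counts. Passing a logical condition as wt sums it within each group, which gives the number of rows where Threshold is "Exceeds":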
library(dplyr)

data1 %>%
  group_by(State, Filter) %>%
  add_count(wt = (Threshold == "Exceeds"), name = "Count_Exceeds_Threshold") %>%
  add_count(name = "Count_Total") %>%
  ungroup()

# # A tibble: 5 × 5
#   State Filter    Threshold Count_Exceeds_Threshold Count_Total
#   <chr> <chr>     <chr>                       <int>       <int>
# 1 NJ    Filter    Exceeds                         2           3
# 2 NJ    Filter    Exceeds                         2           3
# 3 PA    No Filter NA                              0           1
# 4 NJ    Filter    NL                              2           3
# 5 TX    Filter    Exceeds                         1           1
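
For readers who cannot load dplyr, the same two columns can probably be reproduced in base R. The following is only a sketch and not part of the original answer: it assumes the data1 frame from the question, and the helper vector exceeds is a name introduced here for illustration. It relies on ave() to apply a function within each State/Filter group and on %in% to treat a missing Threshold as "not exceeding".

```r
# Same sample data as in the question.
data1 <- data.frame(
  State = c("NJ", "NJ", "PA", "NJ", "TX"),
  Filter = c("Filter", "Filter", "No Filter", "Filter", "Filter"),
  Threshold = c("Exceeds", "Exceeds", NA, "NL", "Exceeds")
)

# TRUE where Threshold is "Exceeds"; %in% yields FALSE (not NA) for missing values.
# `exceeds` is a hypothetical helper name, not from the original answers.
exceeds <- data1$Threshold %in% "Exceeds"

# ave() applies FUN within each State/Filter group and returns a vector
# aligned with the original rows, so every row carries its group total.
data1$Count_Exceeds_Threshold <- ave(as.integer(exceeds), data1$State, data1$Filter, FUN = sum)
data1$Count_Total <- ave(rep(1L, nrow(data1)), data1$State, data1$Filter, FUN = sum)

print(data1)
```

Because ave() keeps the original row order, the result lines up row for row with the add_count() output above.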
Answer 2
Score: 1
You can use mutate() with .by for inline grouping (the .by argument is available in dplyr 1.1.0 and later) and sum the condition Threshold == "Exceeds" to count how often it holds in each group. n() returns the number of rows in each group.
library(dplyr)

data1 %>%
  mutate(Count_Exceeds_Threshold = sum(Threshold == "Exceeds", na.rm = TRUE),
         Count_Total = n(), .by = c(State, Filter))

#   State    Filter Threshold Count_Exceeds_Threshold Count_Total
# 1    NJ    Filter   Exceeds                       2           3
# 2    NJ    Filter   Exceeds                       2           3
# 3    PA No Filter      <NA>                       0           1
# 4    NJ    Filter        NL                       2           3
# 5    TX    Filter   Exceeds                       1           1
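
Note that this output puts the group-level count on every row, so the NJ row with Threshold "NL" gets 2, whereas final_data in the question shows 0 for that row. If the 0 is intentional (an assumption about the asker's intent, since the question does not say), one possible variant keeps the group count only on rows that themselves exceed the threshold. This is a sketch rather than the answerer's method; is_exceeds and result are hypothetical names introduced here, and .by again assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Same sample data as in the question.
data1 <- data.frame(
  State = c("NJ", "NJ", "PA", "NJ", "TX"),
  Filter = c("Filter", "Filter", "No Filter", "Filter", "Filter"),
  Threshold = c("Exceeds", "Exceeds", NA, "NL", "Exceeds")
)

result <- data1 %>%
  mutate(
    # Hypothetical helper column: TRUE only where Threshold is "Exceeds" (NA becomes FALSE).
    is_exceeds = Threshold %in% "Exceeds",
    # Group count on exceeding rows, 0 on all other rows (assumed intent of final_data).
    Count_Exceeds_Threshold = if_else(is_exceeds, sum(is_exceeds), 0L),
    Count_Total = n(),
    .by = c(State, Filter)
  ) %>%
  select(-is_exceeds)

print(result)
```

With the sample data this yields Count_Exceeds_Threshold values of 2, 2, 0, 0, 1, matching the final_data example in the question.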