Count the rows by group and get the proportion of different columns
Question
I am trying to find the ratio of CGM prescriptions to the number of diabetes patients at each location. My actual data looks like this:
Location | Diabetes present | CGM prescribed |
---|---|---|
CA | 1 | 1 |
TX | 1 | 0 |
TX | 1 | 1 |
CA | 1 | 0 |
AZ | 1 | 1 |
AZ | 1 | 0 |
AZ | 1 | 1 |
TX | 1 | 0 |
Desired output:
location | TotalDiabetes | total CGM | proportion (total cgm/ total diabetes) |
---|---|---|---|
CA | 2 | 1 | 0.5 |
TX | 3 | 1 | 0.33 |
AZ | 3 | 2 | 0.66 |
Answer 1
Score: 4
We can get the sum of the numeric columns by 'Location' and then create the proportion column by dividing the total columns:
library(dplyr) # version >= 1.1.0
library(stringr)
df1 %>%
reframe(across(everything(), ~ sum(.x, na.rm = TRUE),
.names = "Total_{str_remove(.col, ' .*')}"), .by = "Location") %>%
mutate(proportion = round(Total_CGM/Total_Diabetes, 2))
-output
Location Total_Diabetes Total_CGM proportion
1 CA 2 1 0.50
2 TX 3 1 0.33
3 AZ 3 2 0.67
Or with base R
transform(aggregate(.~ Location, df1, sum),
proportion = round(`CGM prescribed`/`Diabetes present`, 2),
check.names = FALSE)
-output
Location Diabetes present CGM prescribed proportion
1 AZ 3 2 0.67
2 CA 2 1 0.50
3 TX 3 1 0.33
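Note that `aggregate` returns the groups in sorted order, which is why AZ comes first in the output above. If the order of first appearance (CA, TX, AZ) matters, one option is to turn Location into a factor before aggregating. The following is a minimal sketch rather than part of the original answer; the names `loc` and `agg` are only for illustration.

```r
# Same data as in the question
df1 <- data.frame(
  Location = c("CA", "TX", "TX", "CA", "AZ", "AZ", "AZ", "TX"),
  `Diabetes present` = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  `CGM prescribed` = c(1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L),
  check.names = FALSE
)

# Factor levels in order of first appearance, so groups are not re-sorted alphabetically
loc <- factor(df1$Location, levels = unique(df1$Location))

# Sum every non-grouping column within each location
agg <- aggregate(df1[-1], by = list(Location = loc), FUN = sum)

# Ratio of the two totals, rounded to two decimals
agg$proportion <- round(agg[["CGM prescribed"]] / agg[["Diabetes present"]], 2)
agg
```

The rows should then come back as CA, TX, AZ, matching the order of the dplyr output.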
data
df1 <- structure(list(Location = c("CA", "TX", "TX", "CA", "AZ", "AZ",
"AZ", "TX"), `Diabetes present` = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), `CGM prescribed` = c(1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L)),
class = "data.frame", row.names = c(NA,
-8L))
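The desired output in the question shows 0.66 for AZ, while both outputs above show 0.67. The difference is only rounding versus truncation of 2/3. Here is a small sketch, in case truncated values are really what is wanted; the `trunc2` helper is hypothetical and not part of the answer.

```r
p <- 2 / 3                      # AZ: 2 CGM prescriptions over 3 diabetes patients

round(p, 2)                     # rounds to the nearest value: 0.67

# Hypothetical helper: drop everything after the second decimal place
trunc2 <- function(x) trunc(x * 100) / 100

trunc2(p)                       # 0.66
trunc2(c(1 / 2, 1 / 3, 2 / 3))  # 0.50 0.33 0.66
```

In most reporting contexts `round` is probably the safer choice, since truncation always biases the value downward.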
Answer 2
Score: 2
Here's a solution in data.table.
library(data.table)
setnames(setDT(df1)[, lapply(.SD, sum), .(Location), .SDcols = -1][,
proportion := do.call(`/`, .SD), .(Location), .SDcols = 3:2],
names(df1)[-1], paste0("Total ", sub(" .*", "", names(df1)[-1])))[]
# Location Total Diabetes Total CGM proportion
# 1: CA 2 1 0.5000000
# 2: TX 3 1 0.3333333
# 3: AZ 3 2 0.6666667
Answer 3
Score: 2
Another dplyr way:
library(dplyr)
# df1 is the data frame from the question; backticks are needed because its column names contain spaces
df1 %>%
  mutate(Location = factor(Location, levels = c("CA", "TX", "AZ"))) %>%
  group_by(Location) %>%
  summarise(TotalDiabetes = sum(`Diabetes present`),
            Total_CGM = sum(`CGM prescribed`),
            Proportion = Total_CGM/TotalDiabetes)
Location TotalDiabetes Total_CGM Proportion
<fct> <int> <int> <dbl>
1 CA 2 1 0.5
2 TX 3 1 0.333
3 AZ 3 2 0.667
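All of the approaches above divide one total by the other, which assumes every location has at least one diabetes patient. That holds in the sample data, but if some location had a total of 0 the division would produce NaN (or Inf). A hedged base R sketch with made-up totals follows; the NV row is invented purely for illustration.

```r
# Made-up totals: NV is an invented location with no diabetes patients
totals <- data.frame(
  Location      = c("CA", "TX", "AZ", "NV"),
  TotalDiabetes = c(2L, 3L, 3L, 0L),
  Total_CGM     = c(1L, 1L, 2L, 0L)
)

# Plain division: 0/0 is NaN in R
totals$Total_CGM / totals$TotalDiabetes

# Return NA instead whenever the denominator is 0
totals$Proportion <- ifelse(totals$TotalDiabetes > 0,
                            round(totals$Total_CGM / totals$TotalDiabetes, 2),
                            NA_real_)
totals
```

Whether NA, 0, or dropping the row is the right treatment depends on how the proportions are used downstream.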