Calculate the percentage of occurrence of a factor out of the total number of occurrences of a grouping factor
Question
I have a sample dataset:
Species <- c("Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass")
FishID <- c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis", "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
df <- data.frame(Species, FishID, Prey)
With Bass as the predator, there are 3 individual fish, identified by FishID: a1, a2 and a3. For each prey species, I would like to calculate the percentage of individual fish (FishID) in which it occurs.
So in this case:
Amphipods occur 3 times, i.e. in the stomachs of all three individuals, so 100%; the same goes for Mysis. Polychaetes, however, are found in only two of the three stomachs, so that would be 66.6%. Molluscs are found in only one, so 33.3%.
As an end result, I am looking for something like this:
Species <- c("Bass", "Bass", "Bass", "Bass")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Mollusca")
Percentage <- c(100, 100, 66.6, 33.3)
df2 <- data.frame(Species, Prey, Percentage)
I tried this:
library(dplyr)

df %>%
  group_by(Species, Prey) %>%
  summarise(n = n()) %>%
  mutate(percent = n / sum(n) * 100)
But it isn't giving me what I want.
Any help is welcome.
Thank you in advance!
Answer 1
Score: 1
You only need to change one small thing: instead of dividing by sum(n), divide by length(unique(FishID)). sum(n) is the total number of prey records (9 here), whereas length(unique(FishID)) is the number of individual fish (3). Also note that the last element of FishID has to be a3, not A3.
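library(dplyr)

# FishID must also exist as a standalone vector: after summarise() the column
# is gone, so FishID inside mutate() refers to this variable
FishID <- c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3")

df %>%
  summarise(n = n(), .by = Prey) %>%  # .by needs dplyr >= 1.1.0
  mutate(percent = n / length(unique(FishID)) * 100)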
         Prey n   percent
1  Amphipoden 3 100.00000
2       Mysis 3 100.00000
3 Polychaeten 2  66.66667
4    Mollusca 1  33.33333
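A variant of the same idea, as a minimal sketch rather than the answer's own code: it counts distinct FishID values per prey instead of rows, and takes the number of fish per Species from the data frame itself, so it does not depend on a standalone FishID vector. The names n_fish and result are illustrative only.

```r
library(dplyr)

# Sample data from the question, rebuilt so the block is self-contained
df <- data.frame(
  Species = "Bass",
  FishID  = c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3"),
  Prey    = c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis",
              "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
)

result <- df %>%
  group_by(Species) %>%
  mutate(n_fish = n_distinct(FishID)) %>%  # individual fish per predator species
  group_by(Species, Prey) %>%
  summarise(percent = n_distinct(FishID) / first(n_fish) * 100,
            .groups = "drop")

result
```

Because the denominator is computed within each Species, this should also carry over to data with more than one predator species, assuming each FishID belongs to exactly one Species.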
Answer 2
Score: 1
library(tidyverse)

df |>
  # only count Prey once per Species/FishID (which I presume always go together)
  distinct(Species, FishID, Prey) |>
  mutate(count = 1) |>
  # complete with missing combinations for each FishID
  complete(nesting(Species, FishID), Prey, fill = list(count = 0)) |>
  summarize(percent = sum(count) / n(), .by = c(Species, Prey))
# A tibble: 4 × 3
  Species Prey        percent
  <chr>   <chr>         <dbl>
1 Bass    Amphipoden    1
2 Bass    Mollusca      0.333
3 Bass    Mysis         1
4 Bass    Polychaeten   0.667
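The distinct() step matters when the same prey is recorded more than once for one fish. The sketch below adds one hypothetical duplicate row to the sample data (df_dup, n_fish, by_rows and by_fish are illustrative names, not from the question) and compares counting rows with counting each fish once.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Species = "Bass",
  FishID  = c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3"),
  Prey    = c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis",
              "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
)

# Hypothetical extra record: Mysis noted a second time for fish a1
df_dup <- rbind(df, data.frame(Species = "Bass", FishID = "a1", Prey = "Mysis"))

n_fish <- n_distinct(df_dup$FishID)  # number of individual fish

# Counting rows: the duplicate inflates the Mysis count
by_rows <- df_dup |>
  summarise(percent = n() / n_fish * 100, .by = Prey)

# Counting each fish once per prey: the duplicate has no effect
by_fish <- df_dup |>
  distinct(Species, FishID, Prey) |>
  summarise(percent = n() / n_fish * 100, .by = Prey)

by_rows$percent[by_rows$Prey == "Mysis"]  # above 100
by_fish$percent[by_fish$Prey == "Mysis"]  # 100
```

If every prey is guaranteed to appear at most once per fish, both approaches give the same result; otherwise deduplicating first is the safer choice. Note that .by needs dplyr 1.1.0 or later.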