Calculate the percentage of occurrence of a factor level relative to the total number of occurrences of a grouping factor
Question
I have a sample dataset:
Species <- c("Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass")
FishID <- c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis", "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
df <- data.frame(Species, FishID, Prey)
With Bass as the predator, there are three individual fish, identified by FishID: a1, a2 and a3. For each prey species, I would like to calculate the percentage of individual Bass (FishID) in whose stomach it occurs.
So in this case:
Amphipoden occurs 3 times, i.e. in the stomachs of all three individuals, so 100%; the same goes for Mysis. Polychaeten, however, is found in only two of the three Bass stomachs, so that would be 66.6%. Mollusca is found only once, so 33.3%.
As an end result, I am looking for something like this:
Species <- c("Bass", "Bass", "Bass", "Bass")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Mollusca")
Percentage <- c(100, 100, 66.6, 33.3)
df2 <- data.frame(Species, Prey, Percentage)
I tried this:
library(dplyr)
df %>%
  group_by(Species, Prey) %>%
  summarise(n = n()) %>%
  mutate(percent = n / sum(n) * 100)
But it isn't giving me what I want.
Any help is welcome.
Thank you in advance!
Answer 1
Score: 1
You only have to change one small thing: instead of dividing by sum(n), divide by length(unique(FishID)) to get the number of individual fish. Also note that the last element of FishID has to be a3, not A3.
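library(dplyr)

FishID <- c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3")

df %>%
  # .by requires dplyr >= 1.1.0
  summarise(n = n(), .by = Prey) %>%
  # FishID is no longer a column after summarise(), so this uses the FishID vector defined above
  mutate(percent = n / length(unique(FishID)) * 100)

#>          Prey n   percent
#> 1  Amphipoden 3 100.00000
#> 2       Mysis 3 100.00000
#> 3 Polychaeten 2  66.66667
#> 4    Mollusca 1  33.33333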
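
The approach above counts rows per prey, so it relies on each prey being recorded at most once per fish and on FishID being available as a standalone vector. A minimal sketch of a variant that avoids both assumptions is shown here: it counts distinct fish per prey within each predator species. The column names n_fish and n_total are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Species = rep("Bass", 9),
  FishID  = c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3"),
  Prey    = c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis",
              "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
)

result <- df %>%
  group_by(Species) %>%
  # number of individual fish examined per predator species
  mutate(n_total = n_distinct(FishID)) %>%
  group_by(Species, Prey) %>%
  # number of distinct fish in which each prey was found
  summarise(n_fish = n_distinct(FishID),
            n_total = first(n_total),
            .groups = "drop") %>%
  mutate(percent = n_fish / n_total * 100)

result
```

Because the denominator is computed per Species, this should also extend to data with several predator species, assuming FishID values identify individuals within a species.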
# Answer 2
**Score**: 1
```R
library(tidyverse)

df |>
  # only count Prey once per Species/FishID (which I presume always go together)
  distinct(Species, FishID, Prey) |>
  mutate(count = 1) |>
  # complete with missing combinations for each FishID
  complete(nesting(Species, FishID), Prey, fill = list(count = 0)) |>
  summarize(percent = sum(count) / n(), .by = c(Species, Prey))

#> # A tibble: 4 × 3
#>   Species Prey        percent
#>   <chr>   <chr>         <dbl>
#> 1 Bass    Amphipoden    1
#> 2 Bass    Mollusca      0.333
#> 3 Bass    Mysis         1
#> 4 Bass    Polychaeten   0.667
```