July 17, 2023, 23:48:48

Calculate percentage of occurrence of a factor of the total amount of occurrences of a group factor

Question


I have a sample dataset:

Species <- c("Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass")
FishID <- c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis", "Amphipoden", "Mysis", "Polychaeten", "Mollusca")

df <- data.frame(Species, FishID, Prey)

Bass is the predator, and there are 3 individual Bass, identified by FishID: a1, a2 and a3. For each prey species, I would like to calculate the percentage of individual Bass (FishID) whose stomach contained that prey.

So in this case:
Amphipods occur 3 times and were found in all three individuals, so in 100% of the Bass stomachs; the same goes for Mysis. Polychaetes, however, were found in only two of the three stomachs, so that would be 66.6%. And molluscs were found in only one, so 33.3%.

As an end result, I am looking for something like this:

Species <- c("Bass", "Bass", "Bass", "Bass")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Mollusca")
Percentage <- c(100, 100, 66.6, 33.3)
df2 <- data.frame(Species, Prey, Percentage)

I tried this:

library(dplyr)

df %>%
  group_by(Species, Prey) %>%
  summarise(n = n()) %>%
  mutate(percent = n / sum(n) * 100)

But it isn't giving me what I want.

Any help is welcome.

Thank you in advance!

Answer 1

Score: 1

You only need to change one small thing: instead of dividing by sum(n), divide by length(unique(FishID)), which gives the number of individual fish. Also note that the last element of FishID has to be a3, not A3.

library(dplyr)

# FishID must also exist as a standalone vector: summarise() drops the FishID column
FishID <- c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3")

df %>%
    # count the rows for each prey (.by requires dplyr >= 1.1.0)
    summarise(n = n(), .by = Prey) %>%
    # divide by the number of individual fish
    mutate(percent = n / length(unique(FishID)) * 100)

         Prey n   percent
1  Amphipoden 3 100.00000
2       Mysis 3 100.00000
3 Polychaeten 2  66.66667
4    Mollusca 1  33.33333
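
The approach above counts rows per prey, so it relies on each prey appearing at most once per fish and on there being a single predator species. A minimal sketch of a variant that counts distinct fish instead, assuming the same `df` as in the question (the column names `n_fish` and `n_with_prey` are illustrative and not from the original posts):

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Species = rep("Bass", 9),
  FishID  = c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3"),
  Prey    = c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis",
              "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
)

result <- df %>%
  # total number of individual fish per predator species
  group_by(Species) %>%
  mutate(n_fish = n_distinct(FishID)) %>%
  # number of distinct fish in which each prey was found
  group_by(Species, Prey) %>%
  summarise(
    n_with_prey = n_distinct(FishID),
    n_fish      = first(n_fish),
    .groups     = "drop"
  ) %>%
  # share of fish containing the prey, as a percentage
  mutate(percent = n_with_prey / n_fish * 100)

result
```

Because both counts come from `n_distinct(FishID)`, a prey recorded twice for the same fish is still counted once, and the denominator is computed separately for each predator species rather than taken from a global vector.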


# Answer 2
**Score**: 1

```R
library(tidyverse)
df |>
  # only count Prey once per Species/FishID (which I presume always go together)
  distinct(Species, FishID, Prey) |>
  mutate(count = 1) |>
  # complete with missing combinations for each FishID
  complete(nesting(Species, FishID), Prey, fill = list(count = 0)) |>
  summarize(percent = sum(count) / n(), .by = c(Species, Prey))

#> # A tibble: 4 × 3
#>   Species Prey        percent
#>   <chr>   <chr>         <dbl>
#> 1 Bass    Amphipoden    1
#> 2 Bass    Mollusca      0.333
#> 3 Bass    Mysis         1
#> 4 Bass    Polychaeten   0.667
```
