Question

I have what I think should be a relatively simple problem. I have a large data set of thousands of observations taken from different areas within distinct sites. The general structure is something like:
df <- data.frame(Site = as.factor(rep(c("Site.A","Site.B","Site.C"), 5)),
                   Response = as.numeric(runif(15, 0, 10)),
                   Habitat = as.factor(c("G","G","F","G","F",
                                         "F","F","F","G","S",
                                         "S", "S", "S","S","S")))
I would like to add a column identifying the dominant habitat in each site based on the counts of the habitats in each site (i.e. whichever habitat makes up the majority of observations within each site).
This should be relatively easy with something along the lines of:
dat %>%
  group_by(Site) %>%
  mutate(Dominant_Habitat = if_else(
    (count(Habitat == "G") >= 3, "G",
    (count(Habitat == "F") >= 3, "F", "S"))

But for the life of me I can't find a way to make it work. Thanks.


# Answer 1
**Score**: 1
You can count and slice the most frequent `Habitat` (possibly with ties) in each `Site`, and then join the result back to the initial dataset.
```r
library(dplyr)

df %>%
  count(Site, Habitat) %>%        # number of rows per Site/Habitat pair
  group_by(Site) %>%
  slice_max(n) %>%                # keep the most frequent habitat(s); ties are kept
  summarise(Dominant_Habitat = paste(Habitat, collapse = '/')) %>%  # join ties with "/"
  left_join(df, ., by = "Site")   # attach the result to the original rows
```

Result (the `Response` values will differ between runs because `runif()` is not seeded):

```
     Site  Response Habitat Dominant_Habitat
1  Site.A 2.6751221       G              G/S
2  Site.B 7.0941244       G              F/S
3  Site.C 3.3727804       F              F/S
4  Site.A 2.4453809       G              G/S
5  Site.B 2.0155192       F              F/S
6  Site.C 6.8103549       F              F/S
7  Site.A 9.5722247       F              G/S
8  Site.B 8.7405261       F              F/S
9  Site.C 1.0035530       G              F/S
10 Site.A 4.5928348       S              G/S
11 Site.B 5.6210020       S              F/S
12 Site.C 8.2221709       S              F/S
13 Site.A 0.3368293       S              G/S
14 Site.B 0.4153831       S              F/S
15 Site.C 6.0440495       S              F/S
```
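
For comparison, the same column could probably also be computed inside a grouped `mutate()`, which is closer in shape to the attempt in the question and avoids the join. This is only a minimal sketch, not the answerer's method: `dominant()` is a hypothetical helper introduced here for illustration, and the example data leaves out `Response` on the assumption that it plays no role in picking the dominant habitat.

```r
library(dplyr)

# Same Site/Habitat layout as the question's example data
# (Response is omitted here; it does not affect the counts).
df <- data.frame(
  Site    = factor(rep(c("Site.A", "Site.B", "Site.C"), 5)),
  Habitat = factor(c("G", "G", "F", "G", "F",
                     "F", "F", "F", "G", "S",
                     "S", "S", "S", "S", "S"))
)

# Hypothetical helper: returns every value tied for the highest count,
# joined with "/" (names come out in alphabetical order).
dominant <- function(x) {
  counts <- table(as.character(x))
  top <- names(counts)[as.vector(counts) == max(counts)]
  paste(top, collapse = "/")
}

result <- df %>%
  group_by(Site) %>%
  mutate(Dominant_Habitat = dominant(Habitat)) %>%  # one value, recycled per group
  ungroup()
```

Because `dominant()` returns a single string per group, `mutate()` recycles it across all rows of that site. If ties should resolve to one habitat instead, `names(which.max(counts))` would pick the first of the tied names.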

