How can I add zeros or NAs to each species that does not have a corresponding date in R?

Question

I have a database that looks like this (although it has more species and dates):

Date Species Count
2022/06/01 AMAV 50
2022/06/05 AMAV 20
2022/06/07 SESA 10
2022/06/07 AMAV 8

I want each date to include every species, with a count of 0 for any species that was not originally listed for that date. When a species is not listed, it just means that the species was not detected, so its count should be zero.

So for this example, I would want it to look like:

Date Species Count
2022/06/01 AMAV 50
2022/06/01 SESA 0
2022/06/05 AMAV 20
2022/06/05 SESA 0
2022/06/07 SESA 10
2022/06/07 AMAV 8

So I want all of my species listed for each date.

I'm wondering if anyone knows a way to accomplish this in R. I honestly have no idea how to tackle this problem, so I would appreciate any tips!

Answer 1

Score: 0

This is mostly a classic problem of filling in a series with missing dates. Build a table that expands every species across every date, then left outer join the original data onto it.

# All Date x Species combinations
full <- expand.grid(Date = unique(df$Date), Species = unique(df$Species))
# Left join the original counts; combinations missing from df get NA
filled <- merge(full, df, by = c('Date', 'Species'), all.x = TRUE)
filled
#        Date Species Count
#1 2022/06/01    AMAV    50
#2 2022/06/01    SESA    NA
#3 2022/06/05    AMAV    20
#4 2022/06/05    SESA    NA
#5 2022/06/07    AMAV     8
#6 2022/06/07    SESA    10

Afterwards, if you prefer zeros, you can replace the NA values in the merged table with the standard filled$Count[is.na(filled$Count)] <- 0.
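
For anyone who wants to run this end to end, here is a minimal base R sketch of the same expand-and-join idea. The sample data is copied from the question; the helper name fill_missing_counts is hypothetical (it is not from the answer above), and the column names Date, Species and Count are assumed to match your data.

```r
# Sample data copied from the question
df <- data.frame(
  Date    = c("2022/06/01", "2022/06/05", "2022/06/07", "2022/06/07"),
  Species = c("AMAV", "AMAV", "SESA", "AMAV"),
  Count   = c(50, 20, 10, 8)
)

# Hypothetical helper: one row per Date x Species pair, with a
# Count of 0 for every pair that is absent from the input
fill_missing_counts <- function(data) {
  # Every combination of observed dates and observed species
  full <- expand.grid(
    Date             = unique(data$Date),
    Species          = unique(data$Species),
    stringsAsFactors = FALSE
  )
  # Left join: keep all combinations, NA where there was no record
  out <- merge(full, data, by = c("Date", "Species"), all.x = TRUE)
  # A missing record means "not detected", so treat it as zero
  out$Count[is.na(out$Count)] <- 0
  out
}

filled <- fill_missing_counts(df)
print(filled)
```

Setting stringsAsFactors = FALSE keeps Date and Species as character columns, which avoids surprises if you later convert Date with as.Date().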

Answer 2

Score: 0

A tidyverse solution:

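Here we first create a helper column x that holds the other species.
Pivoting longer then stacks Species and x into a single Species column,
and the rows that came from x get a Count of 0.
Finally, arrange puts the real counts first within each date, so that group_by and slice keep the two rows we want: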

library(dplyr)
library(tidyr)

df %>%
  mutate(x = ifelse(Species == "AMAV", "SESA", "AMAV")) %>%
  pivot_longer(c(Species, x),
               names_to = "name",
               values_to = "Species") %>%
  mutate(Count = ifelse(name == "x", 0, Count), .keep = "unused") %>%
  arrange(Date, -Count) %>%
  group_by(Date) %>%
  slice(1:2) %>%
  ungroup()
  Date       Count Species
  <chr>      <dbl> <chr>
1 2022/06/01    50 AMAV   
2 2022/06/01     0 SESA   
3 2022/06/05    20 AMAV   
4 2022/06/05     0 SESA   
5 2022/06/07    10 SESA   
6 2022/06/07     8 AMAV 
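
The pipeline above names the two example species explicitly, so it would need changes for data with more species. As a possible alternative (not part of the answer above), tidyr::complete() can generate every Date and Species combination and fill the gaps in one step. This is a minimal sketch, assuming the columns are named Date, Species and Count as in the question:

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
df <- data.frame(
  Date    = c("2022/06/01", "2022/06/05", "2022/06/07", "2022/06/07"),
  Species = c("AMAV", "AMAV", "SESA", "AMAV"),
  Count   = c(50, 20, 10, 8)
)

# complete() adds a row for every missing Date x Species pair;
# fill = list(Count = 0) sets Count to 0 in those new rows
filled <- df %>%
  complete(Date, Species, fill = list(Count = 0))

print(filled)
```

Because complete() works from the values present in the data, it should handle any number of species and dates without listing them by name.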

huangapple
  • Posted on February 19, 2023, 04:24:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75496181.html