How can I add zeros or NAs to each species that does not have a corresponding date in R?

Question

I have a dataset that looks like this (although it has more species and dates):

Date Species Count
2022/06/01 AMAV 50
2022/06/05 AMAV 20
2022/06/07 SESA 10
2022/06/07 AMAV 8

I want each date to include every species, with a count of 0 if that species was not originally listed for that date. When a species is not listed, it just means that the species was not detected, so its count should be zero.

So for this example, I would want it to look like:

Date Species Count
2022/06/01 AMAV 50
2022/06/01 SESA 0
2022/06/05 AMAV 20
2022/06/05 SESA 0
2022/06/07 SESA 10
2022/06/07 AMAV 8

So I want all of my species listed for each date.

I'm wondering if anyone knows a way to accomplish this in R. I honestly have no idea how to tackle this problem, so I would appreciate any tips!

Answer 1

Score: 0

This is mostly a classic problem of filling in a series with missing dates. Build a table containing every combination of Date and Species, then left outer join the original data onto it.

    # All combinations of Date and Species
    full <- expand.grid(Date = unique(df$Date), Species = unique(df$Species))
    # Left outer join of the original counts onto the full grid
    filled <- merge(full, df, by = c('Date', 'Species'), all.x = TRUE)
    filled
    #         Date Species Count
    # 1 2022/06/01    AMAV    50
    # 2 2022/06/01    SESA    NA
    # 3 2022/06/05    AMAV    20
    # 4 2022/06/05    SESA    NA
    # 5 2022/06/07    AMAV     8
    # 6 2022/06/07    SESA    10

Afterwards, you can replace the NA values with zeros using the standard filled$Count[is.na(filled$Count)] <- 0, if that is preferred.
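
Putting the steps of this answer together, a minimal end-to-end sketch might look like the following. The data frame is rebuilt from the example in the question; the name `filled` for the joined result and the use of `stringsAsFactors = FALSE` are assumptions made here for illustration, not part of the original answer.

```r
# Example data from the question; the name `df` follows the answer's usage
df <- data.frame(
  Date    = c("2022/06/01", "2022/06/05", "2022/06/07", "2022/06/07"),
  Species = c("AMAV", "AMAV", "SESA", "AMAV"),
  Count   = c(50, 20, 10, 8),
  stringsAsFactors = FALSE
)

# Every Date x Species pair, kept as character rather than factor
full <- expand.grid(
  Date    = unique(df$Date),
  Species = unique(df$Species),
  stringsAsFactors = FALSE
)

# Left outer join: pairs that are absent from `df` come back with Count = NA
filled <- merge(full, df, by = c("Date", "Species"), all.x = TRUE)

# A species that was not detected on a date gets a count of zero
filled$Count[is.na(filled$Count)] <- 0

filled
```

Because the grid is built from `unique()` values, the same code should work unchanged when the data contain more species or dates.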

Answer 2

Score: 0

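A tidyverse solution:

Here we first create a helper column x that holds the other species for each row. Pivoting Species and x into a single Species column then yields one extra row per original row. Next we set Count to 0 for the rows that came from x, and finally arrange sorts the rows so that group_by and slice keep the two highest counts for each date: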

    library(dplyr)
    library(tidyr)

    df %>%
      mutate(x = ifelse(Species == "AMAV", "SESA", "AMAV")) %>%
      pivot_longer(c(Species, x),
                   names_to = "name",
                   values_to = "Species") %>%
      mutate(Count = ifelse(name == "x", 0, Count), .keep = "unused") %>%
      arrange(Date, -Count) %>%
      group_by(Date) %>%
      slice(1:2) %>%
      ungroup()

      Date       Count Species
      <chr>      <dbl> <chr>
    1 2022/06/01    50 AMAV
    2 2022/06/01     0 SESA
    3 2022/06/05    20 AMAV
    4 2022/06/05     0 SESA
    5 2022/06/07    10 SESA
    6 2022/06/07     8 AMAV
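
The pipeline above names the two species explicitly and keeps exactly two rows per date, so it is tied to the two-species example. Since the question mentions more species, one alternative is `complete()` from tidyr, which is a different technique from the one in this answer. The sketch below assumes the example data from the question; the name `filled` is chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- tibble(
  Date    = c("2022/06/01", "2022/06/05", "2022/06/07", "2022/06/07"),
  Species = c("AMAV", "AMAV", "SESA", "AMAV"),
  Count   = c(50, 20, 10, 8)
)

# complete() adds a row for every Date x Species pair missing from the data;
# `fill` sets Count to 0 on those added rows instead of leaving NA
filled <- df %>%
  complete(Date, Species, fill = list(Count = 0)) %>%
  arrange(Date, Species)

filled
```

Dropping the `fill` argument leaves the added rows as NA, which covers the "zeros or NAs" choice from the question title.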

huangapple
  • Posted on February 19, 2023, 04:24:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75496181.html