How to add a column with altered ID names for duplicates within the same day

Question

I have a data frame with Days, ID, the date collected, and a count value (number of hatched eggs) for several samples each day. The ID comes from the replicate (mother) from which the sample (a number of eggs) was taken, so the "Date collected" column is needed to distinguish separate samples, for instance in a plot.

I want to add a new column called SampleID in which each unique sample gets its own ID.

Example data:

d1 <- as.Date('2021-06-07')
d2 <- as.Date('2021-06-08')
d3 <- as.Date('2021-06-09')
df <- data.frame(Days = c(1,1,2,2,2,2,3,3,3,3,3),
                 ID = c(2,5,2,2,5,9,2,2,5,5,9),
                 Collected = c(d1,d1,d2,d1,d1,d2,d1,d2,d1,d3,d2))

I would like an output to look like:

Days  ID  Collected   SampleID  Count
1     2   2021-06-07  2-1       3
1     5   2021-06-07  5-1       5
2     2   2021-06-08  2-1       4
2     2   2021-06-07  2-2       1
2     5   2021-06-07  5-1       7
2     9   2021-06-08  9-1       2
3     2   2021-06-07  2-1       8
3     2   2021-06-08  2-2       5
3     5   2021-06-07  5-1       7
3     5   2021-06-09  5-2       2
3     9   2021-06-08  9-1       2

and I have been trying something like:

df <- df %>%
  group_by(Days) %>%
  mutate(ReplicateID = case_when(ID == ID & Collected != Collected ~ paste(as.character(ID)+"-1")))

This doesn't work, and even if it did, it could not add -2 or -3 to IDs repeated more than once within the same day. So I am kind of lost and would appreciate some help!

Answer 1

Score: 1

Maybe something like this?

library(dplyr)
d1 <- as.Date('2021-06-07')
d2 <- as.Date('2021-06-08')
d3 <- as.Date('2021-06-09')
df <- data.frame(Days = c(1,1,2,2,2,2,3,3,3,3,3),
                 ID = c(2,5,2,2,5,9,2,2,5,5,9),
                 Collected = c(d1,d1,d2,d1,d1,d2,d1,d2,d1,d3,d2))

# Sort so that, within each day and ID, earlier collection dates come first
df |>
  arrange(Days, ID, Collected) |>
  # Number the rows within each Days/ID group and append that number to the ID
  group_by(Days, ID) |>
  mutate(SampleID = paste(ID, row_number(), sep = '-'))
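
One thing to watch with the pipeline above: arrange() reorders the rows, so on Day 2 the sample collected on 2021-06-07 is numbered before the one collected on 2021-06-08, whereas the desired output in the question numbers them in their original row order. Below is a minimal sketch of a variant, assuming the original row order should be kept (that is my reading of the desired output, not something the answer states). It drops arrange() and adds ungroup() so that later steps do not run per group by accident. The name `res` is only illustrative.

```r
library(dplyr)

d1 <- as.Date("2021-06-07")
d2 <- as.Date("2021-06-08")
d3 <- as.Date("2021-06-09")
df <- data.frame(
  Days = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  ID = c(2, 5, 2, 2, 5, 9, 2, 2, 5, 5, 9),
  Collected = c(d1, d1, d2, d1, d1, d2, d1, d2, d1, d3, d2)
)

# Assumption: rows keep their original order, so arrange() is omitted.
res <- df |>
  group_by(Days, ID) |>                                    # one group per day and replicate
  mutate(SampleID = paste(ID, row_number(), sep = "-")) |> # number rows within each group
  ungroup()                                                # drop the grouping afterwards

res$SampleID
```

With the example data, this should reproduce the SampleID column of the desired output row for row. The native pipe `|>` needs R 4.1 or later; `%>%` works the same way here.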

Answer 2

Score: 1

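A base R way using ave with paste and seq_along: within each combination of ID and Days, the rows are numbered and that number is appended to the ID, separated by an underscore.

df$SampleID <- ave(df$ID, df$ID, df$Days, FUN=\(x) paste(x, seq_along(x), sep="_"))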

df
#   Days ID  Collected SampleID
#1     1  2 1970-01-01      2_1
#2     1  5 1970-01-01      5_1
#3     2  2 1970-01-01      2_1
#4     2  2 1970-01-01      2_2
#5     2  5 1970-01-01      5_1
#6     2  9 1970-01-01      9_1
#7     3  2 1970-01-01      2_1
#8     3  2 1970-01-01      2_2
#9     3  5 1970-01-01      5_1
#10    3  5 1970-01-01      5_2
#11    3  9 1970-01-01      9_1
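
Both answers number the samples within each day, so the same physical sample (same ID and collection date) can get different labels on different days. If the label should instead stay fixed for a given sample across days, one possible variant is to rank the collection dates within each ID. This is only a sketch under that assumption; the desired output in the question numbers within each day, so it may not be what was wanted. The helper name `sample_no` is illustrative, and the separator is "-" as in the question.

```r
d1 <- as.Date("2021-06-07")
d2 <- as.Date("2021-06-08")
d3 <- as.Date("2021-06-09")
df <- data.frame(
  Days = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  ID = c(2, 5, 2, 2, 5, 9, 2, 2, 5, 5, 9),
  Collected = c(d1, d1, d2, d1, d1, d2, d1, d2, d1, d3, d2)
)

# For each ID, map every collection date to its position among that ID's
# distinct dates (earliest date = 1, next date = 2, ...).
sample_no <- ave(
  as.numeric(df$Collected), df$ID,
  FUN = function(x) match(x, sort(unique(x)))
)

# Assumption: one label per (ID, Collected) pair, identical on every day.
df$SampleID <- paste(df$ID, sample_no, sep = "-")

df
```

Here the sample from ID 2 collected on 2021-06-07 is labelled 2-1 on every day it appears, which may be easier to follow in a plot that tracks samples over time.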

Posted by huangapple on 2023-03-20 22:42:30
Link to this post: https://go.coder-hub.com/75791716.html