Keeping one weight measurement for participants with multiple measurements in R

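Question

I am trying to find the best way to handle the following problem. I am working with a longitudinal dataset in which each participant has multiple weight measurements on the same day. I want to keep only the first observation (measurement) for each participant on each day. I am using R.
Here is a sample of the data.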

ID     Date    Weight
1     2/1     160
1     2/1     159
1     2/1     160.5
2     2/1     200
2     2/1     198
2     2/1     201

I am not sure how to approach this yet. I would like the dataset to look like this (keeping only the first observation):

ID     Date    Weight
1     2/1     160
2     2/1     200

Answer 1

Score: 1

We can use `slice_head` after grouping by 'ID' and 'Date':

library(dplyr)
df1 %>%
   group_by(ID, Date) %>%
   slice_head(n = 1) %>%
   ungroup()

Output:

# A tibble: 2 × 3
     ID Date  Weight
  <int> <chr>  <dbl>
1     1 2/1      160
2     2 2/1      200

Or with duplicated in base R:

df1[!duplicated(df1[c("ID", "Date")]),]

Data:

df1 <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L), Date = c("2/1", 
"2/1", "2/1", "2/1", "2/1", "2/1"), Weight = c(160, 159, 160.5, 
200, 198, 201)), class = "data.frame", row.names = c(NA, -6L))
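
A further dplyr option, sketched here as an alternative rather than as part of the original answer, is `distinct()` with `.keep_all = TRUE`, which keeps the first row for each ID/Date combination. The sketch rebuilds the question's sample data so that it runs on its own; the name `first_obs` is chosen here for illustration.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L),
  Date = rep("2/1", 6),
  Weight = c(160, 159, 160.5, 200, 198, 201)
)

# Keep the first row for each ID/Date pair;
# .keep_all = TRUE retains the remaining columns (here, Weight)
first_obs <- distinct(df1, ID, Date, .keep_all = TRUE)
first_obs
```

Like `slice_head()`, this relies on the existing row order, so the rows should already be in measurement order.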

Answer 2

Score: 1

If you prefer data.table (which is especially fast on large datasets), you could use:

library(data.table)
df1 <- as.data.table(df1)
# Number the rows within each ID/Date group
df1[ , rowNum := seq_len(.N),  by = .(ID, Date)]
# Keep only the first row of each group
df1 <- df1[rowNum == 1]
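
The helper column can probably be avoided. As a sketch (not the answerer's code), `unique()` on a data.table accepts a `by` argument and keeps the first row for each combination of the listed columns; taking `.SD[1L]` per group is an equivalent grouped form. The names `dt`, `first_obs`, and `first_obs_sd` are chosen here for illustration.

```r
library(data.table)

# Sample data from the question, built directly as a data.table
dt <- data.table(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L),
  Date = rep("2/1", 6),
  Weight = c(160, 159, 160.5, 200, 198, 201)
)

# First row per ID/Date pair, without creating a rowNum column
first_obs <- unique(dt, by = c("ID", "Date"))

# Equivalent grouped form: take the first row of each group's subset
first_obs_sd <- dt[, .SD[1L], by = .(ID, Date)]

first_obs
```

Both forms leave `dt` itself unchanged, whereas the `:=` approach adds `rowNum` to the table by reference.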

Answer 3

Score: 1

Another way, using filter combined with row_number():

library(dplyr)

df1 %>%
  group_by(ID, Date) %>%  # include Date so each day keeps its own first row
  filter(row_number() == 1) %>%
  ungroup()

     ID Date  Weight
  <int> <chr>  <dbl>
1     1 2/1      160
2     2 2/1      200
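
The grouping columns determine what "first" means. In the sketch below, the second day of measurements is made up for illustration and is not part of the question's data: grouping by ID alone keeps one row per participant overall, while grouping by ID and Date keeps one row per participant per day.

```r
library(dplyr)

# Hypothetical two-day data: the 2/2 rows are invented for illustration
df2 <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  Date = c("2/1", "2/1", "2/2", "2/2", "2/1", "2/1", "2/2", "2/2"),
  Weight = c(160, 159, 161, 160.5, 200, 198, 199, 201)
)

# One row per participant across all days
by_id <- df2 %>%
  group_by(ID) %>%
  filter(row_number() == 1) %>%
  ungroup()

# One row per participant per day
by_id_date <- df2 %>%
  group_by(ID, Date) %>%
  filter(row_number() == 1) %>%
  ungroup()

nrow(by_id)       # 2
nrow(by_id_date)  # 4
```

On the question's single-day sample the two groupings give the same result, which is why the difference is easy to miss.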

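Answer 4

Score: 0

If you simply want to keep only the first measurement, use negated duplicated. This checks ID alone, so it keeps one row per participant across all dates; to keep the first row per participant per day, check both columns with dat[c("ID", "Date")] instead of dat$ID.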

dat[!duplicated(dat$ID), ]
#   ID Date Weight
# 1  1  2/1    160
# 4  2  2/1    200

Data:

dat <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L), Date = c("2/1", 
"2/1", "2/1", "2/1", "2/1", "2/1"), Weight = c(160, 159, 160.5, 
200, 198, 201)), class = "data.frame", row.names = c(NA, -6L))
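
All of these approaches take "first" to mean the first row in the current row order. If the rows are not already in measurement order, sort them before deduplicating. A minimal base R sketch, assuming a `Time` column that does not exist in the question's data (the column, the shuffled rows, and the name `dat2` are all hypothetical):

```r
# Hypothetical data with a Time column (not in the question), rows shuffled
dat2 <- data.frame(
  ID = c(1L, 2L, 1L, 2L, 1L, 2L),
  Date = rep("2/1", 6),
  Time = c("09:30", "08:15", "08:00", "10:00", "12:45", "09:00"),
  Weight = c(159, 200, 160, 201, 160.5, 198)
)

# Sort so the earliest measurement comes first within each ID/Date pair;
# zero-padded "HH:MM" strings sort correctly as text
dat2 <- dat2[order(dat2$ID, dat2$Date, dat2$Time), ]

# Keep the first row of each ID/Date pair
first_obs <- dat2[!duplicated(dat2[c("ID", "Date")]), ]
first_obs
```

Without a timestamp or a sequence column there is no way to recover measurement order after the fact, so the original file order has to be trusted.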

huangapple
  • Posted on 2023-02-09 02:15:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75390091.html