2023-04-13 19:32:02

How to create a new column that identifies the last and second last row in longitudinal data using dplyr

# Question


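I have data in long format with several observations ("Visit") per individual (identified by "ID"). The number of observations varies between individuals. I would like to create a new column containing the last visit, which I have already done, and another column containing the second-last visit.

My data looks like this:

```R
ID <- c(1000,1000,1000,1001,1001,1001,1001,1002,1002,1002,1002,1002)
Visit <- c("BL","V02","V03","BL","V02","V03","V04","BL","V02","V03","V04","V05")
df <- data.frame(ID,Visit)
```

The `lastVisit` column is created by the following code:

```R
library(dplyr)

df <- df %>%
  group_by(ID) %>%
  mutate(lastVisit = last(Visit))
```

The desired output looks like this:

```
ID   Visit lastVisit secondlastVisit
1000 BL    V03       V02
1000 V02   V03       V02
1000 V03   V03       V02
1001 BL    V04       V03
1001 V02   V04       V03
1001 V03   V04       V03
1001 V04   V04       V03
1002 BL    V05       V04
1002 V02   V05       V04
1002 V03   V05       V04
1002 V04   V05       V04
1002 V05   V05       V04
```

I have tried `secondlastVisit = lag(Visit)`, but this does not give the desired output. A method using `dplyr::mutate` is preferred.

Thanks!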


# Answer 1
**Score**: 2
The function `dplyr::nth()` does what you want; a negative index tells it to count from the end.

``` r
library("magrittr")
library("dplyr")

ID <- c(1000, 1000, 1000, 1001, 1001, 1001, 1001, 1002, 1002, 1002, 1002, 1002)
Visit <- c("BL", "V02", "V03", "BL", "V02", "V03", "V04", "BL", "V02", "V03", "V04", "V05")
df <- data.frame(ID, Visit)

df <- df %>%
  group_by(ID) %>%
  # last() returns the final Visit within each ID group
  mutate(lastVisit = last(Visit)) %>%
  # nth(x, -2L) returns the second-to-last Visit within each ID group
  mutate(secondlastVisit = nth(Visit, -2L))
df
#> # A tibble: 12 x 4
#> # Groups:   ID [3]
#>       ID Visit lastVisit secondlastVisit
#>    <dbl> <chr> <chr>     <chr>          
#>  1  1000 BL    V03       V02            
#>  2  1000 V02   V03       V02            
#>  3  1000 V03   V03       V02            
#>  4  1001 BL    V04       V03            
#>  5  1001 V02   V04       V03            
#>  6  1001 V03   V04       V03            
#>  7  1001 V04   V04       V03            
#>  8  1002 BL    V05       V04            
#>  9  1002 V02   V05       V04            
#> 10  1002 V03   V05       V04            
#> 11  1002 V04   V05       V04            
#> 12  1002 V05   V05       V04
```


<sup>Created on 2023-04-13 with reprex v2.0.2</sup>
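
To illustrate why the `lag(Visit)` attempt from the question differs from `nth(Visit, -2L)`, here is a minimal sketch on a smaller, made-up data frame (the column name `prevVisit` and the single-visit ID 1001 are assumptions added for illustration). `lag()` shifts the column by one row, so its result changes from row to row, whereas `nth()` returns a single value per group that `mutate()` recycles to every row. The sketch also shows what happens when an individual has only one visit.

``` r
library(dplyr)

# Hypothetical toy data: ID 1001 has only a single visit
df <- data.frame(
  ID    = c(1000, 1000, 1000, 1001),
  Visit = c("BL", "V02", "V03", "BL")
)

res <- df %>%
  group_by(ID) %>%
  mutate(
    # lag() is row-wise: each row gets the previous row's Visit (NA on the first row)
    prevVisit       = lag(Visit),
    # nth() yields one value per group; it is NA when the group has fewer than two rows
    secondlastVisit = nth(Visit, -2L)
  ) %>%
  ungroup()

res
```

In this sketch `prevVisit` comes out as `NA, "BL", "V02", NA`, while `secondlastVisit` is `"V02"` for every row of ID 1000 and `NA` for ID 1001.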

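# Answer 2

**Score**: 1

``` r
# Assumes lastVisit already exists (created as in the question)
# and that Visit labels are unique within each ID.
df <- df %>%
  group_by(ID) %>%
  # Find the row holding the last visit, then take the Visit one position before it
  mutate(secondLastVisit = Visit[which(lastVisit == Visit) - 1])
```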
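
The indexing approach above depends on an existing `lastVisit` column and on visit labels not repeating within an individual. A possible variant, sketched here on made-up data, indexes by position with `n()` instead, and guards against individuals with a single visit. The guard and the toy IDs are assumptions added for illustration, not part of the original answer.

``` r
library(dplyr)

# Hypothetical toy data: ID 1002 has only a single visit
df <- data.frame(
  ID    = c(1000, 1000, 1000, 1001, 1001, 1002),
  Visit = c("BL", "V02", "V03", "BL", "V02", "BL")
)

res <- df %>%
  group_by(ID) %>%
  mutate(
    lastVisit = last(Visit),
    # n() is the group size; fall back to NA when there is no second-last row
    secondLastVisit = if (n() >= 2) Visit[n() - 1] else NA_character_
  ) %>%
  ungroup()

res
```

Like `nth(Visit, -2L)`, this relies on the rows already being in visit order within each ID.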

