2023-07-14 02:35:32

Copy values to rows based on conditions

Question

I have a dataset in which I am trying to copy each case's index date to its matched controls. In this data, case = 1 and control = 0. Each matched pair has a unique ID in the "matchid" column, and "time" is the timepoint. I have the sample dataset below:

  Study_ID  time index_date  case matchid
   <chr>    <dbl>      <dbl> <dbl> <dbl>
 1 101        0          2     1     1
 2 101        1          2     1     1
 3 101        2          2     1     1
 4 101        3          2     1     1
 5 340        0          NA    0     1
 6 340        1          NA    0     1
 7 340        2          NA    0     1
 8 340        3          NA    0     1

I need the index_date column for rows 5-8 to be 2, because those rows have the same "matchid" as the case, so that the result looks like this:

  Study_ID  time index_date  case matchid
   <chr>    <dbl>      <dbl> <dbl> <dbl>
 1 101        0          2     1     1
 2 101        1          2     1     1
 3 101        2          2     1     1
 4 101        3          2     1     1
 5 340        0          2     0     1
 6 340        1          2     0     1
 7 340        2          2     0     1
 8 340        3          2     0     1

Any help would be greatly appreciated as the solution for a similar question did not resolve my issue.

I have tried the below Stack Overflow solutions but I am getting errors.

https://stackoverflow.com/questions/67399813/copy-values-from-one-row-to-another-based-on-condition?newreg=c75a97bbb15f47fb87f7df5a19348948

https://stackoverflow.com/questions/33998856/r-copy-value-based-on-match-in-another-column

Answer 1

Score: 1

fill() should do it:

library(tidyverse)

# Sample data; read.table() treats the first line as a header here because
# it has one fewer field than the data rows (the leading number becomes the row name)
s = 'Study_ID  time index_date  case matchid
 1 101        0          2     1     1
 2 101        1          2     1     1
 3 101        2          2     1     1
 4 101        3          2     1     1
 5 340        0          NA    0     1
 6 340        1          NA    0     1
 7 340        2          NA    0     1
 8 340        3          NA    0     1'
t = read.table(text = s)

# Within each matched set, carry the case's index_date down into the control rows
t %>%
    group_by(matchid) %>%
    fill(index_date, .direction = 'down')
# Study_ID  time index_date  case matchid
# <int> <int>      <int> <int>   <int>
# 1      101     0          2     1       1
# 2      101     1          2     1       1
# 3      101     2          2     1       1
# 4      101     3          2     1       1
# 5      340     0          2     0       1
# 6      340     1          2     0       1
# 7      340     2          2     0       1
# 8      340     3          2     0       1
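
Note that .direction = 'down' only helps when the case rows come before the control rows within each matchid, as they do in the sample. A minimal sketch for the opposite ordering, assuming a made-up data frame `dat` (not from the question) in which the control is listed first, could use 'downup' instead:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: control 340 is listed BEFORE its matched case 101
dat <- data.frame(
  Study_ID   = c(340, 340, 101, 101),
  time       = c(0, 1, 0, 1),
  index_date = c(NA, NA, 2, 2),
  case       = c(0, 0, 1, 1),
  matchid    = c(1, 1, 1, 1)
)

filled <- dat %>%
  group_by(matchid) %>%
  # "downup" fills downwards first, then upwards, so row order no longer matters
  fill(index_date, .direction = "downup") %>%
  ungroup()

print(filled)
```

This still assumes that every matchid carries exactly one distinct index date; if a group held several, fill() would copy whichever non-missing value is nearest.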

Answer 2

Score: 0

Perhaps this?

library(dplyr)
quux %>%
  mutate(
    index_date = if_else(is.na(index_date), na.omit(index_date)[1], index_date),
    .by = c(matchid, time)
  )
#   Study_ID time index_date case matchid
# 1      101    0          2    1       1
# 2      101    1          2    1       1
# 3      101    2          2    1       1
# 4      101    3          2    1       1
# 5      340    0          2    0       1
# 6      340    1          2    0       1
# 7      340    2          2    0       1
# 8      340    3          2    0       1

(Note: .by= needs dplyr 1.1.0 or newer; if you have an older version, use group_by(matchid, time) before the mutate.)

I'm inferring that what we need to do is replace all NA values with the first non-NA found in index_date within each group defined by matchid and time.
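
Grouping by both matchid and time relies on the case having a row at every timepoint the control has. If that does not hold, a control row at an unmatched timepoint would presumably stay NA. A minimal sketch that groups by matchid alone, assuming the index date is constant within a matched set (the data frame `quux2` below is made up for illustration and is not from the question):

```r
library(dplyr)

# Hypothetical data: the control has a timepoint (time = 2) that the case lacks
quux2 <- data.frame(
  Study_ID   = c(101L, 101L, 340L, 340L, 340L),
  time       = c(0L, 1L, 0L, 1L, 2L),
  index_date = c(2L, 2L, NA, NA, NA),
  case       = c(1L, 1L, 0L, 0L, 0L),
  matchid    = c(1L, 1L, 1L, 1L, 1L)
)

res <- quux2 %>%
  group_by(matchid) %>%
  # Replace each NA with the first non-NA index_date found in the matched set
  mutate(index_date = coalesce(index_date, index_date[!is.na(index_date)][1])) %>%
  ungroup()

print(res)
```

group_by() and ungroup() are used here rather than .by= so that the sketch also runs on dplyr versions older than 1.1.0.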

Data

quux <- structure(list(
  Study_ID = c(101L, 101L, 101L, 101L, 340L, 340L, 340L, 340L),
  time = c(0L, 1L, 2L, 3L, 0L, 1L, 2L, 3L),
  index_date = c(2L, 2L, 2L, 2L, NA, NA, NA, NA),
  case = c(1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L),
  matchid = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)
), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8"))
