2023年2月16日 06:09:10go评论67阅读模式

英文:

How to extract the overlapping hospitalization periods in each hospital?

问题

df中有五个变量：studyid、hospitalname、Date1、Date2和group。对于df中hospitalname中的每个名称，我想提取所有日期从Date1到Date2与group为0和1的组之间重叠的组合。

library(zoo)

# 创建示例数据
df <- data.frame(
  studyid = 1:5,
  Date1 = as.yearmon(c("2020-01", "2020-03", "2020-10", "2020-07", "2020-06")),
  Date2 = as.yearmon(c("2020-02", "2020-10", "2021-02", "2020-08", "2020-10")),
  hospitalname = c("Hospital A", "Hospital A", "Hospital A", "Hospital B", "Hospital B"),
  group = c(0, 1, 0, 1, 0)
)

分析后，我想要的结果如下：

result <- data.frame(
  studyid.0 = c("3","5"),
  Date1_0 = as.yearmon(c("2020-10", "2020-06")),
  Date2_0 = as.yearmon(c("2021-02", "2020-10")),
  studyid.1 = c("2","4"),
  Date1_1 = as.yearmon(c("2020-03", "2020-07")),
  Date2_1 = as.yearmon(c("2020-10", "2020-08")),
  hospitalname = c("Hospital A", "Hospital B")
)

非常感谢您的支持。

英文:

There are five variables in df: studyid, hospitalname, Date1, Date2, and group. For each name in hospitalname in df, I would like to extract all combinations where the period from Date1 to Date2 overlaps between groups with group of 0 and 1.

library(zoo)

# create example data
df &lt;- data.frame(
  studyid = 1:5,
  Date1 = as.yearmon(c(&quot;2020-01&quot;, &quot;2020-03&quot;, &quot;2020-10&quot;, &quot;2020-07&quot;, &quot;2020-06&quot;)),
  Date2 = as.yearmon(c(&quot;2020-02&quot;, &quot;2020-10&quot;, &quot;2021-02&quot;, &quot;2020-08&quot;, &quot;2020-10&quot;)),
  hospitalname = c(&quot;Hospital A&quot;, &quot;Hospital A&quot;, &quot;Hospital A&quot;, &quot;Hospital B&quot;, &quot;Hospital B&quot;),
  group = c(0, 1, 0, 1, 0)
)

After the analysis, I would like to have the result like this

result &lt;- data.frame(
  studyid.0 = c(&quot;3&quot;,&quot;5&quot;),
  Date1_0 = as.yearmon(c(&quot;2020-10&quot;, &quot;2020-06&quot;)),
  Date2_0 = as.yearmon(c(&quot;2021-02&quot;, &quot;2020-10&quot;)),
  studyid.1 = c(&quot;2&quot;,&quot;4&quot;),
  Date1_1 = as.yearmon(c(&quot;2020-03&quot;, &quot;2020-07&quot;)),
  Date2_1 = as.yearmon(c(&quot;2020-10&quot;, &quot;2020-08&quot;)),
  hospitalname = c(&quot;Hospital A&quot;, &quot;Hospital B&quot;)
)

I truly appreciate your support.

答案1

得分: 1

library(dplyr)
library(zoo)

result <- df %>%
      inner_join(df, by = "hospitalname") %>%
      filter(group.x == 0, group.y == 1,
             Date1.x <= Date2.y, Date1.y <= Date2.x) %>%
      select(studyid.0 = studyid.x, Date1_0 = Date1.x, Date2_0 = Date2.x,
             studyid.1 = studyid.y, Date1_1 = Date1.y, Date2_1 = Date2.y,
             hospitalname) %>%
      distinct()

result
  studyid.0  Date1_0  Date2_0 studyid.1  Date1_1  Date2_1 hospitalname
1         3 Oct 2020 Feb 2021         2 Mar 2020 Oct 2020   Hospital A
2         5 Jun 2020 Oct 2020         4 Jul 2020 Aug 2020   Hospital B

或者，如@onyambu所提到的，您可以在inner_join()函数中添加 'suffix = c('_0', '_1')'，而不是使用select():

result <- df %>%
  inner_join(df, by = "hospitalname", suffix = c("_0", "_1")) %>%
  filter(group_0 == 0, group_1 == 1,
         Date1_0 <= Date2_1, Date1_1 <= Date2_0) %>%
  distinct()

这个 'suffix' 意味着所有具有相同名称的列将在第一个数据框中添加 '_0' 后缀，而在第二个数据框中添加 '_1' 后缀。

英文:

library(dplyr)
library(zoo)

result &lt;- df %&gt;%
      inner_join(df, by = &quot;hospitalname&quot;) %&gt;%
      filter(group.x == 0, group.y == 1,
             Date1.x &lt;= Date2.y, Date1.y &lt;= Date2.x) %&gt;%
      select(studyid.0 = studyid.x, Date1_0 = Date1.x, Date2_0 = Date2.x,
             studyid.1 = studyid.y, Date1_1 = Date1.y, Date2_1 = Date2.y,
             hospitalname) %&gt;%
      distinct()


result
  studyid.0  Date1_0  Date2_0 studyid.1  Date1_1  Date2_1 hospitalname
1         3 Oct 2020 Feb 2021         2 Mar 2020 Oct 2020   Hospital A
2         5 Jun 2020 Oct 2020         4 Jul 2020 Aug 2020   Hospital B

The inner_join() function creates combinations of rows with the same hospital name. Next, using filter() function you can choose rows where the date periods overlap between groups with group of 0 and 1. Then, using select() function renames the columns to match your desired output. Finally, the distinct() function removes duplicated rows created by the join function.

Alternatively, as was mentioned by @onyambu, you can add ' suffix = c('_0', '_1') ' within the inner_join() function instead of using select():

result &lt;- df %&gt;%
  inner_join(df, by = &quot;hospitalname&quot;, suffix = c(&quot;_0&quot;, &quot;_1&quot;)) %&gt;%
  filter(group_0 == 0, group_1 == 1,
         Date1_0 &lt;= Date2_1, Date1_1 &lt;= Date2_0) %&gt;%
  distinct()

This 'suffix' means that all columns with the same name will be suffixed with '_0' for the first data frame and '_1' for the second data frame.

答案2

得分: 0

df %>%
  mutate(id = cumsum(group)) %>%
  pivot_wider(id_cols = c(id, hospitalname), names_from = group,
               values_from = studyid:Date2, names_vary = 'slowest') %>%
  drop_na()

一个数据框：2 × 8

 id       hospitalname  studyid_0  Date1_0   Date2_0   studyid_1  Date1_1   Date2_1

1 1 Hospital A 3 Oct 2020 Feb 2021 2 Mar 2020 Oct 2020
2 2 Hospital B 5 Jun 2020 Oct 2020 4 Jul 2020 Aug 2020


<details>
<summary>英文:</summary>

       df %&gt;%
         mutate(id=cumsum(group)) %&gt;%
         pivot_wider(id_cols = c(id, hospitalname), names_from = group, 
                      values_from = studyid:Date2, names_vary = &#39;slowest&#39;) %&gt;%
         drop_na()

    # A tibble: 2 &#215; 8
         id hospitalname studyid_0 Date1_0   Date2_0   studyid_1 Date1_1   Date2_1  
      &lt;dbl&gt; &lt;chr&gt;            &lt;int&gt; &lt;yearmon&gt; &lt;yearmon&gt;     &lt;int&gt; &lt;yearmon&gt; &lt;yearmon&gt;
    1     1 Hospital A           3 Oct 2020  Feb 2021          2 Mar 2020  Oct 2020 
    2     2 Hospital B           5 Jun 2020  Oct 2020          4 Jul 2020  Aug 2020 

</details>

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何提取每家医院中重叠的住院期间？

问题

答案1

答案2

一个数据框：2 × 8

删除包含特定字符串的所有行在 R 中

用rgl在球上填充一个区域

R: 用于饼图的默认调色板

尝试制作带有多条线的折线图，但得到了一个空的图表。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论