Count events in R per row depending on data range

Question

I have a dataset of 1000 patients (Patient 1-1000) undergoing the same type of procedure but by different surgeons. I would like to count the number of operations each surgeon performed from the study start date (27/06/2023, for example) up to, but not including, each procedure, and insert this count into each patient's row. For example, I need to know how many operations surgeon A performed on previous patients (1 and 2) before operating on patient 3 (here, 2 operations). The same applies to surgeon B, and so on.

I guess there is some formula in dplyr, but I cannot get my head around it.

Patient  Surgeon  Operation Date  Event before index (operation date)
1        A        28/06/2023      0
2        A        29/06/2023      1
3        A        30/06/2023      2
4        B        1/07/2023       0
5        C        2/07/2023       1
6        C        3/07/2023       2

Answer 1

Score: 2

It looks like your desired result is the row number for each surgeon minus one.

library(dplyr)

your_data |>
  mutate(event_before_index = row_number() - 1, .by = Surgeon)
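
As a quick check, the sketch below rebuilds the six rows from the question as a data frame and applies the same `mutate()`. The object name `your_data` and the column name `Operation_Date` are assumptions, and the `.by` argument needs dplyr 1.1.0 or later.

```r
library(dplyr)

# Sample rows copied from the question, already in date order within each surgeon.
# `your_data` and `Operation_Date` are assumed names.
your_data <- data.frame(
  Patient = 1:6,
  Surgeon = c("A", "A", "A", "B", "C", "C"),
  Operation_Date = as.Date(c("2023-06-28", "2023-06-29", "2023-06-30",
                             "2023-07-01", "2023-07-02", "2023-07-03"))
)

# Within each surgeon, the number of earlier rows is the row number minus one.
result <- your_data |>
  mutate(event_before_index = row_number() - 1, .by = Surgeon)

print(result)
```

On this data, surgeon C's two rows come out as 0 and 1, whereas the question's table lists 1 and 2 for them, so those two values in the question may be a typo.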

This assumes your data is already sorted by date within each surgeon. If it is not, convert the date column to a proper Date class and sort by it before running the command above.

library(lubridate)
your_data |>
  mutate(Operation_Date = dmy(Operation_Date)) |>
  arrange(Surgeon, Operation_Date) |>
  mutate(event_before_index = row_number() - 1, .by = Surgeon)
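
The question also mentions counting only from a study start date, and `row_number()` puts two same-day operations by one surgeon in an arbitrary order. One possible variation, assuming that operations before the start date should be ignored and that same-day operations should not count as "before" each other, is to filter first and then use `min_rank()`, which gives tied dates the same rank. The name `study_start`, the sample rows, and the column names are assumptions for illustration only.

```r
library(dplyr)
library(lubridate)

study_start <- dmy("27/06/2023")  # assumed study start date, taken from the question

# Hypothetical rows: one operation before the study start and a same-day pair for surgeon A.
your_data <- data.frame(
  Patient = 1:5,
  Surgeon = c("A", "A", "A", "A", "B"),
  Operation_Date = c("20/06/2023", "28/06/2023", "29/06/2023", "29/06/2023", "1/07/2023")
)

result <- your_data |>
  mutate(Operation_Date = dmy(Operation_Date)) |>
  # Drop operations performed before the study started.
  filter(Operation_Date >= study_start) |>
  # min_rank() - 1 is the number of operations with a strictly earlier date.
  mutate(event_before_index = min_rank(Operation_Date) - 1, .by = Surgeon)

print(result)
```

Because `min_rank()` works on the date values themselves, this version does not depend on the rows being sorted beforehand.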

Answer 2

Score: -1

Please try the code below, which gives the events before the index (operation date) even if a patient was previously operated on more than once by the same surgeon. Check the `new` variable.

However, in some places it does not match the expected values; those rows are marked with #.

library(dplyr)

df_2 <- df %>% group_by(Surgeon,Patient) %>% 
  slice_tail(n=1) %>% group_by(Surgeon) %>% mutate(new=row_number()-1)

df_or <- df %>% 
  left_join(df_2 %>% select(Patient, Surgeon, new), by=c('Surgeon','Patient')) 


# output

# A tibble: 6 × 5
  Patient Surgeon Operation_Date Event_before   new
    <dbl> <chr>   <date>                <dbl> <dbl>
1       1 A       2023-06-28                0     0
2       2 A       2023-06-29                1     1
3       3 A       2023-06-30                2     2
4       4 B       2023-07-01                0     0
5       5 C       2023-07-02                1     0 #
6       6 C       2023-07-03                2     1 #
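
To see what this code does when a patient appears more than once for the same surgeon, here is a small sketch on hypothetical data (the rows and dates are invented for illustration). Both rows for the repeated patient receive the same `new` value, because `slice_tail()` keeps one row per surgeon and patient before the numbering is applied.

```r
library(dplyr)

# Hypothetical data: surgeon A operates on patient 1 twice, then on patient 2.
df <- data.frame(
  Patient = c(1, 1, 2),
  Surgeon = c("A", "A", "A"),
  Operation_Date = as.Date(c("2023-06-28", "2023-06-29", "2023-06-30"))
)

# Keep one row per surgeon/patient pair, then number the patients within each surgeon.
df_2 <- df %>%
  group_by(Surgeon, Patient) %>%
  slice_tail(n = 1) %>%
  group_by(Surgeon) %>%
  mutate(new = row_number() - 1)

# Attach the per-patient counter back to every original row.
df_or <- df %>%
  left_join(df_2 %>% select(Patient, Surgeon, new), by = c("Surgeon", "Patient"))

print(df_or)
```

As far as I can tell, the numbering follows the order of the grouped rows (surgeon, then patient) rather than the operation date, so `new` counts distinct earlier patients only when patient IDs increase with date.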

Posted by huangapple on 29 June 2023 at 01:01:13
Link to this article: https://go.coder-hub.com/76575303.html