英文:
Find the very last character of the previous row in R
问题
以下是翻译的内容:
让我们假设我有一个像这样的数据框:
d <- data.frame(order = c("101", "01", "10", "01", "101"),
dur_1 = c(50, 20, 40, 25, 45),
dur_2 = c(40, 30, 45, 34, 96),
dur_3 = c(20, 0, 0, 0, 125))
我想要做的是创建新的列 pre_order
和 pre_dur
。
pre_order
是前一个 order
的最后一个字符。所以第一行应该是 NA,因为我们不知道前一个 order
的值。
pre_dur
是前一个 dur_
列的最后一个非负值。我想要表示为 dur_
,因为在我的实际数据中,我必须处理一个通用版本。所以我必须将其表示为 dur_
。我的预期输出应该如下所示:
d1<-data.frame(order=c("101","01","10","01","101"),
dur_1=c(50,20,40,25,45),
dur_2=c(40,30,45,34,96),
dur_3=c(20,0,0,0,125),
pre_order=c(NA,1,1,0,1),
pre_dur=c(NA,20,30,45,34))
我不知道如何在R中实现这一点。
英文:
Let's say I have a dataframe like this:
d <- data.frame(order = c("101", "01", "10", "01", "101"),
dur_1 = c(50, 20, 40, 25, 45),
dur_2 = c(40, 30, 45, 34, 96),
dur_3 = c(20, 0, 0, 0, 125))
What I want to do is make new columns pre_order
and pre_dur
.
pre_order
is the very last character of the previous order
. So the first row should be NA. Because we don't know the previous value of order
pre_dur
is the very last nonnegative value of the previous dur_
column. I want to express as dur_
beacause, in my actual data, I have to deal with a generalized version. So I have to express it as dur_
. My expected output should look like this:
d1<-data.frame(order=c("101","01","10","01","101"),
dur_1=c(50,20,40,25,45),
dur_2=c(40,30,45,34,96),
dur_3=c(20,0,0,0,125),
pre_order=c(NA,1,1,0,1),
pre_dur=c(NA,20,30,45,34))
I don't know how to do that in R
答案1
得分: 3
使用 max.col
和 substring
的基本变体。
i <- which(startsWith(names(d), "dur"))
j <- max.col(d[i] > 0, "last")
cbind(d, rbind(NA, data.frame(pre_order = substring(d$order, nchar(d$order)),
pre_dur = d[cbind(seq_len(nrow(d)), i[j])])[-nrow(d),]))
# order dur_1 dur_2 dur_3 pre_order pre_dur
#1 101 50 40 20 <NA> <NA>
#2 01 20 30 0 1 20
#3 10 40 45 0 1 30
#4 01 25 34 0 0 45
#5 101 45 96 125 1 34
英文:
A base variant using max.col
and substring
.
i <- which(startsWith(names(d), "dur"))
j <- max.col(d[i] > 0, "last")
cbind(d, rbind(NA,data.frame(pre_order = substring(d$order, nchar(d$order)),
pre_dur = d[cbind(seq_len(nrow(d)), i[j])])[-nrow(d),]))
# order dur_1 dur_2 dur_3 pre_order pre_dur
#1 101 50 40 20 <NA> <NA>
#2 01 20 30 0 1 20
#3 10 40 45 0 1 30
#4 01 25 34 0 0 45
#5 101 45 96 125 1 34
答案2
得分: 1
以下是您要翻译的内容:
Edit:
This approach solves the first part of your question but doesn't account for the "what if I have an unknown number of 'dur_' columns?" part of the question. I tried a few different solutions using purrr::map()
but I wasn't able to get the correct output with your criteria.
GKi's solution successfully answers both parts of your question.
Original Answer:
If I've understood correctly, this is one potential solution:
library(tidyverse)
d <- data.frame(order = c("101", "01", "10", "01", "101"),
dur_1 = c(50, 20, 40, 25, 45),
dur_2 = c(40, 30, 45, 34, 96),
dur_3 = c(20, 0, 0, 0, 125))
d1<-data.frame(order=c("101","01","10","01","101"),
dur_1=c(50,20,40,25,45),
dur_2=c(40,30,45,34,96),
dur_3=c(20,0,0,0,125),
pre_order=c(NA,1,1,0,1),
pre_dur=c(NA,20,30,45,34))
d1
#> order dur_1 dur_2 dur_3 pre_order pre_dur
#> 1 101 50 40 20 NA NA
#> 2 01 20 30 0 1 20
#> 3 10 40 45 0 1 30
#> 4 01 25 34 0 0 45
#> 5 101 45 96 125 1 34
d %>%
mutate(pre_order = lag(str_sub(order, -1)),
pre_dur = case_when(lag(dur_3) > 0 ~ lag(dur_3),
lag(dur_2) > 0 ~ lag(dur_2),
lag(dur_1) > 0 ~ lag(dur_1),
TRUE ~ NA_real_))
#> order dur_1 dur_2 dur_3 pre_order pre_dur
#> 1 101 50 40 20 <NA> NA
#> 2 01 20 30 0 1 20
#> 3 10 40 45 0 1 30
#> 4 01 25 34 0 0 45
#> 5 101 45 96 125 1 34
Created on 2023-05-22 with reprex v2.0.2
英文:
Edit:
This approach solves the first part of your question but doesn't account for the "what if I have an unknown number of 'dur_' columns?" part of the question. I tried a few different solutions using purrr::map()
but I wasn't able to get the correct output with your criteria.
GKi's solution successfully answers both parts of your question.
Original Answer:
If I've understood correctly, this is one potential solution:
library(tidyverse)
d <- data.frame(order = c("101", "01", "10", "01", "101"),
dur_1 = c(50, 20, 40, 25, 45),
dur_2 = c(40, 30, 45, 34, 96),
dur_3 = c(20, 0, 0, 0, 125))
d1<-data.frame(order=c("101","01","10","01","101"),
dur_1=c(50,20,40,25,45),
dur_2=c(40,30,45,34,96),
dur_3=c(20,0,0,0,125),
pre_order=c(NA,1,1,0,1),
pre_dur=c(NA,20,30,45,34))
d1
#> order dur_1 dur_2 dur_3 pre_order pre_dur
#> 1 101 50 40 20 NA NA
#> 2 01 20 30 0 1 20
#> 3 10 40 45 0 1 30
#> 4 01 25 34 0 0 45
#> 5 101 45 96 125 1 34
d %>%
mutate(pre_order = lag(str_sub(order, -1)),
pre_dur = case_when(lag(dur_3) > 0 ~ lag(dur_3),
lag(dur_2) > 0 ~ lag(dur_2),
lag(dur_1) > 0 ~ lag(dur_1),
TRUE ~ NA_real_))
#> order dur_1 dur_2 dur_3 pre_order pre_dur
#> 1 101 50 40 20 <NA> NA
#> 2 01 20 30 0 1 20
#> 3 10 40 45 0 1 30
#> 4 01 25 34 0 0 45
#> 5 101 45 96 125 1 34
<sup>Created on 2023-05-22 with reprex v2.0.2</sup>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论