2023年5月22日 11:19:21go评论113阅读模式

英文:

Find the very last character of the previous row in R

问题

以下是翻译的内容：

让我们假设我有一个像这样的数据框：

d <- data.frame(order = c("101", "01", "10", "01", "101"),
                dur_1 = c(50, 20, 40, 25, 45),
                dur_2 = c(40, 30, 45, 34, 96),
                dur_3 = c(20, 0, 0, 0, 125))

我想要做的是创建新的列 pre_order 和 pre_dur。
pre_order 是前一个 order 的最后一个字符。所以第一行应该是 NA，因为我们不知道前一个 order 的值。

pre_dur 是前一个 dur_ 列的最后一个非负值。我想要表示为 dur_，因为在我的实际数据中，我必须处理一个通用版本。所以我必须将其表示为 dur_。我的预期输出应该如下所示：

d1<-data.frame(order=c("101","01","10","01","101"),
              dur_1=c(50,20,40,25,45),
              dur_2=c(40,30,45,34,96),
              dur_3=c(20,0,0,0,125),
              pre_order=c(NA,1,1,0,1),
              pre_dur=c(NA,20,30,45,34))

我不知道如何在R中实现这一点。

英文:

Let's say I have a dataframe like this:

d &lt;- data.frame(order = c(&quot;101&quot;, &quot;01&quot;, &quot;10&quot;, &quot;01&quot;, &quot;101&quot;),
                dur_1 = c(50, 20, 40, 25, 45),
                dur_2 = c(40, 30, 45, 34, 96),
                dur_3 = c(20, 0, 0, 0, 125))

What I want to do is make new columns pre_order and pre_dur.
pre_order is the very last character of the previous order. So the first row should be NA. Because we don't know the previous value of order

pre_dur is the very last nonnegative value of the previous dur_ column. I want to express as dur_ beacause, in my actual data, I have to deal with a generalized version. So I have to express it as dur_. My expected output should look like this:

d1&lt;-data.frame(order=c(&quot;101&quot;,&quot;01&quot;,&quot;10&quot;,&quot;01&quot;,&quot;101&quot;),
              dur_1=c(50,20,40,25,45),
              dur_2=c(40,30,45,34,96),
              dur_3=c(20,0,0,0,125),
              pre_order=c(NA,1,1,0,1),
              pre_dur=c(NA,20,30,45,34))

I don't know how to do that in R

答案1

得分: 3

使用 max.col 和 substring 的基本变体。

i <- which(startsWith(names(d), "dur"))
j <- max.col(d[i] > 0, "last")
cbind(d, rbind(NA, data.frame(pre_order = substring(d$order, nchar(d$order)),
pre_dur = d[cbind(seq_len(nrow(d)), i[j])])[-nrow(d),]))
#  order dur_1 dur_2 dur_3 pre_order pre_dur
#1   101    50    40    20      <NA>    <NA>
#2    01    20    30     0         1      20
#3    10    40    45     0         1      30
#4    01    25    34     0         0      45
#5   101    45    96   125         1      34

英文:

A base variant using max.col and substring.

i &lt;- which(startsWith(names(d), &quot;dur&quot;))
j &lt;- max.col(d[i] &gt; 0, &quot;last&quot;)
cbind(d, rbind(NA,data.frame(pre_order = substring(d$order, nchar(d$order)),
pre_dur = d[cbind(seq_len(nrow(d)), i[j])])[-nrow(d),]))
#  order dur_1 dur_2 dur_3 pre_order pre_dur
#1   101    50    40    20      &lt;NA&gt;    &lt;NA&gt;
#2    01    20    30     0         1      20
#3    10    40    45     0         1      30
#4    01    25    34     0         0      45
#5   101    45    96   125         1      34

答案2

得分: 1

以下是您要翻译的内容：

Edit:

This approach solves the first part of your question but doesn't account for the "what if I have an unknown number of 'dur_' columns?" part of the question. I tried a few different solutions using purrr::map() but I wasn't able to get the correct output with your criteria.

GKi's solution successfully answers both parts of your question.

Original Answer:

If I've understood correctly, this is one potential solution:

library(tidyverse)
d <- data.frame(order = c("101", "01", "10", "01", "101"),
                dur_1 = c(50, 20, 40, 25, 45),
                dur_2 = c(40, 30, 45, 34, 96),
                dur_3 = c(20, 0, 0, 0, 125))
d1<-data.frame(order=c("101","01","10","01","101"),
               dur_1=c(50,20,40,25,45),
               dur_2=c(40,30,45,34,96),
               dur_3=c(20,0,0,0,125),
               pre_order=c(NA,1,1,0,1),
               pre_dur=c(NA,20,30,45,34))
d1
#>   order dur_1 dur_2 dur_3 pre_order pre_dur
#> 1   101    50    40    20        NA      NA
#> 2    01    20    30     0         1      20
#> 3    10    40    45     0         1      30
#> 4    01    25    34     0         0      45
#> 5   101    45    96   125         1      34
d %>%
  mutate(pre_order = lag(str_sub(order, -1)),
         pre_dur = case_when(lag(dur_3) > 0 ~ lag(dur_3),
                             lag(dur_2) > 0 ~ lag(dur_2),
                             lag(dur_1) > 0 ~ lag(dur_1),
                             TRUE ~ NA_real_))
#>   order dur_1 dur_2 dur_3 pre_order pre_dur
#> 1   101    50    40    20      <NA>      NA
#> 2    01    20    30     0         1      20
#> 3    10    40    45     0         1      30
#> 4    01    25    34     0         0      45
#> 5   101    45    96   125         1      34

^{Created on 2023-05-22 with reprex v2.0.2}

英文:

Edit:

GKi's solution successfully answers both parts of your question.

Original Answer:

If I've understood correctly, this is one potential solution:

library(tidyverse)
d &lt;- data.frame(order = c(&quot;101&quot;, &quot;01&quot;, &quot;10&quot;, &quot;01&quot;, &quot;101&quot;),
                dur_1 = c(50, 20, 40, 25, 45),
                dur_2 = c(40, 30, 45, 34, 96),
                dur_3 = c(20, 0, 0, 0, 125))
d1&lt;-data.frame(order=c(&quot;101&quot;,&quot;01&quot;,&quot;10&quot;,&quot;01&quot;,&quot;101&quot;),
               dur_1=c(50,20,40,25,45),
               dur_2=c(40,30,45,34,96),
               dur_3=c(20,0,0,0,125),
               pre_order=c(NA,1,1,0,1),
               pre_dur=c(NA,20,30,45,34))
d1
#&gt;   order dur_1 dur_2 dur_3 pre_order pre_dur
#&gt; 1   101    50    40    20        NA      NA
#&gt; 2    01    20    30     0         1      20
#&gt; 3    10    40    45     0         1      30
#&gt; 4    01    25    34     0         0      45
#&gt; 5   101    45    96   125         1      34
d %&gt;%
  mutate(pre_order = lag(str_sub(order, -1)),
         pre_dur = case_when(lag(dur_3) &gt; 0 ~ lag(dur_3),
                             lag(dur_2) &gt; 0 ~ lag(dur_2),
                             lag(dur_1) &gt; 0 ~ lag(dur_1),
                             TRUE ~ NA_real_))
#&gt;   order dur_1 dur_2 dur_3 pre_order pre_dur
#&gt; 1   101    50    40    20      &lt;NA&gt;      NA
#&gt; 2    01    20    30     0         1      20
#&gt; 3    10    40    45     0         1      30
#&gt; 4    01    25    34     0         0      45
#&gt; 5   101    45    96   125         1      34

<sup>Created on 2023-05-22 with reprex v2.0.2</sup>

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

在R中找到上一行的最后一个字符

问题

答案1

答案2

Edit:

Original Answer:

Edit:

Original Answer:

读取CSV文件时出现多次跳过

如何在RMarkdown中为文中引用和图表超链接使用不同的颜色？

为什么加载 tidyverse 库时会出现这个错误？

如何从SQL中读取数值并存入字典的列表中。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

发表评论