2023-03-07 08:52:38

Splitting and aggregating results without a loop

Question

I have a data frame whose results I would like to group by name and type. If a scenario contains "Up" or "Down", I want to report those values as separate columns on a single row, where "Value" is the minimum of the two results.
Using string splitting and loops, I can work through the columns, split the strings, replace missing entries with N/A, and so on, but this feels incredibly clumsy.
Can anyone suggest a better way to produce the preferred output that doesn't use a loop?
My input looks like:

input = structure(list(Name = c("Alice", "Alice ", "Tim", "Tim", "Greg",
"Greg"), Value = c("-5", "6", "5", "-2", "5", "7"), Type = c("Sales",
"Sales", "Returns", "Returns", "Promo", "Promo"), Scenario = c("Down",
"Up", "Down_RED_One", "Up_RED_One", "BLUE", "YELLOW")), row.names = c(NA,
6L), class = "data.frame")


And my preferred output is:

output = structure(list(Name = c("Alice", "Tim", "Greg", "Greg"), Value = c("-5",
"-2", "5", "7"), Type = c("Sales", "Returns", "Promo", "Promo"
), Scenario = c("N/A", "RED_One", "BLUE", "YELLOW"), Up = c("6",
"-2", "N/A", "N/A"), Down = c("-5", "5", "N/A", "N/A")), row.names = 1:4, class = "data.frame")


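Answer 1

Score: 2

Do some initial transformation to extract "Up" and "Down" and get your minima, then use tidyr::pivot_wider():


library(dplyr)   # >= v1.1.0 (needed for the .by argument)
library(stringr)
library(tidyr)
input %>%
  mutate(
    # Pull out the direction ("Up" or "Down"); NA when neither is present
    UpDown = str_extract(Scenario, "Up|Down"),
    # Strip the direction and any adjoining underscore; empty strings become NA
    Scenario = na_if(str_remove(Scenario, "_?(Up|Down)_?"), ""),
    # Keep a copy of the original value to spread into the Up/Down columns
    ValuePivot = Value
  ) %>%
  # Minimum value within each Name/Scenario group
  mutate(Value = min(Value), .by = c(Name, Scenario)) %>%
  pivot_wider(names_from = UpDown, values_from = ValuePivot) %>%
  # Rows without a direction end up in a column named "NA"; drop it
  select(!`NA`)

# A tibble: 4 × 6
  Name  Value Type    Scenario Down  Up   
  <chr> <chr> <chr>   <chr>    <chr> <chr>
1 Alice -5    Sales   <NA>     -5    6    
2 Tim   -2    Returns RED_One  5     -2   
3 Greg  5     Promo   BLUE     <NA>  <NA> 
4 Greg  7     Promo   YELLOW   <NA>  <NA>

Note that one "Alice" value had a trailing space, which I assumed was an error and removed from the input before running this.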
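
One caveat with the pipeline above: Value is stored as character, so min() compares strings rather than numbers, which happens to give the right answer for this data but would not in general (for example, "10" sorts before "9"). The following is a minimal sketch of a variant, not the answerer's code, that assumes the trailing space in "Alice " is a typo and trims it inside the pipeline, converts Value to numeric before taking the minimum, and adds Type to the grouping because the question asks for grouping by name and type. The name `result` is only illustrative.

```r
library(dplyr)   # .by requires dplyr >= 1.1.0
library(stringr)
library(tidyr)

# Same data as the question's input, written out so the sketch is self-contained
input <- data.frame(
  Name     = c("Alice", "Alice ", "Tim", "Tim", "Greg", "Greg"),
  Value    = c("-5", "6", "5", "-2", "5", "7"),
  Type     = c("Sales", "Sales", "Returns", "Returns", "Promo", "Promo"),
  Scenario = c("Down", "Up", "Down_RED_One", "Up_RED_One", "BLUE", "YELLOW"),
  stringsAsFactors = FALSE
)

result <- input %>%
  mutate(
    Name       = str_trim(Name),      # assumption: the trailing space is a typo
    Value      = as.numeric(Value),   # compare as numbers, not as strings
    UpDown     = str_extract(Scenario, "Up|Down"),
    Scenario   = na_if(str_remove(Scenario, "_?(Up|Down)_?"), ""),
    ValuePivot = Value
  ) %>%
  # Group by Type as well, as the question describes (an addition to the answer)
  mutate(Value = min(Value), .by = c(Name, Type, Scenario)) %>%
  pivot_wider(names_from = UpDown, values_from = ValuePivot) %>%
  # Keep the column order of the preferred output; this also drops the "NA" column
  select(Name, Value, Type, Scenario, Up, Down)
```

With numeric columns, missing entries are real NA values rather than the literal "N/A" strings in the preferred output; reproducing those strings would mean converting the columns back to character first.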


Answer 2

Score: 1

My answer works, but I can't help but think there has to be a better way.

I have added several comments to explain what does what and why. If you have any questions, let me know.

library(tidyverse)
input %>%
  separate_wider_regex(Scenario,                                # separate "Up" and "Down"
                       patterns = c(Up = "Up|Down|", Scenario = ".*"),
                       too_few = "error") %>%
  mutate(Scenario = trimws(Scenario, which = "left", whitespace = "_"), # drop leading _
         Name = trimws(Name, "both"),       # drop whitespace causing mismatch
         Up = ifelse(Up == "", NA, Up)) %>% # change blank strings to NA for pivot
  pivot_wider(values_from = Value, names_from = Up)
# # A tibble: 4 × 6
#   Name  Type    Scenario  Down  Up    `NA` 
#   <chr> <chr>   <chr>     <chr> <chr> <chr>
# 1 Alice Sales   ""        -5    6     <NA> 
# 2 Tim   Returns "RED_One" 5     -2    <NA> 
# 3 Greg  Promo   "BLUE"    <NA>  <NA>  5    
# 4 Greg  Promo   "YELLOW"  <NA>  <NA>  7
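
The result above still lacks the Value column from the preferred output and keeps the `NA` column. One possible way to finish the job is sketched below; it is an illustration rather than part of the answer. It assumes the pivoted table has exactly the columns printed above (rebuilt by hand here so the sketch is self-contained), and the names `wide`, `final`, and `Other` are hypothetical.

```r
library(dplyr)

# The pivoted table from the output above, rebuilt by hand
wide <- data.frame(
  Name     = c("Alice", "Tim", "Greg", "Greg"),
  Type     = c("Sales", "Returns", "Promo", "Promo"),
  Scenario = c("", "RED_One", "BLUE", "YELLOW"),
  Down     = c("-5", "5", NA, NA),
  Up       = c("6", "-2", NA, NA),
  `NA`     = c(NA, NA, "5", "7"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

final <- wide %>%
  rename(Other = "NA") %>%                     # give the no-direction column a usable name
  mutate(
    across(c(Down, Up, Other), as.numeric),    # compare as numbers, not as strings
    Value    = coalesce(Other, pmin(Up, Down)), # plain value, else the smaller of Up/Down
    Scenario = na_if(Scenario, "")             # blank scenario becomes NA
  ) %>%
  select(Name, Value, Type, Scenario, Up, Down)
```

Here pmin() takes the row-wise minimum of Up and Down, and coalesce() falls back to it only where the scenario had no direction to begin with.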

