Posted: 2023-04-19 21:58:25

Group rows of a data frame by multiple ranges of the same column

Question

Given this data:

id <- c("1","1", "1","2","2","2","3","3","3","4","4","4","5","5","5","6","6","6")
value <- c("1", "2", "3", "4", "5", "6", "7", "8","9","10","11","12","13","14","15","16","17","18")
value2 <- c("1", "2", "3", "4", "5", "6", "7", "8","9","10","11","12","13","14","15","16","17","18")
value3 <- c("1", "2", "3", "4", "5", "6", "7", "8","9","10","11","12","13","14","15","16","17","18")
df <- data.frame(id, value, value2, value3)

I would like to group the rows into two groups defined by multiple ranges of id (group1: ids 1-2 and 5-6; group2: ids 3-4) and sum the values within each group, so that the end result is as follows:

newname <- c("newname1", "newname2")
sumvalues <- c("114", "57")
sumvalues2 <- c("114", "57")
sumvalues3 <- c("114", "57")
df2 <- data.frame(newname, sumvalues, sumvalues2, sumvalues3)

I have tried the following, which works when each new group (newname) covers a single range, but I can't figure out how to combine several ranges into one new group:

data_values_range <- data_values %>%                        # Aggregate values in range
  mutate(ranges = cut(group,
                      seq(1, 6, 1))) %>%
  group_by(ranges) %>%
  dplyr::summarize(sumvalues = sum(value)) %>%
  as.data.frame()
data_values_range

If there is more than one column other than id, I would like the end result to show the sum of each of those columns, grouped by the new groups.

Answer 1

Score: 1

We could use case_match() to map each id to its new group and then sum the numeric columns:

library(dplyr) # >= 1.1.0
df %>%
  # convert the character columns to integers
  type.convert(as.is = TRUE) %>%
  # assign each id to a new group; unlisted ids fall into 'other'
  group_by(newname = case_match(id,
                                c(1, 2, 5, 6) ~ 'newname1',
                                c(3, 4) ~ 'newname2',
                                .default = 'other')) %>%
  select(-id) %>%
  # sum every remaining numeric column within each group
  reframe(across(where(is.numeric), ~ sum(.x, na.rm = TRUE),
                 .names = "sum{.col}"))

-output

# A tibble: 2 × 4
  newname  sumvalue sumvalue2 sumvalue3
  <chr>       <int>     <int>     <int>
1 newname1      114       114       114
2 newname2       57        57        57

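To make the id-to-group mapping in this answer easier to inspect on its own, here is a minimal base R sketch of the same idea. It is not the answerer's code: it swaps `case_match()` for a named lookup vector and `reframe()` for `aggregate()`, and the names `lookup` and `result` are made up for illustration. It assumes the data from the question.

```r
# Rebuild the question's data; every column starts out as character.
id <- rep(as.character(1:6), each = 3)
vals <- as.character(1:18)
df <- data.frame(id, value = vals, value2 = vals, value3 = vals)
df <- type.convert(df, as.is = TRUE)   # character -> integer

# Hypothetical lookup vector: names are ids, values are the new group labels.
lookup <- c("1" = "newname1", "2" = "newname1", "5" = "newname1",
            "6" = "newname1", "3" = "newname2", "4" = "newname2")

# Label each row by looking up its id; ids missing from the lookup become "other".
newname <- unname(lookup[as.character(df$id)])
newname[is.na(newname)] <- "other"

# Sum every value column within each new group.
result <- aggregate(df[c("value", "value2", "value3")],
                    by = list(newname = newname), FUN = sum)
result
```

Printing `result` should give one row per group, with 114 in every column for newname1 and 57 for newname2, in line with the sums the question asks for.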

Answer 2

Score: 0

You can create a named list of the groups you want, reshape it to long format, and join it with the original df to sum value for each unique name.

library(tidyverse)
groups <- list(newname1 = c(1, 2, 5, 6), newname2 = c(3, 4))
enframe(groups, value = "new_value") %>%
  unnest(new_value) %>%
  inner_join(df, join_by(new_value == id), multiple = "all") %>%
  summarise(value = sum(value), .by = name)
#   name     value
#  <chr>    <int>
#1 newname1   114
#2 newname2    57

data

I am not sure why the numbers are stored as characters in the data frame df. Using type.convert changes them to numbers. Run this conversion before the code above, since the join and the sum both need numeric columns.

df <- type.convert(df, as.is = TRUE)
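
The answer above sums only `value`, while the question also asks for `value2` and `value3`. As a rough sketch of the same "long lookup table plus join" idea extended to all three columns, the following uses base R only (`stack()`, `merge()`, `aggregate()`) in place of the tidyverse calls. The names `long`, `merged`, and `result` are illustrative assumptions, not part of the original answer.

```r
# Rebuild the question's data and convert the character columns to integers.
id <- rep(as.character(1:6), each = 3)
vals <- as.character(1:18)
df <- type.convert(data.frame(id, value = vals, value2 = vals, value3 = vals),
                   as.is = TRUE)

groups <- list(newname1 = c(1, 2, 5, 6), newname2 = c(3, 4))

# stack() turns the named list into long format: one row per (id, name) pair.
long <- stack(groups)
names(long) <- c("id", "name")

# merge() is an inner join by default, so only ids listed in a group are kept.
merged <- merge(long, df, by = "id")

# Sum all three value columns per group name.
result <- aggregate(cbind(value, value2, value3) ~ name, data = merged, FUN = sum)
result
```

Keeping the group definitions in one named list means that adding a range to a group is a one-line change to `groups`, with no change to the join or the summary.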
