March 7, 2023, 23:43:21

R Taking an average of a variable over intervals of another numeric variable

Question

How would I calculate the average of column 2 for each fixed-width interval of column 1, when the intervals do not all contain the same number of rows?

It seems very simple, but I'm not sure where to start.

df <- data.frame(dist = c(0.06,0.22,0.38,0.44,0.5,0.52,0.6,0.74,0.76,0.88,0.92,0.94,1,1.18,1.26,1.3,1.4,1.48,1.5),
                 value = c(12,54.6,46.6,59.7,65.4,66.4,67,76.5,77.3,94.5,95.5,95,93.7,106.5,112.3,112.4,112.6,114.3,114.2))

Let's say I want the block average of column 2 as column 1 goes from 0 to 0.5, then 0.5 to 1, then 1 to 1.5, and so on. If 0 to 0.5 covers 5 rows and 0.5 to 1 covers 9 rows, what is the best way to do this without having to specify the row numbers?

I have tried searching, but perhaps I'm not using the right keywords.

Answer 1

Score: 2

You can use cut to bin the rows by the value of dist, then average value within each bin:

tapply(df$value, cut(df$dist, seq(0, 1.5, .5)), FUN = mean)
# (0,0.5]  (0.5,1]  (1,1.5] 
# 47.6600  83.2375 112.0500

Or, if you prefer dplyr:

library(dplyr)

df %>%
  group_by(gp = cut(dist, seq(0, 1.5, .5))) %>%
  summarise(mean = mean(value)) %>%
  ungroup()

#       gp     mean
#1 (0,0.5]  47.6600
#2 (0.5,1]  83.2375
#3 (1,1.5] 112.0500
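
A note on the bin edges, since the data contains values that sit exactly on a break (0.5, 1 and 1.5). By default cut builds right-closed intervals, so dist == 0.5 is counted in (0,0.5], and a dist of exactly 0 would become NA. The sketch below compares that default with left-closed bins using cut's right and include.lowest arguments. The variable names breaks, right_closed and left_closed are only illustrative and do not come from the answer.

```r
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

breaks <- seq(0, 1.5, by = 0.5)

# Default: right-closed bins, so dist == 0.5 is counted in (0,0.5].
right_closed <- tapply(df$value, cut(df$dist, breaks), mean)

# Left-closed bins: dist == 0.5 now falls in the second bin.
# include.lowest = TRUE closes the last bin on the right, so that
# dist == 1.5 is kept instead of being turned into NA.
left_closed <- tapply(
  df$value,
  cut(df$dist, breaks, right = FALSE, include.lowest = TRUE),
  mean
)

# Number of rows per bin under each convention
table(cut(df$dist, breaks))
table(cut(df$dist, breaks, right = FALSE, include.lowest = TRUE))

right_closed
left_closed
```

Which convention is appropriate depends on how the intervals are meant to be read. The averages differ between the two only because the rows lying exactly on a break move to the neighbouring bin.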

Answer 2

Score: 2

Using aggregate in base R:

aggregate(value ~ grp, transform(df, grp = cut(dist, seq(0, 1.5, .5))), mean)

Output:

      grp    value
1 (0,0.5]  47.6600
2 (0.5,1]  83.2375
3 (1,1.5] 112.0500
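
Both answers hard-code the upper break at 1.5, while the question asks for intervals that continue "and so on". One possible way to avoid the hard-coded limit is to derive the breaks from the data. This is a minimal sketch, assuming the bins start at 0 and share a fixed width; width, upper, breaks, binned and result are hypothetical names introduced here, not part of either answer.

```r
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

width <- 0.5  # assumed bin width

# Round the largest dist up to the next multiple of width so that
# the last bin covers the maximum value.
upper  <- ceiling(max(df$dist) / width) * width
breaks <- seq(0, upper, by = width)

# Same aggregate call as in the answer, with the computed breaks
binned <- transform(df, grp = cut(dist, breaks))
result <- aggregate(value ~ grp, data = binned, FUN = mean)
result
```

For this data the computed breaks are 0, 0.5, 1 and 1.5, so the result matches the output above. With widths that are not exactly representable in floating point (0.1, for example), a value sitting on a break may land in an unexpected bin, so it is worth checking the edges in that case.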
