R: Taking an average of a variable over intervals of another numeric variable

Question

How would I go about calculating an average of column 2 for every x interval in column 1, when the number of rows in each interval is not always equal?

It seems very simple but I'm not sure where to start.

df <- data.frame(dist = c(0.06,0.22,0.38,0.44,0.5,0.52,0.6,0.74,0.76,0.88,0.92,0.94,1,1.18,1.26,1.3,1.4,1.48,1.5),
                 value = c(12,54.6,46.6,59.7,65.4,66.4,67,76.5,77.3,94.5,95.5,95,93.7,106.5,112.3,112.4,112.6,114.3,114.2))

Let's say I want the block average of column 2 where column 1 runs from 0 to 0.5, then 0.5 to 1, 1 to 1.5, and so on. If 0 to 0.5 spans 5 rows and 0.5 to 1 spans 9 rows, what is the best way to do this without having to specify the row numbers?

I have tried searching but perhaps I'm not using the right keywords.

Answer 1

Score: 2

You can use cut() to bin dist into intervals and then average value within each bin:

tapply(df$value, cut(df$dist, seq(0, 1.5, .5)), FUN = mean)
# (0,0.5]  (0.5,1]  (1,1.5] 
# 47.6600  83.2375 112.0500 

Or, if you prefer dplyr:

library(dplyr)  # provides %>%, group_by(), summarise()

df %>%
  group_by(gp = cut(dist, seq(0, 1.5, .5))) %>%
  summarise(mean = mean(value)) %>%
  ungroup()

#       gp     mean
#1 (0,0.5]  47.6600
#2 (0.5,1]  83.2375
#3 (1,1.5] 112.0500
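
One detail worth checking is which side of each interval is closed. By default cut() builds right-closed bins, so a dist of exactly 0.5 is counted in (0,0.5], and a dist of exactly 0 would fall outside every bin and become NA. The sketch below reuses the question's data and compares the default with left-closed bins through the right and include.lowest arguments of cut(). The names breaks, right_closed, and left_closed are only illustrative and not part of the original answer.

```r
# Same data as in the question.
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

breaks <- seq(0, 1.5, by = 0.5)

# Default behaviour: right-closed bins (a, b], so dist == 0.5 lands in (0,0.5].
right_closed <- tapply(df$value, cut(df$dist, breaks), mean)

# Left-closed bins [a, b). With right = FALSE, include.lowest = TRUE also
# closes the last bin on the right, so dist == 1.5 is not dropped as NA.
left_closed <- tapply(
  df$value,
  cut(df$dist, breaks, right = FALSE, include.lowest = TRUE),
  mean
)

print(right_closed)
print(left_closed)
```

Which convention is appropriate depends on how the boundary rows (0.5, 1, 1.5) should be assigned; the two calls give different means because those rows move between bins.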

Answer 2

Score: 2

Using aggregate() in base R:

aggregate(value ~ grp, transform(df, grp = cut(dist, seq(0, 1.5, .5))), mean)

Output:

      grp    value
1 (0,0.5]  47.6600
2 (0.5,1]  83.2375
3 (1,1.5] 112.0500
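
Both answers hard-code the upper break at 1.5. If the range of dist is not known in advance, the breaks could be derived from the data instead. The following is a minimal base R sketch under that assumption; width, upper, bin, and summary_by_bin are hypothetical names chosen for illustration. It also reports the number of rows per bin, which makes the unequal group sizes visible.

```r
# Same data as in the question.
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

width <- 0.5  # bin width (assumed; matches the intervals in the question)

# Round the largest dist up to the next multiple of `width` so that the
# last break is at or above every observation.
upper  <- ceiling(max(df$dist) / width) * width
breaks <- seq(0, upper, by = width)

# Right-closed bins, as in the answers above; a dist of exactly 0 would be NA.
bin <- cut(df$dist, breaks)

# One row per bin: label, number of observations, and mean of value.
summary_by_bin <- data.frame(
  bin  = levels(bin),
  n    = as.vector(table(bin)),
  mean = as.vector(tapply(df$value, bin, mean))
)

print(summary_by_bin)
```

For widths that are not exactly representable in floating point (0.1, for example), the computed breaks may need rounding before they are passed to cut().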

huangapple
  • Posted on March 7, 2023 at 23:43:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75664130.html