R Taking an average of a variable over intervals of another numeric variable
Question
How would I calculate the average of column 2 for every interval of width x in column 1, when the number of rows in each interval is not always equal?
It seems very simple but I'm not sure where to start.
df <- data.frame(dist = c(0.06,0.22,0.38,0.44,0.5,0.52,0.6,0.74,0.76,0.88,0.92,0.94,1,1.18,1.26,1.3,1.4,1.48,1.5),
value = c(12,54.6,46.6,59.7,65.4,66.4,67,76.5,77.3,94.5,95.5,95,93.7,106.5,112.3,112.4,112.6,114.3,114.2))
Let's say I want the block average of column 2 when column 1 goes from 0 to 0.5, then 0.5 to 1, 1 to 1.5, and so on. If 0 to 0.5 spans 5 rows and 0.5 to 1 spans 9 rows, what is the best way to do this without having to specify the row numbers?
I have tried searching, but perhaps I'm not using the right keywords.
Answer 1
Score: 2
You can use cut() to group the rows according to the value of dist:
tapply(df$value, cut(df$dist, seq(0, 1.5, .5)), FUN = mean)
# (0,0.5] (0.5,1] (1,1.5]
# 47.6600 83.2375 112.0500
Or, if you prefer dplyr:
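library(dplyr)  # provides %>%, group_by(), summarise() and ungroup()
df %>%
  group_by(gp = cut(dist, seq(0, 1.5, .5))) %>%  # bin dist into (0,0.5], (0.5,1], (1,1.5]
  summarise(mean = mean(value)) %>%              # one mean per bin
  ungroup()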
# gp mean
#1 (0,0.5] 47.6600
#2 (0.5,1] 83.2375
#3 (1,1.5] 112.0500
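A note on the bin edges, offered as a minimal sketch rather than as part of the original answer: by default cut() builds intervals that are open on the left and closed on the right, so dist = 0.5 lands in (0,0.5] and a dist of exactly 0 would become NA. The right and include.lowest arguments of cut() change that behaviour. The data frame is copied from the question; the names breaks, default_bins, left_closed, means_default and means_left are illustrative only.

```r
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

breaks <- seq(0, 1.5, by = 0.5)

# Default: right-closed bins (0,0.5], (0.5,1], (1,1.5].
# A dist of exactly 0 would fall outside the first bin and become NA.
default_bins <- cut(df$dist, breaks)

# Left-closed alternative: [0,0.5), [0.5,1), [1,1.5].
# include.lowest = TRUE closes the last bin so that dist = 1.5 is kept.
left_closed <- cut(df$dist, breaks, right = FALSE, include.lowest = TRUE)

means_default <- tapply(df$value, default_bins, mean)
means_left    <- tapply(df$value, left_closed, mean)

table(left_closed)  # rows per bin under the left-closed convention
means_left
```

Which convention is appropriate depends on where the boundary values 0.5 and 1 are meant to be counted; the two choices give different means for this data because those rows move between bins.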
Answer 2
Score: 2
Using aggregate() in base R:
aggregate(value ~ grp, transform(df, grp = cut(dist, seq(0, 1.5, .5))), mean)
Output:
grp value
1 (0,0.5] 47.6600
2 (0.5,1] 83.2375
3 (1,1.5] 112.0500
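When the upper limit is not known in advance (the "and so on" in the question), the breaks can be derived from the data instead of hard-coding 1.5, and the row count per bin can be reported next to the mean. The following is only a sketch under those assumptions and is not taken from either answer; width, upper, breaks, bins and the columns bin, n and mean are names chosen for illustration.

```r
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

width <- 0.5                                     # assumed bin width
upper <- ceiling(max(df$dist) / width) * width   # round the maximum up to a multiple of width
breaks <- seq(0, upper, by = width)

bins <- cut(df$dist, breaks)                     # right-closed bins, as in the answers

# table() and tapply() both follow the factor levels, so the columns line up
# even if a bin turns out to be empty (n = 0, mean = NA).
result <- data.frame(
  bin  = levels(bins),
  n    = as.vector(table(bins)),
  mean = as.vector(tapply(df$value, bins, mean))
)
result
```

For widths that are not exactly representable in binary floating point (0.1, for example), a value sitting on a computed edge may land in the neighbouring bin, so it is worth checking the n column against expectations.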