Element-wise mean of every 12th matrix in array, repeated one sequence along 12 times, without for-loops
Question
I have an array of dimensions [360, 180, 396]. These are longitude, latitude, and month-year for 33 years of monthly data. The elements are percentages for that lat/lon.
From this I want to make a summary array to use in later analyses, without falling back on for loops. I want the mean of each month's data across all 33 years, and then the annual average across all years.
This is the blank summary array I made to hold the data.
mca <- array(data = NA,
             dim = c(360, 180, 13),
             dimnames = list(lon,
                             lat,
                             c(month.abb, "Ann")))
Here are smaller test input and output arrays for this example:
# input
set.seed(42)
smallin <- array(data = rnorm(n = 600, mean = 60, sd = 20),
                 dim = c(5, 5, 24))
# output to fill
smallout <- array(data = NA,
                  dim = c(5, 5, 13),
                  dimnames = list(c("1", "2", "3", "4", "5"),
                                  c("-89.5", "-88.5", "-87.5", "-86.5", "-85.5"),
                                  c(month.abb, "Ann")))
Based on the second answer to this question I tried
jan <- apply(ca, c(seq(from = 1, to = 385, by = 12)), mean)
#also
ind_jan <- c(seq(from = 1, to = 385, by = 12))
jan <- apply(ca, ind_jan, mean)
which I think is equivalent to
jan <- apply(smallin, c(seq(from = 1, to = 13, by = 12)), mean)
thinking that for MARGIN I needed to give the indices along the 3rd dimension that I wanted averaged, but I received the error:
Error in apply(ca, c(seq(from = 1, to = 385, by = 12)), mean) :
'MARGIN' does not match dim(X)
I went back to the question above and realised that margin = 1:2 must be selecting all of each matrix (dimensions 1 and 2). Using that, I can get the mean over all the matrices, which ought to be the annual average of the percentages for my output array [,,13],
smallout[,,13] <- apply(smallin, 1:2, mean)
but I still don't know how to get it to average only every 12th matrix, starting from 1, then starting from 2, then starting from 3...
I have read the apply documentation, but found it impenetrable and unhelpful for this case. All of the suggested questions that came up seem to be in Python (or another language).
I am also not sure whether I can do this all in one go, or whether I have to go matrix by matrix, passing each result to the output array by indexing as above.
The closest I can think of is something like
ind_jan <- c(seq(from = 1, to = 13, by = 12))
smallout[,,1] <- apply(smallin[,,c(ind_jan)], 1:2, mean)
repeated for each of the output matrices in the array. Is there a less manual/more efficient/better way?
Answer 1
Score: 3
Consider this simplified array A (see data below).
str(A)
# int [1:2, 1:3, 1:6] 1 1 1 1 1 1 2 2 2 2 ...
We can use sapply to "loop" over the years, with the option simplify='array' to get back an array of the annual averages,
yrs <- seq_len(dim(A)[3]/nm)
sapply(yrs, \(i) apply(A[, , 1:nm + (i - 1)*nm], 1:2, mean), simplify='array')  ## slices of year i only
# , , 1
#
# [,1] [,2] [,3]
# [1,] 2 2 2
# [2,] 2 2 2
#
# , , 2
#
# [,1] [,2] [,3]
# [1,] 2 2 2
# [2,] 2 2 2
and, accordingly, of the monthly averages across the years:
mnt <- seq_len(nm)
sapply(mnt, \(i) apply(A[, , seq(i, dim(A)[3], by=nm)], 1:2, mean), simplify='array')  ## every nm-th slice from i
# , , 1
#
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 1 1
#
# , , 2
#
# [,1] [,2] [,3]
# [1,] 2 2 2
# [2,] 2 2 2
#
# , , 3
#
# [,1] [,2] [,3]
# [1,] 3 3 3
# [2,] 3 3 3
Data:
nm <- 3 ## no. "months" ## actually 12 months in real years
ny <- 2 ## no. "years" ## in your case 33
A <- array(rep(1:nm, each=nm*ny), c(2, 3, nm*ny)) ## think this is your `ca`
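Because every "year" of A holds the same values, A cannot show whether the slices are being picked correctly. As a rough check, the same sapply pattern can be run on a hypothetical array B (not part of the answer) in which slice k is filled with the value k. The sketch assumes year i occupies slices (i - 1)*nm + 1:nm and month i occupies every nm-th slice starting at i.

```r
nm <- 3  # "months" per year (12 in the real data)
ny <- 2  # number of "years" (33 in the real data)

# Hypothetical test array: slice k is filled with the value k
B <- array(rep(seq_len(nm * ny), each = 2 * 3), dim = c(2, 3, nm * ny))

# Mean over the nm consecutive slices that belong to year i
annual <- sapply(seq_len(ny),
                 function(i) apply(B[, , (i - 1) * nm + seq_len(nm)], 1:2, mean),
                 simplify = "array")

# Mean over every nm-th slice, starting at month i
monthly <- sapply(seq_len(nm),
                  function(i) apply(B[, , seq(i, nm * ny, by = nm)], 1:2, mean),
                  simplify = "array")

annual[1, 1, ]   # 2 5
monthly[1, 1, ]  # 2.5 3.5 4.5
```

Year 1 averages slices 1 to 3 and year 2 averages slices 4 to 6, while month 1 averages slices 1 and 4, and so on.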
Answer 2
Score: 2
You can add another dimension to the array by splitting the last dimension, which combines month and year, into separate dimensions for month and year.
i <- dim(smallin)
dim(smallin) <- c(i[1:2], 12L, i[3]/12L)
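The reshape relies on R storing arrays in column-major order, so slice k of the original third dimension should land at month m, year y with k = m + 12 * (y - 1). A small sketch to confirm this on a copy of the test data (the names x, x4, m, y and k are illustrative only):

```r
set.seed(42)
x <- array(rnorm(600, mean = 60, sd = 20), dim = c(5, 5, 24))  # same shape as smallin

x4 <- x
dim(x4) <- c(5, 5, 12, 2)  # lon, lat, month, year

m <- 3                     # March
y <- 2                     # second year
k <- m + 12 * (y - 1)      # position along the original third dimension (15)

identical(x4[, , m, y], x[, , k])  # TRUE
```

No data are moved or copied by changing dim; only the way the same vector is indexed changes.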
With this you can get the averages for each month over all years with:
apply(smallin, 1:3, mean)
#, , 1
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 73.66338 58.35988 72.33907 62.19628 52.08766
#[2,] 61.95544 79.93891 75.27725 49.30859 44.07820
#[3,] 64.02119 68.98285 35.76780 35.06961 58.79089
#[4,] 73.67935 67.72028 50.90479 23.22819 72.14434
#[5,] 62.57796 59.03798 64.53486 83.65987 97.04576
#
#...
#
#, , 12
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 83.55254 68.77645 48.88358 52.99573 56.82992
#[2,] 83.47723 39.02472 95.08051 65.97988 54.00097
#[3,] 47.59936 36.93396 38.35189 57.86126 83.99976
#[4,] 73.00906 53.71818 36.93229 80.85843 39.27094
#[5,] 81.67441 64.50031 62.71359 56.27758 54.01388
The annual average for single years:
apply(smallin, c(1,2,4), mean)
#, , 1
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 60.77253 60.15417 54.71206 67.31820 62.05012
#[2,] 56.60298 59.14604 73.17469 57.66912 53.36540
#[3,] 56.52924 56.31096 58.73874 67.47850 59.06819
#[4,] 67.75999 56.45636 49.43743 55.14660 65.46497
#[5,] 60.28056 62.17656 55.08681 54.15788 60.05240
#
#, , 2
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 60.55035 65.21223 59.92112 59.75500 69.77088
#[2,] 60.89782 54.59722 55.17699 59.06815 60.03906
#[3,] 58.85733 54.02893 47.31326 63.10434 59.56569
#[4,] 60.96362 61.82648 55.45109 54.50272 45.21176
#[5,] 59.94452 54.31497 60.64839 64.65777 80.86525
The annual average over all years:
apply(smallin, 1:2, mean)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 60.66144 62.68320 57.31659 63.53660 65.91050
#[2,] 58.75040 56.87163 64.17584 58.36864 56.70223
#[3,] 57.69329 55.16994 53.02600 65.29142 59.31694
#[4,] 64.36180 59.14142 52.44426 54.82466 55.33836
#[5,] 60.11254 58.24577 57.86760 59.40782 70.45883
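To get from these results to the 13-slice summary array in the question, one possible sketch is below. It swaps apply(..., mean) for rowMeans() with its dims argument, which is an alternative to the calls above rather than part of the answer, and it works on a copy x so that smallin keeps its original shape.

```r
set.seed(42)
smallin <- array(rnorm(600, mean = 60, sd = 20), dim = c(5, 5, 24))

# Reshape a copy to lon x lat x month x year
x <- smallin
dim(x) <- c(dim(smallin)[1:2], 12L, dim(smallin)[3] / 12L)

smallout <- array(NA_real_, dim = c(5, 5, 13),
                  dimnames = list(as.character(1:5),
                                  c("-89.5", "-88.5", "-87.5", "-86.5", "-85.5"),
                                  c(month.abb, "Ann")))

smallout[, , 1:12]  <- rowMeans(x, dims = 3)  # average over the year dimension
smallout[, , "Ann"] <- rowMeans(x, dims = 2)  # average over months and years
```

rowMeans(x, dims = 3) keeps the first three dimensions and averages over the rest, so it returns a 5 x 5 x 12 array that can be assigned to the monthly slices in one step.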
Answer 3
Score: 1
I am sure there is a better way, and if anyone has it I am still keen to learn, but the code below turned out to work once I figured out how to do the indexing so that only each month's data is selected before taking the mean with apply.
mca[,,1] <- apply(ca[,,c(seq(from = 1, to = 396, by = 12))], 1:2, mean)
mca[,,2] <- apply(ca[,,c(seq(from = 2, to = 396, by = 12))], 1:2, mean)
mca[,,3] <- apply(ca[,,c(seq(from = 3, to = 396, by = 12))], 1:2, mean)
mca[,,4] <- apply(ca[,,c(seq(from = 4, to = 396, by = 12))], 1:2, mean)
mca[,,5] <- apply(ca[,,c(seq(from = 5, to = 396, by = 12))], 1:2, mean)
mca[,,6] <- apply(ca[,,c(seq(from = 6, to = 396, by = 12))], 1:2, mean)
mca[,,7] <- apply(ca[,,c(seq(from = 7, to = 396, by = 12))], 1:2, mean)
mca[,,8] <- apply(ca[,,c(seq(from = 8, to = 396, by = 12))], 1:2, mean)
mca[,,9] <- apply(ca[,,c(seq(from = 9, to = 396, by = 12))], 1:2, mean)
mca[,,10] <- apply(ca[,,c(seq(from = 10, to = 396, by = 12))], 1:2, mean)
mca[,,11] <- apply(ca[,,c(seq(from = 11, to = 396, by = 12))], 1:2, mean)
mca[,,12] <- apply(ca[,,c(seq(from = 12, to = 396, by = 12))], 1:2, mean)
mca[,,13] <- apply(ca, 1:2, mean)
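The twelve monthly assignments differ only in the starting index, so they could be generated rather than typed out. A minimal sketch, using a small random stand-in for ca (the real array is 360 x 180 x 396) and vapply to collect the twelve matrices into one array:

```r
# Hypothetical stand-in for `ca`: 4 x 3 grid, 3 years of monthly data
set.seed(1)
ca <- array(runif(4 * 3 * 36, min = 0, max = 100), dim = c(4, 3, 36))
mca <- array(NA_real_, dim = c(4, 3, 13),
             dimnames = list(NULL, NULL, c(month.abb, "Ann")))

n <- dim(ca)[3]
# For month m, average slices m, m + 12, m + 24, ...
mca[, , 1:12] <- vapply(1:12,
                        function(m) apply(ca[, , seq(m, n, by = 12)], 1:2, mean),
                        FUN.VALUE = matrix(0, nrow(ca), ncol(ca)))
mca[, , 13] <- apply(ca, 1:2, mean)
```

vapply returns an array of dimension c(dim(FUN.VALUE), 12) when FUN.VALUE is a matrix, which is why the result can be assigned to the first twelve slices directly.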