在R中随机重新排列矩阵的补丁。

huangapple go评论99阅读模式
英文:

Randomly rearranging patches of a matrix in R

问题

我正在尝试随机重新排列矩阵中的补丁。这需要针对较大的矩阵和小补丁进行操作,因此使用for循环似乎不是一种实现的选项。假设我有一个数据矩阵如下:

  1. data <- matrix(1:16, nrow = 4)

输出如下所示:

  1. [,1] [,2] [,3] [,4]
  2. [1,] 1 5 9 13
  3. [2,] 2 6 10 14
  4. [3,] 3 7 11 15
  5. [4,] 4 8 12 16

我现在想选择2x2的补丁并随机重新排列它们,以便输出可能看起来像这样:

  1. [,1] [,2] [,3] [,4]
  2. [1,] 11 15 3 7
  3. [2,] 12 16 4 8
  4. [3,] 9 13 1 5
  5. [4,] 10 14 2 6

到目前为止,我通过创建一个包含按正确顺序的索引的数字的矩阵,并对其进行重新排列来实现这一点,但是对于每个补丁在新的空矩阵中重新分配在存在数万个补丁时会变得非常耗时。

英文:

I am trying to randomly rearrange patches in a matrix. This will need to be done for larger matrices and small patches, so a for loop does not seem to be an option to achieve this. Let's say I have a data matrix like this:

data &lt;- matrix(1:16, nrow = 4)

The output looks like

  1. [,1] [,2] [,3] [,4]
  2. [1,] 1 5 9 13
  3. [2,] 2 6 10 14
  4. [3,] 3 7 11 15
  5. [4,] 4 8 12 16

I now want to select 2x2 patches and rearrange them randomly, so that the output may look like this

  1. [,1] [,2] [,3] [,4]
  2. [1,] 11 15 3 7
  3. [2,] 12 16 4 8
  4. [3,] 9 13 1 5
  5. [4,] 10 14 2 6

I achieved this so far by creating a matrix containing numbers corresponding to the index in correct order, and rearranged, but the reassignment to a new, empty matrix in a for loop for each patch gets quite time consuming when there are tens of thousands of patches.

答案1

得分: 4

以下是翻译好的代码部分:

  1. 这是一种你可以使用的方法。我假设子矩阵块可以均匀地划分成矩阵。
  2. data <- matrix(1:16, nrow = 4)
  3. # 矩阵维度
  4. nr <- nrow(data)
  5. nc <- ncol(data)
  6. # 块的宽度和高度
  7. width <- 2L
  8. height <- 2L
  9. # 矩阵维度与块维度的比率
  10. rr <- nr / height
  11. cr <- nc / width
  12. m <- matrix(1L, height, width)
  13. # 创建块索引
  14. (blocks <- matrix(seq(rr * cr), rr, cr) %x% m)
  15. [,1] [,2] [,3] [,4]
  16. [1,] 1 1 3 3
  17. [2,] 1 1 3 3
  18. [3,] 2 2 4 4
  19. [4,] 2 2 4 4
  20. # 创建随机块索引
  21. set.seed(5)
  22. (r_blocks <- matrix(sample(rr * cr), rr, cr) %x% m)
  23. [,1] [,2] [,3] [,4]
  24. [1,] 2 2 1 1
  25. [2,] 2 2 1 1
  26. [3,] 3 3 4 4
  27. [4,] 3 3 4 4
  28. # 通过匹配随机块位置与原始位置来创建新矩阵
  29. # 并索引到原始矩阵
  30. matrix(data[ave(r_blocks, r_blocks, FUN = \(v) which(blocks == v[1]))], dim(data))
  31. [,1] [,2] [,3] [,4]
  32. [1,] 3 7 1 5
  33. [2,] 4 8 2 6
  34. [3,] 9 13 11 15
  35. [4,] 10 14 12 16
英文:

Here is one approach you might use. I'm assuming that the submatrix blocks divide evenly into the matrix.

  1. data &lt;- matrix(1:16, nrow = 4)
  2. # Matrix dimensions
  3. nr &lt;- nrow(data)
  4. nc &lt;- ncol(data)
  5. # Block width and height
  6. width &lt;- 2L
  7. height &lt;- 2L
  8. # Matrix dimension by block dimension ratios
  9. rr &lt;- nr / height
  10. cr &lt;- nc / width
  11. m &lt;- matrix(1L, height, width)
  12. # Create block indices
  13. (blocks &lt;- matrix(seq(rr * cr), rr, cr) %x% m)
  14. [,1] [,2] [,3] [,4]
  15. [1,] 1 1 3 3
  16. [2,] 1 1 3 3
  17. [3,] 2 2 4 4
  18. [4,] 2 2 4 4
  19. # Create random block indices
  20. set.seed(5)
  21. (r_blocks &lt;- matrix(sample(rr * cr), rr, cr) %x% m)
  22. [,1] [,2] [,3] [,4]
  23. [1,] 2 2 1 1
  24. [2,] 2 2 1 1
  25. [3,] 3 3 4 4
  26. [4,] 3 3 4 4
  27. # Create new matrix by matching the random block positions against the original positions
  28. # and index against orginal matrix
  29. matrix(data[ave(r_blocks, r_blocks, FUN = \(v) which(blocks == v[1]))], dim(data))
  30. [,1] [,2] [,3] [,4]
  31. [1,] 3 7 1 5
  32. [2,] 4 8 2 6
  33. [3,] 9 13 11 15
  34. [4,] 10 14 12 16

答案2

得分: 3

以下是使用线性索引和嵌套的sequence调用的方法。它可以几乎立即重新排列一个700x700的矩阵。

  1. set.seed(211972494)
  2. rearrange <- function(x, n = 2L) {
  3. d <- dim(x)
  4. n2 <- n^2
  5. x[i[,sample(length(x)/n2)]] <- x[
  6. i <- matrix(
  7. sequence(
  8. rep(n, length(x)/n),
  9. sequence(
  10. rep(n, length(x)/n2),
  11. sequence(
  12. rep(d[1]/n, d[2]/n),
  13. seq(1, length(x), n*d[1]),
  14. n
  15. ), d[1]
  16. )
  17. ), n2
  18. )
  19. ]
  20. x
  21. }

测试:

  1. rearrange(matrix(1:16, 4))
  2. #> [,1] [,2] [,3] [,4]
  3. #> [1,] 11 15 3 7
  4. #> [2,] 12 16 4 8
  5. #> [3,] 1 5 9 13
  6. #> [4,] 2 6 10 14
  7. rearrange(matrix(1:54, 6), 3L)
  8. #> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
  9. #> [1,] 4 10 16 1 7 13 19 25 31
  10. #> [2,] 5 11 17 2 8 14 20 26 32
  11. #> [3,] 6 12 18 3 9 15 21 27 33
  12. #> [4,] 22 28 34 37 43 49 40 46 52
  13. #> [5,] 23 29 35 38 44 50 41 47 53
  14. #> [6,] 24 30 36 39 45 51 42 48 54

计时较大的矩阵:

  1. system.time(rearrange(matrix(1:700^2, 700)))
  2. #> user system elapsed
  3. #> 0.02 0.00 0.02
  4. system.time(rearrange(matrix(1:700^2, 700), 7L))
  5. #> user system elapsed
  6. #> 0 0 0

另一个选项使用kronecker。这个方法可能稍微慢一些,但逻辑可能更容易理解。

  1. rearrange2 <- function(x, n = 2L) {
  2. x[i[,sample(length(x)/n^2)]] <- x[
  3. i <- matrix(
  4. order(kronecker(array(seq_along(x), dim(x)/n), matrix(1L, n, n))),
  5. n^2
  6. )
  7. ]
  8. x
  9. }
  10. x <- matrix(1:700^2, 700)
  11. s <- rep(1:200, each = 2)
  12. i <- 0L
  13. microbenchmark::microbenchmark(
  14. rearrange = rearrange(x),
  15. rearrange2 = rearrange2(x),
  16. check = "equal",
  17. setup = set.seed(s[i <- i + 1L])
  18. )
  19. #> Unit: milliseconds
  20. #> expr min lq mean median uq max neval
  21. #> rearrange 12.4906 12.88795 14.03972 13.13315 13.81755 20.0663 100
  22. #> rearrange2 33.5341 33.93110 36.86671 34.79870 38.44955 74.1596 100
英文:

Here is an approach using linear indexing with nested sequence calls. It can rearrange a 700x700 matrix almost instantly.

  1. set.seed(211972494)
  2. rearrange &lt;- function(x, n = 2L) {
  3. d &lt;- dim(x)
  4. n2 &lt;- n^2
  5. x[i[,sample(length(x)/n2)]] &lt;- x[
  6. i &lt;- matrix(
  7. sequence(
  8. rep(n, length(x)/n),
  9. sequence(
  10. rep(n, length(x)/n2),
  11. sequence(
  12. rep(d[1]/n, d[2]/n),
  13. seq(1, length(x), n*d[1]),
  14. n
  15. ), d[1]
  16. )
  17. ), n2
  18. )
  19. ]
  20. x
  21. }

Testing:

  1. rearrange(matrix(1:16, 4))
  2. #&gt; [,1] [,2] [,3] [,4]
  3. #&gt; [1,] 11 15 3 7
  4. #&gt; [2,] 12 16 4 8
  5. #&gt; [3,] 1 5 9 13
  6. #&gt; [4,] 2 6 10 14
  7. rearrange(matrix(1:54, 6), 3L)
  8. #&gt; [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
  9. #&gt; [1,] 4 10 16 1 7 13 19 25 31
  10. #&gt; [2,] 5 11 17 2 8 14 20 26 32
  11. #&gt; [3,] 6 12 18 3 9 15 21 27 33
  12. #&gt; [4,] 22 28 34 37 43 49 40 46 52
  13. #&gt; [5,] 23 29 35 38 44 50 41 47 53
  14. #&gt; [6,] 24 30 36 39 45 51 42 48 54

Timing a larger matrix

  1. system.time(rearrange(matrix(1:700^2, 700)))
  2. #&gt; user system elapsed
  3. #&gt; 0.02 0.00 0.02
  4. system.time(rearrange(matrix(1:700^2, 700), 7L))
  5. #&gt; user system elapsed
  6. #&gt; 0 0 0

Another option using kronecker. It's a little slower, but the logic may be easier to follow.

  1. rearrange2 &lt;- function(x, n = 2L) {
  2. x[i[,sample(length(x)/n^2)]] &lt;- x[
  3. i &lt;- matrix(
  4. order(kronecker(array(seq_along(x), dim(x)/n), matrix(1L, n, n))),
  5. n^2
  6. )
  7. ]
  8. x
  9. }
  10. x &lt;- matrix(1:700^2, 700)
  11. s &lt;- rep(1:200, each = 2)
  12. i &lt;- 0L
  13. microbenchmark::microbenchmark(
  14. rearrange = rearrange(x),
  15. rearrange2 = rearrange2(x),
  16. check = &quot;equal&quot;,
  17. setup = set.seed(s[i &lt;- i + 1L])
  18. )
  19. #&gt; Unit: milliseconds
  20. #&gt; expr min lq mean median uq max neval
  21. #&gt; rearrange 12.4906 12.88795 14.03972 13.13315 13.81755 20.0663 100
  22. #&gt; rearrange2 33.5341 33.93110 36.86671 34.79870 38.44955 74.1596 100

答案3

得分: 1

以下是您提供的代码的翻译:

  1. # 补丁维度
  2. d1 <- 2
  3. d2 <- 2
  4. # 块矩阵的掩码
  5. msk <- kronecker(
  6. matrix(seq.int(length(data) / (d1 * d2)), nrow(data) / d1),
  7. matrix(1, d1, d2)
  8. )
  9. # 洗牌补丁
  10. l <- sample(unname(tapply(data, msk, \(x) matrix(x, d1))))
  11. # 重建矩阵
  12. do.call(
  13. cbind,
  14. tapply(
  15. l,
  16. ceiling(seq_along(l) / (nrow(data) / d1)),
  17. \(x) do.call(rbind, x)
  18. )
  19. )

它可以产生以下结果:

  1. [,1] [,2] [,3] [,4]
  2. [1,] 9 13 3 7
  3. [2,] 10 14 4 8
  4. [3,] 11 15 1 5
  5. [4,] 12 16 2 6
英文:

You can try the code below

  1. # patch dimensions
  2. d1 &lt;- 2
  3. d2 &lt;- 2
  4. # mask for block matrices
  5. msk &lt;- kronecker(
  6. matrix(seq.int(length(data) / (d1 * d2)), nrow(data) / d1),
  7. matrix(1, d1, d2)
  8. )
  9. # shuffle patches
  10. l &lt;- sample(unname(tapply(data, msk, \(x) matrix(x, d1))))
  11. # reconstruct the matrix
  12. do.call(
  13. cbind,
  14. tapply(
  15. l,
  16. ceiling(seq_along(l) / (nrow(data) / d1)),
  17. \(x) do.call(rbind, x)
  18. )
  19. )

and it could produce

  1. [,1] [,2] [,3] [,4]
  2. [1,] 9 13 3 7
  3. [2,] 10 14 4 8
  4. [3,] 11 15 1 5
  5. [4,] 12 16 2 6

huangapple
  • 本文由 发表于 2023年6月1日 10:15:34
  • 转载请务必保留本文链接:https://go.coder-hub.com/76378276.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定