Populating a matrix with 1s row-wise

Question

I know how to randomly populate a matrix with 1s using the rbinom() function. However, I need to populate a matrix such that each row has exactly one 1 and every column has at least one 1 (more than one 1 per column is fine).

My empty matrix looks like this:

Chickens <- matrix(0, 600, 20)

So I need a total of 600 '1s' in the matrix. Thanks for your help!
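
The two requirements can be written down as a small check, which is handy for testing any of the approaches in the answers. This is only a sketch: the helper name is_valid_fill is made up for illustration and does not appear in the original question.

```r
# Hypothetical helper (not from the original post): TRUE if the matrix holds
# only 0s and 1s, every row has exactly one 1, and every column has at least one.
is_valid_fill <- function(m) {
  all(m %in% c(0, 1)) &&
    all(rowSums(m) == 1) &&
    all(colSums(m) >= 1)
}

empty <- matrix(0, 600, 20)
is_valid_fill(empty)  # FALSE: no row contains a 1 yet
```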

Answer 1

Score: 3

If the number of rows is that much bigger than the number of columns, you can use sample() to generate random numbers between 1 and 20 and use them as the column position of the 1 in each row.

set.seed(1)
Chickens[cbind(1:600, sample(1:20, 600, replace = TRUE))] <- 1
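
Why "much bigger" matters: nothing in this approach forces every column to receive a 1, it is just overwhelmingly likely when there are many more rows than columns. As a rough sketch, assuming each row picks its column independently and uniformly (which is what sample() with replace = TRUE does), a given column stays empty with probability (1 - 1/nc)^nr, and multiplying by the number of columns gives an upper bound (the union bound) on the chance that any column is empty. The variable names below are illustrative.

```r
nr <- 600
nc <- 20

# Probability that one particular column receives no 1 at all
p_one_empty <- (1 - 1 / nc)^nr

# Union bound: the chance that at least one column is empty is at most this
p_any_empty_bound <- nc * p_one_empty
p_any_empty_bound      # below 1e-12 for 600 rows and 20 columns

# The same bound with only 25 rows exceeds 1, so it guarantees nothing there
nc * (1 - 1 / nc)^25
```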

If the number of rows is only slightly larger than the number of columns, you can still start with sample(), but you would then need to remove duplicated column indices at some point, since duplicates can leave other columns without a 1.
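
One possible way to handle that case, sketched here under stated assumptions rather than taken from the answer: give every column one guaranteed row first, draw the remaining rows freely, and then shuffle which row gets which column. This guarantees coverage, but it does not sample uniformly from all valid matrices. The values of nr and nc are illustrative.

```r
set.seed(1)
nr <- 25   # illustrative: only slightly more rows than columns
nc <- 20
stopifnot(nr >= nc, nc >= 2)

# One column index per row: every column appears at least once, the remaining
# indices are random, and the outer sample() shuffles the assignment to rows.
cols <- sample(c(seq_len(nc), sample(nc, nr - nc, replace = TRUE)))

Chickens <- matrix(0, nr, nc)
Chickens[cbind(seq_len(nr), cols)] <- 1

all(rowSums(Chickens) == 1)   # TRUE by construction
all(colSums(Chickens) >= 1)   # TRUE by construction
```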

Answer 2

Score: 3

There is no need to create the results matrix beforehand. Here is a way with sample() and replicate().

From the help page of replicate:

> replicate is a wrapper for the common use of sapply for repeated evaluation of an expression (which will usually involve random number generation).

See also Allan Cameron's comment and my answer to it.

nr <- 600L
nc <- 20L
v <- c(1L, rep(0L, nc - 1L))
Chickens <- t(replicate(nr, sample(v)))

rowSums(Chickens)
#>   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#>  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#>  [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [149] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [186] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [223] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [260] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [297] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [334] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [371] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [408] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [445] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [482] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [519] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [556] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [593] 1 1 1 1 1 1 1 1

<sup>Created on 2023-07-13 with reprex v2.0.2</sup>


Edit

Answer to Paul's comment: repeat the code above R times and compute the mean values of the column sums of the Chickens matrix. Plots are added as well, since a picture is worth a thousand words.

R <- 1e3
sim <- replicate(R, {
  Chickens <- t(replicate(nr, sample(v)))
  colSums(Chickens)
})
str(sim)
#>  num [1:20, 1:1000] 28 33 41 35 37 30 26 26 35 26 ...

# averaged across the R replicates, the mean count per column is close
# to 30 (600 / 20): the 1s are spread approximately uniformly over the columns
rowMeans(sim)
#>  [1] 29.956 30.269 30.089 30.384 29.918 29.856 29.931 29.950 30.104 30.149
#> [11] 29.829 29.710 30.049 29.968 30.332 30.077 29.871 29.732 29.890 29.936
barplot(rowMeans(sim))

[Figure: bar plot of rowMeans(sim), the mean column sums across replicates]


# grand stats
mean(sim)
#> [1] 30
hist(sim)
abline(v = mean(sim), col = "blue", lty = "dashed")

[Figure: histogram of sim, with the overall mean marked by a dashed blue line]

<sup>Created on 2023-07-13 with reprex v2.0.2</sup>
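
The simulated values can be compared with what theory predicts. As a sketch, assuming each row places its 1 uniformly at random, each column sum follows a Binomial(nr, 1/nc) distribution, so its mean is nr / nc = 30 and its standard deviation is sqrt(nr * (1/nc) * (1 - 1/nc)), about 5.34. The block below re-creates a smaller simulation with rmultinom() instead of the replicate()/sample() code, purely so that it runs quickly on its own; the names theo_mean and theo_sd are illustrative.

```r
set.seed(123)
nr <- 600L
nc <- 20L
R  <- 200L   # fewer replicates than in the answer, just to keep this quick

# Each column of `sim` holds the 20 column sums of one simulated matrix:
# drawing nr one-hot rows is equivalent to one multinomial draw of size nr.
sim <- rmultinom(R, size = nr, prob = rep(1 / nc, nc))

theo_mean <- nr / nc
theo_sd   <- sqrt(nr * (1 / nc) * (1 - 1 / nc))

c(simulated = mean(sim), theoretical = theo_mean)
c(simulated = sd(sim),   theoretical = theo_sd)
```

The grand mean matches exactly because every simulated matrix contains exactly nr ones; only the spread around it is random.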

Answer 3

Score: 3

This is a multinomial distribution with size = 1, 20 groups and 600 observations.

In R you could do:

Chickens <- t(rmultinom(600, 1, rep(1, 20)))
table(rowSums(Chickens))
#>   1 
#> 600 
dim(Chickens)
#> [1] 600  20

Notice that we transpose the result with t(), since rmultinom() returns the observations as columns.


Another way is to simply sample 600 values from 1 to 20 and convert them to a one-hot encoding by picking the matching rows of the identity matrix:

diag(20)[sample(20, 600, TRUE), ]

Another way to convert to one-hot vectors is to use model.matrix().
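
A minimal sketch of the model.matrix() route, which the answer mentions without code. The variable names idx, f and onehot are illustrative. The factor levels are set explicitly so that all 20 columns are created even if some value happens not to be drawn, and the - 1 in the formula drops the intercept so that every level gets its own indicator column.

```r
set.seed(1)
nr <- 600
nc <- 20

# Column index chosen for each row
idx <- sample(nc, nr, replace = TRUE)

# One indicator column per factor level; no intercept column
f <- factor(idx, levels = seq_len(nc))
onehot <- model.matrix(~ f - 1)

dim(onehot)                  # 600 rows, 20 columns
all(rowSums(onehot) == 1)    # TRUE: exactly one 1 per row
```

The result carries column names and extra attributes (assign, contrasts), so the diag() indexing shown earlier may be simpler when only a plain 0/1 matrix is needed.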

huangapple
  • Published on 2023-07-13 14:18:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76676438.html