How to generate a random integers vector that sums up to a given number in R

Question

I have a dataframe my_df and I would like to add an additional column, my_new_column, and populate it with random integer numbers that add up to a given sum.
Here is some reproducible code:

library(dplyr)
library(magrittr)
my_df <- as.data.frame(matrix(nrow = 10, ncol = 2))
colnames(my_df) <- c("Cat", "MarksA")
my_df$Cat <- LETTERS[1:nrow(my_df)]
my_df$MarksA <- sample(1:100, size = nrow(my_df))

In Tidyverse style, I tried the following:

my_df %<>% mutate(my_new_column = sample(n()))

However, this gives me a column which sums up to an arbitrary number. How can I tweak my code to achieve this task?

Answer 1

Score: 3

Since you didn't specify a particular distribution, would this work? I pulled my answer mostly from this post, which has more details and more options: https://stackoverflow.com/questions/52559455/generate-non-negative-or-positive-random-integers-that-sum-to-a-fixed-value

# Draw 10 non-negative integer counts that sum to 1000 (equal probabilities)
my_df %>%
  mutate(int_sample = rmultinom(n = 1, size = 1000, prob = rep.int(1 / 10, 10)))
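
The call above hard-codes 10 rows and a total of 1000, and rmultinom() returns a one-column matrix rather than a plain vector. A minimal sketch of a more general version follows, assuming a hypothetical helper random_ints_summing_to() (a name made up for illustration, not part of dplyr or base R) and a target total of your choosing:

```r
library(dplyr)

# Hypothetical helper (an assumption for illustration, not from the answer):
# draw `n` non-negative integers that sum to `total`, every slot equally likely.
random_ints_summing_to <- function(n, total) {
  # rmultinom() returns an n x 1 matrix; as.vector() flattens it to a vector
  as.vector(rmultinom(n = 1, size = total, prob = rep(1 / n, n)))
}

my_df <- data.frame(Cat = LETTERS[1:10], MarksA = sample(1:100, size = 10))

# n() supplies the number of rows, so nothing is hard-coded
my_df <- my_df %>%
  mutate(my_new_column = random_ints_summing_to(n(), total = 1000))

sum(my_df$my_new_column)  # equals the requested total
```

Note that multinomial counts can be zero, so this fits only when non-negative (rather than strictly positive) integers are acceptable.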

Answer 2

Score: 0

Since the sum of all integers from 1 to n is equal to n(n + 1)/2, you may try something like this:

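nb <- nrow(my_df)
# sample(nb) is a random permutation of 1:nb, so the column always sums to
# nb * (nb + 1) / 2 (55 for 10 rows); sample(nb * (nb + 1) / 2) would return
# 55 values and fail in mutate() because the length does not match the row count
my_df %<>% mutate(my_new_column = sample(nb))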
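
The n(n + 1)/2 identity covers only one particular total (55 for 10 rows). If a different total is needed and every value must be at least 1, one alternative is the cut-point ("stars and bars") approach. This is a sketch under stated assumptions, not the answerer's method; random_positive_ints() is a hypothetical helper name:

```r
# Hypothetical helper (an assumption for illustration, not from the answer):
# draw `n` strictly positive integers that sum to `total`.
random_positive_ints <- function(n, total) {
  stopifnot(n >= 1, total >= n)
  # Pick n - 1 distinct cut points in 1..(total - 1); the gaps between
  # consecutive cut points (with 0 and total as end points) are the values.
  cuts <- sort(sample.int(total - 1, size = n - 1))
  as.integer(diff(c(0, cuts, total)))
}

x <- random_positive_ints(10, 100)
sum(x)  # equals the requested total
```

Because the cut points are distinct, every gap is at least 1, and the result can be assigned directly to a column, for example my_df$my_new_column <- random_positive_ints(nrow(my_df), 100).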

huangapple
  • Published on February 10, 2023, 15:04:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75407886.html