February 10, 2023 15:04:17

How to generate a vector of random integers that sums to a given number in R

Question

I have a dataframe my_df and I would like to add an additional column, my_new_column, and populate it with random integer numbers that add up to a given sum.
Here is some reproducible code:

library(dplyr)
library(magrittr)
my_df <- as.data.frame(matrix(nrow = 10, ncol = 2))
colnames(my_df) <- c("Cat", "MarksA")
my_df$Cat <- LETTERS[1:nrow(my_df)]
my_df$MarksA <- sample(1:100, size = nrow(my_df))

In Tidyverse style, I tried the following:

my_df %<>% mutate(my_new_column=sample(n()))

However, this gives me a column which sums up to an arbitrary number. How can I tweak my code to achieve this task?

Answer 1

Score: 3

Since you didn't specify a particular distribution, would this work? rmultinom() draws non-negative integers (zeros are possible) that sum to size. I pulled my answer mostly from this post, which has more details and more options: https://stackoverflow.com/questions/52559455/generate-non-negative-or-positive-random-integers-that-sum-to-a-fixed-value

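my_df %>%
  # one multinomial draw: n() non-negative integers that sum to size; as.vector() drops the matrix shape
  mutate(int_sample = as.vector(rmultinom(n = 1, size = 1000, prob = rep(1 / n(), n()))))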
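
As a minimal sketch of the same idea, the multinomial draw can be wrapped in a helper that takes the number of values and the target sum as arguments. The name random_ints_summing_to and its arguments are hypothetical and do not come from the linked post. The min_value shift, which sets a floor such as 1 for strictly positive integers, is an assumption about what might be wanted.

```r
# Hypothetical helper: n random integers, each >= min_value, that sum to total.
random_ints_summing_to <- function(n, total, min_value = 0) {
  remainder <- total - n * min_value
  stopifnot(n >= 1, remainder >= 0)
  # One multinomial draw with equal weights spreads `remainder` units over
  # n cells (rmultinom normalizes prob internally); adding min_value back
  # raises every cell to at least min_value.
  as.integer(rmultinom(n = 1, size = remainder, prob = rep(1, n))[, 1] + min_value)
}

# Base R usage on a data frame shaped like the one in the question.
my_df <- data.frame(Cat = LETTERS[1:10], MarksA = sample(1:100, size = 10))
my_df$my_new_column <- random_ints_summing_to(nrow(my_df), total = 1000, min_value = 1)

sum(my_df$my_new_column)  # 1000
```

Inside mutate(), the same helper could be called as mutate(my_new_column = random_ints_summing_to(n(), total = 1000)).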

Answer 2

Score: 0

Since the sum of all numbers between 1 and n is equal to n(n + 1)/2, you may try something like this:

nb <- nrow(my_df)
my_df %<>% mutate(my_new_column = sample(nb * (nb + 1)/2))
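
A quick check of that identity, as a sketch in base R with illustrative variable names: sample() called with a single number nb returns a random permutation of 1..nb, so its sum is the same on every run. The last line shows that passing nb * (nb + 1) / 2 to sample() instead yields that many values, not nb of them.

```r
nb <- 10
perm <- sample(nb)                  # random permutation of 1, 2, ..., nb
sum(perm)                           # 55 for nb = 10, on every run
sum(perm) == nb * (nb + 1) / 2      # TRUE

length(sample(nb * (nb + 1) / 2))   # 55 values, not nb
```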
