Assign value of 1 per each group in dataset
Question
In the dataset below, each ID is repeated three times, once for each of three alternatives, meaning that a respondent has to choose 1 of the 3 alternatives. The last column, chosen, contains a dummy variable that is 0 in every row by default.
Problem: I do not understand how to randomly assign a value of 1 (indicating that the first, second, or third alternative was chosen) within each group. For example, when ID == 1, the value 1 has to be randomly assigned to the first, second, or third row, and so on for the rest of the data.
Here's what I've tried:
for (i in seq(1, nrow(sim_data), 3)) {   # loop through each group of three rows
  chosen_index <- sample(i:(i+2), 1)     # generate a random index within the group
  sim_data$Chosen[chosen_index] <- 1     # assign 1 to the chosen index
}
Since my data has 300 rows and the loop index only goes up to 298, it did not work.
Dataset:
attributes <- expand.grid(
  Company = c("Metalac", "NikolaTeslaAirport", "Jedinstvo", "Energoprojekt"),
  Return_rate = c(0, 0.05, 0.10, 0.15),
  Dividend = c(0, 1.5, 3.0, 4.5, 6),
  Trend = c("Trend1", "Trend2", "Trend3")
)
# Generate simulated data for 100 respondents
set.seed(123) # for reproducibility
sim_data <- data.frame(
  ID = rep(1:100, each = 3),            # three alternatives per respondent
  alternative = rep(1:3, times = 100),  # alternative number
  attributes[sample(nrow(attributes), size = 100 * 3, replace = TRUE), ],
  chosen = 0
)
Answer 1
Score: 0
I'm sure there are more elegant ways, but one dplyr solution would be to use slice_sample() to randomly sample 1 row per group, assign it a value of 1, then join the sampled data frame back with the full data frame (and then I do some cleanup: sorting back to the original order and dropping the temp variable).
library(dplyr)

sim_data %>%
  slice_sample(n = 1, by = ID) %>%   # one random row per ID (needs dplyr 1.1.0 or later)
  mutate(temp = TRUE) %>%            # flag the sampled rows
  full_join(sim_data) %>%            # bring back the rows that were not sampled
  mutate(chosen = case_when(temp ~ 1, TRUE ~ chosen)) %>%
  arrange(ID, alternative) %>%
  select(-temp)

# or with older dplyr versions
sim_data %>%
  group_by(ID) %>%
  slice_sample(n = 1) %>%
  mutate(temp = TRUE) %>%
  full_join(sim_data) %>%
  mutate(chosen = case_when(temp ~ 1, TRUE ~ chosen)) %>%
  arrange(ID, alternative) %>%
  select(-temp)
Output
  ID alternative            Company Return_rate Dividend  Trend chosen
1  1           1          Jedinstvo        0.15      6.0 Trend2      1
2  1           2          Jedinstvo        0.15      3.0 Trend3      0
3  1           3          Jedinstvo        0.00      1.5 Trend3      0
4  2           1 NikolaTeslaAirport        0.15      0.0 Trend1      0
5  2           2          Jedinstvo        0.00      3.0 Trend3      1
6  2           3 NikolaTeslaAirport        0.10      0.0 Trend3      0
7  3           1 NikolaTeslaAirport        0.00      4.5 Trend1      1
8  3           2 NikolaTeslaAirport        0.05      3.0 Trend2      0
9  3           3          Jedinstvo        0.10      3.0 Trend1      0
# .....
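For comparison, here is a minimal base R sketch of the same idea that needs no join. Two things about the loop in the question are worth noting: seq(1, 300, 3) ends at 298, and 298:300 still covers the last group, so the loop bounds are not the problem; the loop writes to Chosen, while the column is named chosen, and R names are case-sensitive, so the chosen column is never updated. The data frame below is a simplified stand-in for sim_data (attribute columns left out), and loop_data and ave_data are illustrative names, not part of the original post. The loop version assumes the rows are sorted by ID with exactly three rows per respondent; the ave() version only assumes the ID column.

```r
# Simplified stand-in for the question's sim_data (attribute columns omitted).
set.seed(123)
sim_data <- data.frame(
  ID = rep(1:100, each = 3),
  alternative = rep(1:3, times = 100),
  chosen = 0
)

# Option 1: the question's loop, writing to the existing lowercase column.
# Assumes rows are sorted by ID with exactly three rows per ID.
loop_data <- sim_data
for (i in seq(1, nrow(loop_data), 3)) {
  chosen_index <- sample(i:(i + 2), 1)  # one random row within the group
  loop_data$chosen[chosen_index] <- 1
}

# Option 2: ave() applies the function within each ID group.
# Inside each group, draw one position and flag it with 1, the rest with 0.
ave_data <- sim_data
ave_data$chosen <- ave(
  ave_data$chosen, ave_data$ID,
  FUN = function(x) as.numeric(seq_along(x) == sample.int(length(x), 1))
)

# Sanity check: every respondent has exactly one chosen alternative.
stopifnot(all(tapply(loop_data$chosen, loop_data$ID, sum) == 1))
stopifnot(all(tapply(ave_data$chosen, ave_data$ID, sum) == 1))
```

The ave() version does not depend on row order or on a fixed group size, which may make it the safer choice if some respondents end up with a different number of alternatives.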