Is there an R function to find the highest value in a row that does not match other names in a column?
Question
Here is the R code I am currently using:
library('tidyverse')
library('dplyr')
Brian <- c(92.835, 89.035, 99.222, 93.581)
Buckley <- c(75.265, 86.258, 93.972, 96.872)
Chris <- c(91.442, 103.999, 91.291, 92.505)
Catherine <- c(81.244, 73.040, 78.455, 98.972)
David <- c(87.153, 60.062, 62.248, 87.852)
Donald <- c(93.395, 91.905, 102.502, 107.63)
Greg <- c(79.571, 73.702, 67.326, 89.493)
Matt <- c(78.585, 48.074, 81.387, 76.074)
Michael <- c(96.933, 78.709, 82.623, 66.325)
df <- data.frame(Brian, Buckley, Chris, Catherine, David, Donald, Greg, Matt, Michael)
group1 <- data.frame(Brian, Matt, Michael)
group2 <- data.frame(Buckley, Chris, Catherine)
group3 <- data.frame(David, Donald, Greg)
group1a <- group1 %>%
  mutate(Group1 = names(.)[max.col(.)])
group2a <- group2 %>%
  mutate(Group2 = names(.)[max.col(.)])
group3a <- group3 %>%
  mutate(Group3 = names(.)[max.col(.)])
GROUP1 <- dplyr::pull(group1a, 'Group1')
GROUP2 <- dplyr::pull(group2a, 'Group2')
GROUP3 <- dplyr::pull(group3a, 'Group3')
ALL <- cbind(df, GROUP1, GROUP2, GROUP3)
The code reproduces the first 4 rows of a much longer table. I'm stuck trying to append a column that, for each row, holds the name of the column with the highest value, excluding any column already named in GROUP1, GROUP2, or GROUP3 for that row. The new column would be headed "GROUP4" and, for these four rows, would contain Brian, Buckley, Chris, Buckley.
I've looked through dplyr for functions that would fit this problem, but I'm new to this and have been stuck for a while.
Answer 1
Score: 0
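# Numeric columns: the names that are candidates for GROUP4
(possible_names <- select(
  ALL,
  where(is.numeric)
) |> names())
# Non-numeric columns (GROUP1, GROUP2, GROUP3): their values are the names to exclude
(excl_names <- select(
  ALL,
  where(\(x) !is.numeric(x))
) |> names())
# Row by row, keep the candidate names not already listed in GROUP1..GROUP3
ALL2 <- mutate(rowwise(ALL),
  cols_to_check = list(setdiff(
    possible_names,
    c_across(all_of(excl_names))
  ))
)
# Among the remaining columns, take the name of the one with the largest value in the row
(ALL3 <- ALL2 |> mutate(GROUP4 = (\(x) {
  x[max.col(pick(x))]
})(c_across(cols_to_check))
) |> ungroup() |> select(-cols_to_check))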
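For comparison, here is a rough base R sketch of the same idea that avoids rowwise(): find each group's winner, overwrite those cells with -Inf in a copy of the scores, and take the row maximum of what remains. The names groups, row_winner, scores, winners and masked are made up for this illustration and do not come from the question. Ties are resolved with ties.method = "first", which is an assumption (the max.col() default picks randomly among ties).

```r
# Data from the question (first four rows only).
df <- data.frame(
  Brian     = c(92.835, 89.035, 99.222, 93.581),
  Buckley   = c(75.265, 86.258, 93.972, 96.872),
  Chris     = c(91.442, 103.999, 91.291, 92.505),
  Catherine = c(81.244, 73.040, 78.455, 98.972),
  David     = c(87.153, 60.062, 62.248, 87.852),
  Donald    = c(93.395, 91.905, 102.502, 107.63),
  Greg      = c(79.571, 73.702, 67.326, 89.493),
  Matt      = c(78.585, 48.074, 81.387, 76.074),
  Michael   = c(96.933, 78.709, 82.623, 66.325)
)

# Column names making up each group, as in the question.
groups <- list(
  GROUP1 = c("Brian", "Matt", "Michael"),
  GROUP2 = c("Buckley", "Chris", "Catherine"),
  GROUP3 = c("David", "Donald", "Greg")
)

# Hypothetical helper: name of the largest column in each row
# (assumption: the first column wins on ties).
row_winner <- function(m) colnames(m)[max.col(m, ties.method = "first")]

scores <- as.matrix(df)

# One column of winner names per group (GROUP1, GROUP2, GROUP3).
winners <- do.call(cbind, lapply(groups, function(cols) {
  row_winner(scores[, cols, drop = FALSE])
}))

# Blank out each row's group winners so they cannot win again.
masked <- scores
for (g in colnames(winners)) {
  hit <- cbind(seq_len(nrow(masked)), match(winners[, g], colnames(masked)))
  masked[hit] <- -Inf
}

# Best remaining column per row.
GROUP4 <- row_winner(masked)
ALL <- data.frame(df, winners, GROUP4 = GROUP4)

print(GROUP4)
# "Brian" "Buckley" "Chris" "Buckley"
```

Because everything is done on a matrix at once, this version should stay reasonably fast on the full table, though that has not been measured here.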