Reverse the order of non-NA values in an R dataframe
Question
I am looking to reverse the order of numeric values in multiple columns of an R dataframe (so that the highest number becomes the lowest, and so forth), whilst leaving the NA values as they are.
An example of my dataframe:
my_data <- data.frame (animal  = c("fox", "rabbit", "cow", "sheep", "pig", "mole"),
                        x = c("1", "2", "1", "3", "NA", 'NA'),
                        y = c('NA','NA','1','3','2','NA'),
                        z = c('1','2','3','4','NA','5'),
                        area = c("field","field","farm","farm","farm","farm"))
and then, what I am trying to achieve:
my_ideal_data <- data.frame (animal  = c("fox", "rabbit", "cow", "sheep", "pig", "mole"),
                             x = c("3", "2", "3", "1", "NA", 'NA'),
                             y = c('NA','NA','3','1','2','NA'),
                             z = c('5','4','3','2','NA','1'),
                             area = c("field","field","farm","farm","farm","farm"))
The 'animal' and 'area' columns remain the same, as do all of the NAs, but I need the values in x, y, and z to be placed in reverse order within each column.
Any help would be greatly appreciated!
Thank you
Answer 1
Score: 1
In these data, you could simply subtract the z column from 6 after converting it to numeric:
my_data$z <- 6 - as.numeric(my_data$z)
#> my_data
#  animal  x  y  z  area
#1    fox  3 NA  5 field
#2 rabbit  2 NA  4 field
#3    cow  3  3  3  farm
#4  sheep  1  1  2  farm
#5    pig NA  2 NA  farm
#6   mole NA NA  1  farm
If these sample data are too simplified, an alternative is to index the non-NA values using grep, sort them in decreasing order using gtools::mixedsort(), then put them back using that index. This might be a little more scalable, and you don't have to convert to numeric. Note that this sorts the values, so it matches the desired output only for a column that is already in ascending order, such as z.
idx <- grep("\\d+", my_data$z)
vals <- gtools::mixedsort(my_data$z[idx], decreasing = TRUE)
my_data$z[idx] <- vals
#  animal  x  y  z  area
#1    fox  3 NA  5 field
#2 rabbit  2 NA  4 field
#3    cow  3  3  3  farm
#4  sheep  1  1  2  farm
#5    pig NA  2 NA  farm
#6   mole NA NA  1  farm
If you wanted to apply it to multiple columns, you could wrap it in a function and use lapply:
myfun <- function(x){
  a <-  grep("\\d+", x)
  x[a] <- gtools::mixedsort(x[a], decreasing = TRUE)
  x
}
my_data[c("x", "y", "z")] <- lapply(my_data[c("x", "y", "z")], myfun)
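Sorting and reversing the scale are not the same operation, which matters for columns such as x and y that are not already in ascending order. A minimal base-R sketch of the difference, using the x column from the question with real NA values (the variable names are illustrative, not from the answer, and base `sort()` stands in for `gtools::mixedsort()`):

```r
# Column x from the question, with real NA values
x <- c(1, 2, 1, 3, NA, NA)

# Sort-based replacement: fill the non-NA slots with the values in descending order
keep <- !is.na(x)
sorted <- x
sorted[keep] <- sort(x[keep], decreasing = TRUE)

# Reflection: map each value v to max + min - v, leaving NA untouched
reflected <- max(x, na.rm = TRUE) + min(x, na.rm = TRUE) - x

sorted     # 3 2 1 1 NA NA
reflected  # 3 2 3 1 NA NA
```

Only the reflected version reproduces the x column of my_ideal_data; for z, which is already ascending, both give the same result.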
Answer 2
Score: 1
You might use a for loop here.
First, replace all "NA" strings with real NA values.
my_data[my_data == "NA"] <- NA
Then define a vector containing the columns that you want to sort.
target_col <- c("x", "y", "z")
Then use a for loop to go over the target columns and replace each non-NA value with (column maximum + 1) minus that value.
my_data[my_data == "NA"] <- NA
target_col <- c("x", "y", "z")
for (i in target_col) {
  keep <- !is.na(my_data[, i])           # rows holding a real value
  vals <- as.integer(my_data[keep, i])   # convert first so max() compares numbers, not strings
  my_data[keep, i] <- max(vals) + 1L - vals
}
  animal    x    y    z  area
1    fox    3 <NA>    5 field
2 rabbit    2 <NA>    4 field
3    cow    3    3    3  farm
4  sheep    1    1    2  farm
5    pig <NA>    2 <NA>  farm
6   mole <NA> <NA>    1  farm
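When the columns are stored as character, it can matter whether `max()` is taken before or after converting to numbers, because strings are compared character by character. A small sketch with made-up two-digit values (not from the question) suggests converting first:

```r
# Made-up scores, stored as character like the columns in the question
scores <- c("9", "10", "2", NA)

max(scores, na.rm = TRUE)              # "9": string comparison, not numeric
max(as.integer(scores), na.rm = TRUE)  # 10

# Convert once, then reflect around max + 1
vals <- as.integer(scores)
reversed <- max(vals, na.rm = TRUE) + 1L - vals
reversed  # 2 1 9 NA
```

With single-digit values, as in the question, both orders give the same maximum, so the difference only shows up on wider ranges.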
Answer 3
Score: 0
With `dplyr` using `across`:
library(dplyr)
Cols <- c("x", "y", "z")
# Convert the target columns to numeric; the "NA" strings become real NA values
my_data[Cols] <- lapply(my_data[Cols], as.numeric)
my_data %>%
  mutate(across(all_of(Cols), ~ max(.x, na.rm = TRUE) - .x + 1))
  animal  x  y  z  area
1    fox  3 NA  5 field
2 rabbit  2 NA  4 field
3    cow  3  3  3  farm
4  sheep  1  1  2  farm
5    pig NA  2 NA  farm
6   mole NA NA  1  farm
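To check an approach against the target given in the question, one option is to rebuild both data frames and compare them. The sketch below uses base R only; `reverse_scores`, `cols`, and `result` are hypothetical names, and it assumes the values are positive integers starting at 1, as in the example, so that reflecting around max + 1 is appropriate.

```r
my_data <- data.frame(
  animal = c("fox", "rabbit", "cow", "sheep", "pig", "mole"),
  x = c("1", "2", "1", "3", "NA", "NA"),
  y = c("NA", "NA", "1", "3", "2", "NA"),
  z = c("1", "2", "3", "4", "NA", "5"),
  area = c("field", "field", "farm", "farm", "farm", "farm"),
  stringsAsFactors = FALSE
)
my_ideal_data <- data.frame(
  animal = c("fox", "rabbit", "cow", "sheep", "pig", "mole"),
  x = c("3", "2", "3", "1", "NA", "NA"),
  y = c("NA", "NA", "3", "1", "2", "NA"),
  z = c("5", "4", "3", "2", "NA", "1"),
  area = c("field", "field", "farm", "farm", "farm", "farm"),
  stringsAsFactors = FALSE
)

cols <- c("x", "y", "z")

# Hypothetical helper: reflect each value around max + 1,
# then restore the "NA" strings so the result matches the question's format
reverse_scores <- function(col) {
  vals <- as.numeric(col)                     # the string "NA" becomes a real NA
  out <- max(vals, na.rm = TRUE) + 1 - vals
  ifelse(is.na(out), "NA", as.character(out))
}

result <- my_data
result[cols] <- lapply(my_data[cols], reverse_scores)

# Compare column by column with the target
sapply(cols, function(nm) identical(result[[nm]], my_ideal_data[[nm]]))
```

Converting back to character is only done here so the comparison with my_ideal_data is like for like; in practice, keeping the columns numeric is usually more convenient.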