Reverse the order of non-NA values in an R dataframe
Question
I am looking to reverse the order of numeric values in multiple columns of an R dataframe (so that the highest number becomes the lowest, and so forth), whilst leaving the NA values as they are.
An example of my dataframe:
my_data <- data.frame (animal = c("fox", "rabbit", "cow", "sheep", "pig", "mole"),
x = c("1", "2", "1", "3", "NA", 'NA'),
y = c('NA','NA','1','3','2','NA'),
z = c('1','2','3','4','NA','5'),
area = c("field","field","farm","farm","farm","farm"))
and then, what I am trying to achieve:
my_ideal_data <- data.frame (animal = c("fox", "rabbit", "cow", "sheep", "pig", "mole"),
x = c("3", "2", "3", "1", "NA", 'NA'),
y = c('NA','NA','3','1','2','NA'),
z = c('5','4','3','2','NA','1'),
area = c("field","field","farm","farm","farm","farm"))
The 'animal' and 'area' columns remain the same, as do all of the NAs, but I need the values in x, y, and z to be placed in reverse order within each of the columns.
Any help would be greatly appreciated!
Thank you
Answer 1
Score: 1
In these data, you could simply subtract the `z` column from 6 after converting it to numeric:
my_data$z <- 6 - as.numeric(my_data$z)
#> my_data
# animal x y z area
#1 fox 3 NA 5 field
#2 rabbit 2 NA 4 field
#3 cow 3 3 3 farm
#4 sheep 1 1 2 farm
#5 pig NA 2 NA farm
#6 mole NA NA 1 farm
If these sample data are too simplified, an alternative is to index the non-NA values using `grep`, sort them in decreasing order using `gtools::mixedsort()`, and then write them back using that index. This might be a little more scalable, and you don't have to convert to numeric.
idx <- grep("\\d+", my_data$z)
vals <- gtools::mixedsort(my_data$z[idx], decreasing = TRUE)
my_data$z[idx] <- vals
# animal x y z area
#1 fox 3 NA 5 field
#2 rabbit 2 NA 4 field
#3 cow 3 3 3 farm
#4 sheep 1 1 2 farm
#5 pig NA 2 NA farm
#6 mole NA NA 1 farm
If you wanted to apply it to multiple columns, you could wrap it in a function and use `lapply`:
myfun <- function(x){
a <- grep("\\d+", x)
x[a] <- gtools::mixedsort(x[a], decreasing = TRUE)
x
}
my_data[c("x", "y", "z")] <- lapply(my_data[c("x", "y", "z")], myfun)
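Note that sorting the non-NA values in decreasing order reproduces the desired output only when a column is already in increasing order, as `z` is. For `x`, the values 1, 2, 1, 3 sorted in decreasing order are 3, 2, 1, 1, whereas the question asks for 3, 2, 3, 1. A minimal base R sketch, assuming "reverse" means mirroring each value between the column minimum and maximum (the helper name `reverse_values` is hypothetical, not part of any package):

```r
# Hypothetical helper: mirror each non-NA value between the column's min and max.
reverse_values <- function(x) {
  # "NA" strings become real NA; suppressWarnings() hides the coercion warning
  num <- suppressWarnings(as.numeric(x))
  if (all(is.na(num))) return(num)  # nothing to reverse
  max(num, na.rm = TRUE) + min(num, na.rm = TRUE) - num
}

# Data from the question
my_data <- data.frame(
  animal = c("fox", "rabbit", "cow", "sheep", "pig", "mole"),
  x = c("1", "2", "1", "3", "NA", "NA"),
  y = c("NA", "NA", "1", "3", "2", "NA"),
  z = c("1", "2", "3", "4", "NA", "5"),
  area = c("field", "field", "farm", "farm", "farm", "farm")
)

cols <- c("x", "y", "z")
my_data[cols] <- lapply(my_data[cols], reverse_values)

my_data$x  # 3 2 3 1 NA NA
my_data$y  # NA NA 3 1 2 NA
my_data$z  # 5 4 3 2 NA 1
```

Unlike the sort-based version, this turns the columns into numeric vectors, which may or may not be what you want.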
Answer 2
Score: 1
You might use a for loop here.
First, replace all "NA" strings with real `NA` values.
my_data[my_data == "NA"] <- NA
Then define a vector containing the columns that you want to sort.
target_col <- c("x", "y", "z")
Then use a for loop to go over the target columns and replace each non-NA value with the column's `max` value + 1 minus that value.
my_data[my_data == "NA"] <- NA
target_col <- c("x", "y", "z")
for (i in target_col) {
  keep <- !is.na(my_data[, i])
  # Convert to integer before taking the maximum, so values are compared as numbers
  vals <- as.integer(my_data[keep, i])
  my_data[keep, i] <- max(vals) + 1 - vals
}
animal x y z area
1 fox 3 <NA> 5 field
2 rabbit 2 <NA> 4 field
3 cow 3 3 3 farm
4 sheep 1 1 2 farm
5 pig <NA> 2 <NA> farm
6 mole <NA> <NA> 1 farm
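One thing to watch when the columns hold numbers as text: `max()` on a character vector compares strings rather than numbers, so multi-digit values can give a surprising maximum. A small sketch with made-up values (not taken from the question's data) that converts before taking the maximum:

```r
vals <- c("2", "10", "NA")           # made-up values stored as text
vals[vals == "NA"] <- NA             # turn the "NA" string into a real NA

max(vals, na.rm = TRUE)              # "2": strings are compared, not numbers
max(as.integer(vals), na.rm = TRUE)  # 10: numeric comparison

nums <- as.integer(vals)
reversed <- max(nums, na.rm = TRUE) + 1 - nums
reversed                             # 9 1 NA
```

With the single-digit values in the question both forms agree, so this only matters for wider ranges.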
Answer 3
Score: 0
With `dplyr` using `across`:
library(dplyr)
Cols <- c("x", "y", "z")
# Convert the text columns to numeric; "NA" strings become real NA (with a coercion warning)
my_data[Cols] <- lapply(my_data[Cols], as.numeric)
my_data %>%
  mutate(across(all_of(Cols), ~ max(.x, na.rm = TRUE) - .x + 1))
animal x y z area
1 fox 3 NA 5 field
2 rabbit 2 NA 4 field
3 cow 3 3 3 farm
4 sheep 1 1 2 farm
5 pig NA 2 NA farm
6 mole NA NA 1 farm