R: match and combine multiple vectors of different lengths into a data frame
Question
I am probably missing something very obvious, but I can't seem to find a way to do this. I would like to merge multiple vectors (or data frames?) of different lengths into one data frame by matching the values of the vector elements with each other and putting equal values in the same row, filling the positions left empty with NA. I have tried cbind.na from qpcR, but it doesn't produce the expected outcome.
reproducible example:
x<-c("1","2","a","b")
y<-c("1","2","3","4","5","6","b")
z<-c("3","4","5","6","a")
expected output:
x y z
[1,] 1 1 NA
[2,] 2 2 NA
[3,] NA 3 3
[4,] NA 4 4
[5,] NA 5 5
[6,] NA 6 6
[7,] a NA a
[8,] b b NA
Answer 1
Score: 2
Here's a clumsy but working solution. Note the row order does not match your request though:
x<-c("1","2","a","b")
y<-c("1","2","3","4","5","6","b")
z<-c("3","4","5","6","a")
tot <- unique(c(x,y,z))
# gives you a character vector of all unique values across all your vectors
df <- data.frame(
x = rep(NA, times = length(tot)),
y = NA,
z = NA
)
# prepare a data frame with all NAs
df$x[tot %in% x] <- tot[tot %in% x]
df$y[tot %in% y] <- tot[tot %in% y]
df$z[tot %in% z] <- tot[tot %in% z]
# fills in the NAs with the matching value if present in the 'parent' vector.
Gives:
> df
x y z
1 1 1 <NA>
2 2 2 <NA>
3 a <NA> a
4 b b <NA>
5 <NA> 3 3
6 <NA> 4 4
7 <NA> 5 5
8 <NA> 6 6
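The same idea can be written once for any number of vectors instead of repeating the assignment per column. This is only a minimal sketch of that variation, not the answerer's code: the names vecs and out are made up for illustration, and it assumes no value is repeated within a vector.

```r
x <- c("1", "2", "a", "b")
y <- c("1", "2", "3", "4", "5", "6", "b")
z <- c("3", "4", "5", "6", "a")

vecs <- list(x = x, y = y, z = z)            # hypothetical name
tot  <- unique(unlist(vecs, use.names = FALSE))  # unique values, in order of first appearance

# For each vector, keep the value where it occurs and put NA elsewhere
out <- as.data.frame(
  lapply(vecs, function(v) ifelse(tot %in% v, tot, NA_character_)),
  stringsAsFactors = FALSE
)
out
```

As in the answer, rows follow the order in which values first appear, so this does not reproduce the row order asked for in the question.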
Answer 2
Score: 2
You could try this. It is similar to the answer by Paul Stafford Allen in that it starts with the unique values. I've put the vectors in a list to allow for easy iteration, so it is straightforward to extend to more columns.
l <- list(x = x, y = y, z = z)
dat <- data.frame(
unique_vals = sort(unique(unlist(l)))
)
dat[names(l)] <- lapply(l, \(x) {
x[match(dat$unique_vals, x)]
})
# unique_vals x y z
# 1 1 1 1 <NA>
# 2 2 2 2 <NA>
# 3 3 <NA> 3 3
# 4 4 <NA> 4 4
# 5 5 <NA> 5 5
# 6 6 <NA> 6 6
# 7 a a <NA> a
# 8 b b b <NA>
I kept the unique_vals column so it's clear what's going on, but you may want to remove it.
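If the helper column is not wanted, it can be dropped by selecting only the original columns, or moved into the row names. The sketch below rebuilds dat as in the answer and shows both options; the names res and res_named are made up for illustration, and the second option assumes the unique values are valid (non-duplicated) row names, which holds here by construction.

```r
x <- c("1", "2", "a", "b")
y <- c("1", "2", "3", "4", "5", "6", "b")
z <- c("3", "4", "5", "6", "a")

l <- list(x = x, y = y, z = z)
dat <- data.frame(unique_vals = sort(unique(unlist(l))),
                  stringsAsFactors = FALSE)
dat[names(l)] <- lapply(l, function(v) v[match(dat$unique_vals, v)])

# Option 1: keep only the columns that came from the list
res <- dat[names(l)]

# Option 2: same columns, with the matched values as row names
res_named <- dat[names(l)]
rownames(res_named) <- dat$unique_vals
```

Note that the row order produced by sort() on character values depends on the locale's collation rules.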
Answer 3
Score: 1
You can use merge inside Reduce and match on the newly set row names.
l <- lapply(list(x=x, y=y, z=z), \(a) setNames(a, make.unique(a)))
setNames(
Reduce(\(a, b) {. <- merge(a, b, by=0, all=TRUE)
`row.names<-`(.[-1], .[,1])}, l), names(l))
# x y z
#1 1 1 <NA>
#2 2 2 <NA>
#3 <NA> 3 3
#4 <NA> 4 4
#5 <NA> 5 5
#6 <NA> 6 6
#a a <NA> a
#b b b <NA>
This also works when a value occurs more than once in a vector.
x<-c("1","1","2","a","b")
y<-c("1","2","3","4","5","6","b")
z<-c("3","4","5","6","a")
l <- lapply(list(x=x, y=y, z=z), \(a) setNames(a, make.unique(a)))
setNames(
Reduce(\(a, b) {. <- merge(a, b, by=0, all=TRUE)
`row.names<-`(.[-1], .[,1])}, l), names(l))
# x y z
#1 1 1 <NA>
#1.1 1 <NA> <NA>
#2 2 2 <NA>
#3 <NA> 3 3
#4 <NA> 4 4
#5 <NA> 5 5
#6 <NA> 6 6
#a a <NA> a
#b b b <NA>
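The reason duplicates survive is that make.unique turns repeated values into distinct keys, which then serve as the names that merge matches on. A small sketch to inspect those keys on their own (the variable name keys is made up for illustration):

```r
x <- c("1", "1", "2", "a", "b")

keys <- make.unique(x)   # repeated values get a numeric suffix
keys
# [1] "1"   "1.1" "2"   "a"   "b"

setNames(x, keys)        # original values, named by their unique keys
```

One caveat, stated as an assumption about your data rather than a tested case: if another vector already contains a literal value such as "1.1", it would share a key with the second "1" above and could end up in the same row.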
Or using match:
x <- c("1","1","2","a","b")
y <- c("1","2","3","4","5","6","b")
z <- c("3","4","5","6","a")
l <- list(x=x, y=y, z=z)
u <- lapply(l, make.unique)
k <- unique(unlist(u))
mapply(\(l, u) l[match(k, u)], l, u)
# x y z
# [1,] "1" "1" NA
# [2,] "1" NA NA
# [3,] "2" "2" NA
# [4,] "a" NA "a"
# [5,] "b" "b" NA
# [6,] NA "3" "3"
# [7,] NA "4" "4"
# [8,] NA "5" "5"
# [9,] NA "6" "6"
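If a data frame is preferred over the character matrix that mapply returns, the match approach can be wrapped in a small helper. This is a sketch under assumptions: align_vectors is a hypothetical name, not a function from any package, and it expects the vectors to be passed as named arguments so the names can become column names.

```r
# Hypothetical helper: align character vectors by value, one column per vector
align_vectors <- function(...) {
  l <- list(...)
  u <- lapply(l, make.unique)                  # unique keys per vector
  k <- unique(unlist(u, use.names = FALSE))    # all keys, in order of first appearance
  cols <- Map(function(v, key) v[match(k, key)], l, u)
  as.data.frame(cols, stringsAsFactors = FALSE)
}

x <- c("1", "1", "2", "a", "b")
y <- c("1", "2", "3", "4", "5", "6", "b")
z <- c("3", "4", "5", "6", "a")

res <- align_vectors(x = x, y = y, z = z)
res
```

The rows come out in the same order as in the matrix above; sorting k inside the helper would be one way to change that.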