英文:
Merge two data frames if a list of string matches and list the unmatch string as NA R
问题
我有两个数据框,并且如果Drug_Name和Drugz的字符串匹配,我想要在同一行中为第二个数据框分配或粘贴字符串。
注意:
如果我使用left_join或merge merge(df1, df2, all.x = TRUE, by.x = "Drug_Names", by.y = "Drugz")
进行操作,它会返回与DF1类似的完全相同的数据框。
以下是所需DF的示例:
DF1
> Pat_ID Date Code Drug_Names
> AB1 2010-12-09 1.1.1 Alpha
> AB1 2010-12-15 1.1.1 Alpha
> AB1 2010-12-15 1.1.1 Beta
> Ax2 2010-12-09 1.1.1 Beta
> Ax2 2010-12-17 1.1.1 Beta
> Aq3 2011-02-09 1.1.1 Gamma
> Aq3 2011-04-25 1.1.1 Gamma
> Aw4 2011-04-25 1.1.1 Tango
DF2
Codez Drugz
1.1.1 Alpha
1.1.3 Gamma
所需DF3
Pat_ID Date Code Drug_Names Drugz Codez
> AB1 2010-12-09 1.1.1 Alpha Alpha 1.1.1
> AB1 2010-12-15 1.1.1 Alpha Alpha 1.1.1
> AB1 2010-12-15 1.1.1 Beta NA NA
> Ax2 2010-12-09 1.1.1 Beta NA NA
> Ax2 2010-12-17 1.1.1 Beta NA NA
> Aq3 2011-02-09 1.1.1 Gamma Gamma 1.1.3
> Aq3 2011-04-25 1.1.1 Gamma Gamma 1.1.3
> Aw4 2011-04-25 1.1.1 Tango NA NA
英文:
I have two data frames and I want to assign or paste the string of second data frame in the same row if the string of Drug_Name and Drugz matches.
NOTE:
If I do it by left_join or merge merge(df1, df2, all.x = TRUE, by.x = "Drug_Names", by.y = "Drugz")
, it returns me the exact same dataframe that is similar to DF1.
Below is the example of with required DF
DF1
> Pat_ID Date Code Drug_Names
> AB1 2010-12-09 1.1.1 Alpha
> AB1 2010-12-15 1.1.1 Alpha
> AB1 2010-12-15 1.1.1 Beta
> Ax2 2010-12-09 1.1.1 Beta
> Ax2 2010-12-17 1.1.1 Beta
> Aq3 2011-02-09 1.1.1 Gamma
> Aq3 2011-04-25 1.1.1 Gamma
> Aw4 2011-04-25 1.1.1 Tango
DF2
Codez Drugz
1.1.1 Alpha
1.1.3 Gamma
Required DF3
> Pat_ID Date Code Drug_Names Drugz Codez
> AB1 2010-12-09 1.1.1 Alpha Alpha 1.1.1
> AB1 2010-12-15 1.1.1 Alpha Alpha 1.1.1
> AB1 2010-12-15 1.1.1 Beta NA NA
> Ax2 2010-12-09 1.1.1 Beta NA NA
> Ax2 2010-12-17 1.1.1 Beta NA NA
> Aq3 2011-02-09 1.1.1 Gamma Gamma 1.1.3
> Aq3 2011-04-25 1.1.1 Gamma Gamma 1.1.3
> Aw4 2011-04-25 1.1.1 Tango NA NA
答案1
得分: 1
df1 <- read.table(text = "
Pat_ID Date Code Drug_Names
AB1 2010-12-09 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Beta
Ax2 2010-12-09 1.1.1 Beta
Ax2 2010-12-17 1.1.1 Beta
Aq3 2011-02-09 1.1.1 Gamma
Aq3 2011-04-25 1.1.1 Gamma
Aw4 2011-04-25 1.1.1 Tango", header = TRUE, stringsAsFactors = FALSE) %>%
as_tibble() %>%
mutate(Date = as.Date(Date))
df2 <- read.table(text = "
Codez Drugz
1.1.1 Alpha
1.1.3 Gamma", header = TRUE, stringsAsFactors = FALSE) %>%
as_tibble()
df1 %>%
left_join(df2, by = c("Code" = "Codez", "Drug_Names" = "Drugz"), keep = TRUE)
# A tibble: 8 × 6
Pat_ID Date Code Drug_Names Codez Drugz
<chr> <date> <chr> <chr> <chr> <chr>
1 AB1 2010-12-09 1.1.1 Alpha 1.1.1 Alpha
2 AB1 2010-12-15 1.1.1 Alpha 1.1.1 Alpha
3 AB1 2010-12-15 1.1.1 Beta NA NA
4 Ax2 2010-12-09 1.1.1 Beta NA NA
5 Ax2 2010-12-17 1.1.1 Beta NA NA
6 Aq3 2011-02-09 1.1.1 Gamma NA NA
7 Aq3 2011-04-25 1.1.1 Gamma NA NA
8 Aw4 2011-04-25 1.1.1 Tango NA NA
英文:
df1 <- read.table(text = "
Pat_ID Date Code Drug_Names
AB1 2010-12-09 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Beta
Ax2 2010-12-09 1.1.1 Beta
Ax2 2010-12-17 1.1.1 Beta
Aq3 2011-02-09 1.1.1 Gamma
Aq3 2011-04-25 1.1.1 Gamma
Aw4 2011-04-25 1.1.1 Tango", header = TRUE, stringsAsFactors = FALSE) %>%
as_tibble() %>%
mutate(Date = as.Date(Date))
df2 <- read.table(text = "
Codez Drugz
1.1.1 Alpha
1.1.3 Gamma", header = TRUE, stringsAsFactors = FALSE) %>%
as_tibble()
df1 %>%
left_join(df2, by = c("Code" = "Codez", "Drug_Names" = "Drugz"), keep = TRUE)
# A tibble: 8 × 6
Pat_ID Date Code Drug_Names Codez Drugz
<chr> <date> <chr> <chr> <chr> <chr>
1 AB1 2010-12-09 1.1.1 Alpha 1.1.1 Alpha
2 AB1 2010-12-15 1.1.1 Alpha 1.1.1 Alpha
3 AB1 2010-12-15 1.1.1 Beta NA NA
4 Ax2 2010-12-09 1.1.1 Beta NA NA
5 Ax2 2010-12-17 1.1.1 Beta NA NA
6 Aq3 2011-02-09 1.1.1 Gamma NA NA
7 Aq3 2011-04-25 1.1.1 Gamma NA NA
8 Aw4 2011-04-25 1.1.1 Tango NA NA
答案2
得分: 1
你可以使用 match
。
cbind(DF1, DF2[match(DF1$Drug_Names, DF2$Drugz),2:1])
# Pat_ID Date Code Drug_Names Drugz Codez
#1 AB1 2010-12-09 1.1.1 Alpha Alpha 1.1.1
#1.1 AB1 2010-12-15 1.1.1 Alpha Alpha 1.1.1
#NA AB1 2010-12-15 1.1.1 Beta <NA> <NA>
#NA.1 Ax2 2010-12-09 1.1.1 Beta <NA> <NA>
#NA.2 Ax2 2010-12-17 1.1.1 Beta <NA> <NA>
#2 Aq3 2011-02-09 1.1.1 Gamma Gamma 1.1.3
#2.1 Aq3 2011-04-25 1.1.1 Gamma Gamma 1.1.3
#NA.3 Aw4 2011-04-25 1.1.1 Tango <NA> <NA>
或者如果使用 merge
,可以将列 Drug_Names
添加到 DF2。
merge(DF1, cbind(Drug_Names = DF2$Drugz, DF2), all.x = TRUE)
# Drug_Names Pat_ID Date Code Codez Drugz
#1 Alpha AB1 2010-12-09 1.1.1 1.1.1 Alpha
#2 Alpha AB1 2010-12-15 1.1.1 1.1.1 Alpha
#3 Beta AB1 2010-12-15 1.1.1 <NA> <NA>
#4 Beta Ax2 2010-12-09 1.1.1 <NA> <NA>
#5 Beta Ax2 2010-12-17 1.1.1 <NA> <NA>
#6 Gamma Aq3 2011-02-09 1.1.1 1.1.3 Gamma
#7 Gamma Aq3 2011-04-25 1.1.1 1.1.3 Gamma
#8 Tango Aw4 2011-04-25 1.1.1 <NA> <NA>
数据
DF1 <- read.table(header=TRUE, text="Pat_ID Date Code Drug_Names
AB1 2010-12-09 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Beta
Ax2 2010-12-09 1.1.1 Beta
Ax2 2010-12-17 1.1.1 Beta
Aq3 2011-02-09 1.1.1 Gamma
Aq3 2011-04-25 1.1.1 Gamma
Aw4 2011-04-25 1.1.1 Tango")
DF2 <- read.table(header=TRUE, text="Codez Drugz
1.1.1 Alpha
1.1.3 Gamma")
英文:
You can use match
.
cbind(DF1, DF2[match(DF1$Drug_Names, DF2$Drugz),2:1])
# Pat_ID Date Code Drug_Names Drugz Codez
#1 AB1 2010-12-09 1.1.1 Alpha Alpha 1.1.1
#1.1 AB1 2010-12-15 1.1.1 Alpha Alpha 1.1.1
#NA AB1 2010-12-15 1.1.1 Beta <NA> <NA>
#NA.1 Ax2 2010-12-09 1.1.1 Beta <NA> <NA>
#NA.2 Ax2 2010-12-17 1.1.1 Beta <NA> <NA>
#2 Aq3 2011-02-09 1.1.1 Gamma Gamma 1.1.3
#2.1 Aq3 2011-04-25 1.1.1 Gamma Gamma 1.1.3
#NA.3 Aw4 2011-04-25 1.1.1 Tango <NA> <NA>
Or in case using merge
add the column Drug_Names
to DF2.
merge(DF1, cbind(Drug_Names = DF2$Drugz, DF2), all.x = TRUE)
# Drug_Names Pat_ID Date Code Codez Drugz
#1 Alpha AB1 2010-12-09 1.1.1 1.1.1 Alpha
#2 Alpha AB1 2010-12-15 1.1.1 1.1.1 Alpha
#3 Beta AB1 2010-12-15 1.1.1 <NA> <NA>
#4 Beta Ax2 2010-12-09 1.1.1 <NA> <NA>
#5 Beta Ax2 2010-12-17 1.1.1 <NA> <NA>
#6 Gamma Aq3 2011-02-09 1.1.1 1.1.3 Gamma
#7 Gamma Aq3 2011-04-25 1.1.1 1.1.3 Gamma
#8 Tango Aw4 2011-04-25 1.1.1 <NA> <NA>
Data
DF1 <- read.table(header=TRUE, text="Pat_ID Date Code Drug_Names
AB1 2010-12-09 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Alpha
AB1 2010-12-15 1.1.1 Beta
Ax2 2010-12-09 1.1.1 Beta
Ax2 2010-12-17 1.1.1 Beta
Aq3 2011-02-09 1.1.1 Gamma
Aq3 2011-04-25 1.1.1 Gamma
Aw4 2011-04-25 1.1.1 Tango")
DF2 <- read.table(header=TRUE, text="Codez Drugz
1.1.1 Alpha
1.1.3 Gamma")
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论