How to merge two columns with prioritization in R
Question
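If you are working in Python rather than R, you can merge the two columns with the pandas library. Note that fillna only fills missing values, so empty strings have to be converted to missing values first. Here is an example:
import pandas as pd
# Create the data frame
data = {'MANUAL.ID': ['Barbar', '', 'Barbar', 'Pippip,Hypsav', 'Pippip', 'Barbar,Pippip'],
        'AUTO.ID': ['Barbar', 'Barbar', 'Pippip', 'Barbar', 'Barbar', '']}
df = pd.DataFrame(data)
# Treat empty strings as missing, then fill the gaps in MANUAL.ID from AUTO.ID
df['species'] = df['MANUAL.ID'].replace('', pd.NA).fillna(df['AUTO.ID'])
# Keep only the merged column
df = df[['species']]
# Print the result
print(df)
This gives the expected result.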
Original question:
I have a data frame that looks like this:
MANUAL.ID | AUTO.ID |
---|---|
Barbar | Barbar |
Barbar | |
Barbar | Pippip |
Pippip,Hypsav | Barbar |
Pippip | Barbar |
Barbar,Pippip | |
Basically, I would like to merge both columns into a single column called species, always prioritizing the MANUAL.ID column, and obtain this kind of result:
species |
---|
Barbar |
Barbar |
Barbar |
Pippip,Hypsav |
Pippip |
Barbar,Pippip |
What can I do to get this result?
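For anyone who wants to reproduce this, the data could be rebuilt roughly as follows. This is only a sketch: the placement of the blank cells is an assumption (the second MANUAL.ID and the last AUTO.ID are taken to be blank), and so is storing blanks as empty strings rather than NA.

```r
# Hypothetical reconstruction of the example data.
# Assumption: blank cells are stored as empty strings (""), not NA.
df <- data.frame(
  MANUAL.ID = c("Barbar", "", "Barbar", "Pippip,Hypsav", "Pippip", "Barbar,Pippip"),
  AUTO.ID   = c("Barbar", "Barbar", "Pippip", "Barbar", "Barbar", ""),
  stringsAsFactors = FALSE
)

# Count blank cells per column to see where a fallback would be needed
colSums(df == "")
```

Whether the blanks are "" or NA matters for the answers, so it is worth checking with something like the last line before choosing an approach.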
Answer 1
Score: 1
You could use the coalesce function from dplyr if the missing values are NA, like this:
library(dplyr)
df %>%
  # Take MANUAL.ID where it is not NA, otherwise fall back to AUTO.ID
  mutate(species = coalesce(MANUAL.ID, AUTO.ID)) %>%
  select(species)
#> species
#> 1 Barbar
#> 2 Barbar
#> 3 Barbar
#> 4 Pippip,Hypsav
#> 5 Pippip
#> 6 Barbar,Pippip
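To see what coalesce does position by position, here is a small sketch on two plain vectors. The vectors manual and auto are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Made-up vectors: one NA and one empty string in `manual`
manual <- c("Barbar", NA, "Pippip,Hypsav", "")
auto   <- c("Pippip", "Barbar", "Barbar", "Barbar")

# coalesce() takes the first non-NA value at each position
merged <- coalesce(manual, auto)
merged
# The NA in position 2 is filled from `auto`;
# the "" in position 4 is kept, because an empty string is not NA
```

This is why empty strings need to be turned into NA before coalesce can skip over them.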
If your cells are empty strings, you could first replace them with NA using na_if, like this:
library(dplyr)
df %>%
  # Convert empty strings to NA in every column so coalesce() can skip them
  mutate(across(everything(), ~ na_if(.x, ""))) %>%
  mutate(species = coalesce(MANUAL.ID, AUTO.ID)) %>%
  select(species)
#> species
#> 1 Barbar
#> 2 Barbar
#> 3 Barbar
#> 4 Pippip,Hypsav
#> 5 Pippip
#> 6 Barbar,Pippip
Created on 2023-04-19 with reprex v2.0.2
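If you would rather not depend on dplyr, the same prioritization can be sketched in base R with ifelse. This is an alternative to the coalesce approach, not part of it, and the data below is illustrative, assuming a blank may be either "" or NA.

```r
# Illustrative data; blank cells may be "" or NA (an assumption)
df <- data.frame(
  MANUAL.ID = c("Barbar", "", NA, "Pippip,Hypsav"),
  AUTO.ID   = c("Pippip", "Barbar", "Barbar", ""),
  stringsAsFactors = FALSE
)

# TRUE where MANUAL.ID has no usable value
manual_missing <- is.na(df$MANUAL.ID) | df$MANUAL.ID == ""

# Fall back to AUTO.ID only where MANUAL.ID is missing
df$species <- ifelse(manual_missing, df$AUTO.ID, df$MANUAL.ID)
df["species"]
```

One difference from the na_if plus coalesce pipeline: if both columns are blank in a row, this version keeps whatever AUTO.ID holds (possibly "") instead of returning NA.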