How to merge two columns with prioritization in R

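With Python's pandas library, the two columns from the question below can be merged as follows. pandas does not treat empty strings as missing values, so they are masked to NaN before the fallback is applied:

import pandas as pd

# Build the data frame from the question (empty cells are empty strings)
data = {'MANUAL.ID': ['Barbar', '', 'Barbar', 'Pippip,Hypsav', 'Pippip', 'Barbar,Pippip'],
        'AUTO.ID': ['Barbar', 'Barbar', 'Pippip', 'Barbar', 'Barbar', '']}

df = pd.DataFrame(data)

# Merge the two columns, preferring MANUAL.ID.
# fillna only fills missing values, so "" is masked to NaN first.
manual = df['MANUAL.ID'].mask(df['MANUAL.ID'] == '')
df['species'] = manual.fillna(df['AUTO.ID'])

# Keep only the merged column
df = df[['species']]

# Print the result
print(df)

This gives the expected result.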

Question

I have a data frame that looks like this:

MANUAL.ID      AUTO.ID
Barbar         Barbar
               Barbar
Barbar         Pippip
Pippip,Hypsav  Barbar
Pippip         Barbar
Barbar,Pippip

I would like to merge both columns into a single column called species. The MANUAL.ID column always takes priority, and AUTO.ID is used only when MANUAL.ID is empty. The expected result is:

species
Barbar
Barbar
Barbar
Pippip,Hypsav
Pippip
Barbar,Pippip

How can I get this result?

Answer 1

Score: 1

If the missing values are NA, you can use the coalesce function from dplyr, which takes the first non-NA value across its arguments for each row:

library(dplyr)
df %>%
  mutate(species = coalesce(MANUAL.ID, AUTO.ID)) %>%
  select(species)
#>         species
#> 1        Barbar
#> 2        Barbar
#> 3        Barbar
#> 4 Pippip,Hypsav
#> 5        Pippip
#> 6 Barbar,Pippip
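
To make the difference between NA and an empty string concrete, here is a minimal sketch of how coalesce treats the two. The vectors `manual` and `auto` are made up for illustration and are not the data from the question.

```r
library(dplyr)

# Hypothetical vectors, not taken from the question
manual <- c("a", NA, "")
auto   <- c("x", "y", "z")

# coalesce() picks the first non-NA value position by position.
# An empty string is not NA, so it is kept as-is.
only_na <- coalesce(manual, auto)
only_na
#> [1] "a" "y" ""

# Converting "" to NA first lets the fallback apply there as well
with_empty <- coalesce(na_if(manual, ""), auto)
with_empty
#> [1] "a" "y" "z"
```

So whether the extra na_if step is needed depends on how the empty cells were read in.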

If the cells are empty strings instead, first convert them to NA with na_if (also from dplyr, so no other package is needed):

library(dplyr)
df %>%
  mutate(across(everything(), ~ na_if(.x, ""))) %>%
  mutate(species = coalesce(MANUAL.ID, AUTO.ID)) %>%
  select(species)
#>         species
#> 1        Barbar
#> 2        Barbar
#> 3        Barbar
#> 4 Pippip,Hypsav
#> 5        Pippip
#> 6 Barbar,Pippip

Created on 2023-04-19 with reprex v2.0.2
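
If you would rather not load dplyr, the same prioritization can probably be done in base R with ifelse. This is only a sketch, not part of the original answer. It assumes the empty cells are either NA or empty strings, and it rebuilds the example data by hand, so the `df` below is a reconstruction of the question's table.

```r
# Example data reconstructed from the question.
# Assumption: empty cells are "" (the check below also covers NA).
df <- data.frame(
  MANUAL.ID = c("Barbar", "", "Barbar", "Pippip,Hypsav", "Pippip", "Barbar,Pippip"),
  AUTO.ID   = c("Barbar", "Barbar", "Pippip", "Barbar", "Barbar", ""),
  stringsAsFactors = FALSE
)

# Treat both NA and "" in MANUAL.ID as missing
manual_missing <- is.na(df$MANUAL.ID) | df$MANUAL.ID == ""

# Use AUTO.ID only where MANUAL.ID is missing
df$species <- ifelse(manual_missing, df$AUTO.ID, df$MANUAL.ID)

df["species"]
```

The logical vector `manual_missing` keeps the rule explicit, which makes it easy to extend, for example to also treat whitespace-only cells as missing.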

huangapple
  • Posted on 2023-04-19 22:11:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76055538.html