How can I identify Roman numerals and convert them to integers in "mixed" observations in R?
Question
I have a data frame with a column whose observations mix words and Roman numerals. The column also contains integers, plain words (such as "Apple"), and NAs, all of which I want to leave unchanged.
For example:
x <- data.frame(col = c("15", "NA", "0", "Red", "iv", "Logic", "ix. Sweet", "VIII - Apple",
"Big XVI", "WeirdVII", "XI: Small"))
I want to convert every observation that contains a Roman numeral, including those mixed with words, into the corresponding integer. For the example above, the resulting column would be:
15
NA
0
Red
4
Logic
9
8
16
7
11
Is there any way to do this?
This is what I have tried:
library(stringr)
library(gtools)
roman <- str_extract(x$col, "([IVXivx]+)")
roman_to_int <- roman2int(roman)
x$col <- ifelse(!is.na(roman_to_int), roman_to_int, x$col)
However, this does not work: ordinary words that merely contain one of the letters I, V, or X are also treated as Roman numerals, so "Logic", for example, becomes 1. I want to avoid this.
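To see where the attempt goes wrong, it can help to print what the character class actually captures. The following is a minimal sketch that uses only `str_extract()` from the question; `samples` is an illustrative name, not part of the original code.

```r
library(stringr)

# A few of the observations from the question (illustrative subset)
samples <- c("Red", "Logic", "Big XVI", "WeirdVII", "ix. Sweet")

# [IVXivx]+ has no word boundaries, so it grabs the first run of those
# letters anywhere in the string, including inside ordinary words.
captured <- str_extract(samples, "[IVXivx]+")
captured
# returns NA, "i", "i", "i", "ix"
```

The lone "i" inside "Logic", "Big", and "Weird" is what ends up being converted to 1, which is why a pattern with word boundaries or a minimum length is needed.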
Answer 1
Score: 2
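# Runs of 2+ uppercase numeral letters anywhere, or a standalone lower- or upper-case numeral
pat <- "[IVXLCDM]{2,}|\\b[ivxlcdm]+\\b|\\b[IVXLCDM]+\\b"
# as.character() keeps the replacement a character vector, as stringr expects
str_replace_all(x$col, pat, function(m) as.character(gtools::roman2int(m)))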
[1] "15" "NA" "0" "Red" "4"
[6] "Logic" "9. Sweet" "8 - Apple" "Big 16" "Weird7"
[11] "11: Small"
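The output above replaces each numeral in place but keeps the surrounding words ("9. Sweet"), whereas the question asks for the integer alone ("9"). One possible way to get there, sketched below under the assumption that only the first numeral in each observation matters, is to extract the match with the same pattern and overwrite just the rows where something matched. The names `roman` and `has_roman` are illustrative.

```r
library(stringr)

x <- data.frame(col = c("15", "NA", "0", "Red", "iv", "Logic", "ix. Sweet",
                        "VIII - Apple", "Big XVI", "WeirdVII", "XI: Small"),
                stringsAsFactors = FALSE)

# Same pattern as in the answer above
pat <- "[IVXLCDM]{2,}|\\b[ivxlcdm]+\\b|\\b[IVXLCDM]+\\b"

# First numeral-looking token per observation, NA where nothing matches
roman <- str_extract(x$col, pat)
has_roman <- !is.na(roman)

# Overwrite only the matching rows; everything else is left untouched
x$col[has_roman] <- as.character(gtools::roman2int(roman[has_roman]))
x$col
```

Note that the pattern is a heuristic: a standalone word made up only of numeral letters (for instance "mix" or "DIM") would also be matched, so real data may need a stricter rule.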