2023-07-10 22:01:43

Using strsplit() to split up a numeric string and replace part with characters

Question

I have a column of strings that I want to split so that the last two numbers in each string are replaced with characters. For example, the string "1-1-2-2" would become "1-1-B-B". Below is a snippet of my data and my attempt so far.

```
> df
num
1-1-26-2
1-2-2-4
1-2-4-5
1-3-25-1
```

I have tried splitting the `num` column with `strsplit(num, '-')`, but I am unsure how to replace the last two numbers with characters using the replacement data frame below.

```
> replacement_df
character    num
A            1
B            2
D            4
E            5
Y            25
Z            26
```

Answer 1

Score: 2

Something like this? It uses R's built-in `LETTERS` vector, which agrees with the `replacement_df` rows shown in the question (1 is A, 2 is B, ..., 26 is Z).

```r
# Replace the last n "-"-separated numbers of a single string with letters.
replace_nums <- function(x, n = 2) {
    x_split <- unlist(strsplit(x, "-"))
    x_tail <- tail(x_split, n)
    paste(c(
        head(x_split, -n),
        LETTERS[as.integer(x_tail)]
    ), collapse = "-")
}
x <- c("1-1-2-2")
replace_nums(x)
# [1] "1-1-B-B"
```

Or for a vectorised version:

```r
# Same idea applied to every element of a character vector.
# The \(x) lambda syntax needs R >= 4.1.
replace_nums_df <- function(x, n = 2) {
    x_split <- strsplit(x, "-")
    x_tail <- lapply(x_split, \(x) tail(x, n))
    # unlist() turns Map()'s list into a character vector,
    # so the new column is not a list column
    unlist(Map(\(split_str, tail_str) {
        paste(c(
            head(split_str, -n),
            LETTERS[as.integer(tail_str)]
        ), collapse = "-")
    }, x_split, x_tail))
}
df$replaced <- replace_nums_df(df$num)
df
#        num replaced
# 1 1-1-26-2  1-1-Z-B
# 2  1-2-2-4  1-2-B-D
# 3  1-2-4-5  1-2-D-E
# 4 1-3-25-1  1-3-Y-A
```
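
The code above indexes into `LETTERS`, which happens to match the rows of `replacement_df` shown in the question. If the real lookup table is arbitrary, a lookup with `match()` may be closer to what was asked. Here is a minimal base R sketch, assuming the two data frames from the question. `replace_with_lookup` is a hypothetical helper name, and leaving numbers without a lookup entry unchanged is my own choice, not something the question specifies.

```r
# Data from the question (only the rows that were shown)
df <- data.frame(
  num = c("1-1-26-2", "1-2-2-4", "1-2-4-5", "1-3-25-1"),
  stringsAsFactors = FALSE
)
replacement_df <- data.frame(
  character = c("A", "B", "D", "E", "Y", "Z"),
  num = c(1, 2, 4, 5, 25, 26),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace the last n fields of each string
# using the lookup table instead of LETTERS
replace_with_lookup <- function(x, lookup, n = 2) {
  vapply(strsplit(x, "-", fixed = TRUE), function(parts) {
    idx <- tail(seq_along(parts), n)            # positions of the last n fields
    hit <- match(as.integer(parts[idx]), lookup$num)
    # assumption: keep the original number when the lookup has no entry
    parts[idx] <- ifelse(is.na(hit), parts[idx], lookup$character[hit])
    paste(parts, collapse = "-")
  }, character(1), USE.NAMES = FALSE)
}

df$replaced <- replace_with_lookup(df$num, replacement_df)
df$replaced
# [1] "1-1-Z-B" "1-2-B-D" "1-2-D-E" "1-3-Y-A"
```

Because the lookup goes through `match()`, the table does not need to be sorted or to follow the alphabet.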

Answer 2

Score: 2

1. `stringr` solution

Supply a custom function to `str_replace_all()` to replace the match of the last 2 numbers.

```r
library(dplyr)
library(stringr)
df %>%
  mutate(num_new = str_replace_all(num, "\\d+-\\d+$", \(m) {
    # vectorised, so it works whether it receives one match or several
    vapply(str_split(m, "-"),
           \(p) str_c(LETTERS[as.integer(p)], collapse = "-"),
           character(1))
  }))
```

2. `tidyr` solution

`separate_wider_regex()` + `unite()`

```r
library(dplyr)
library(tidyr)
df %>%
  separate_wider_regex(
    num,
    # col1 greedily takes everything before the last two numbers
    patterns = c(col1 = ".+", "-", col2 = "\\d+", "-", col3 = "\\d+"),
    cols_remove = FALSE
  ) %>%
  mutate(across(col2:col3, ~ LETTERS[as.integer(.x)])) %>%
  unite(num_new, col1:col3, sep = "-")
```

Output

```r
# # A tibble: 4 × 2
#   num      num_new
#   <chr>    <chr>  
# 1 1-1-26-2 1-1-Z-B
# 2 1-2-2-4  1-2-B-D
# 3 1-2-4-5  1-2-D-E
# 4 1-3-25-1 1-3-Y-A
```

In the more general case, the strings in the column do not all contain the same number of numbers:

```r
df <- data.frame(num = c("1-2-3", "1-2-3-4", "1-2-3-4-5"))
```

Both solutions above handle this:

```r
#         num   num_new
# 1     1-2-3     1-B-C
# 2   1-2-3-4   1-2-C-D
# 3 1-2-3-4-5 1-2-3-D-E
```
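
The same regex idea can also be sketched in base R, in case you would rather avoid the extra packages. `regexpr()` locates the trailing number-number part and `regmatches<-` writes the converted text back. `replace_tail_regex` is a hypothetical name, not something from the answer, and the sketch still assumes that numbers map onto `LETTERS`.

```r
# Hypothetical helper: base R counterpart of the regex approach
replace_tail_regex <- function(x) {
  # locate the trailing "number-number" in each string (-1 if absent)
  m <- regexpr("[0-9]+-[0-9]+$", x)
  tails <- regmatches(x, m)          # matched substrings only
  if (length(tails) == 0) return(x)  # nothing to replace
  # convert each matched "a-b" into letters
  new_tails <- vapply(strsplit(tails, "-", fixed = TRUE), function(p) {
    paste(LETTERS[as.integer(p)], collapse = "-")
  }, character(1), USE.NAMES = FALSE)
  # strings without a match are left untouched
  regmatches(x, m) <- new_tails
  x
}

replace_tail_regex(c("1-2-3", "1-2-3-4", "1-2-3-4-5"))
# [1] "1-B-C"     "1-2-C-D"   "1-2-3-D-E"
```

Anchoring the pattern with `$` is what makes it independent of how many numbers precede the last two.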

Answer 3

Score: 2

Alternatively, please try the code below, where I assume that `replacement_df` is the same as `LETTERS`.

Here I used the `separate()` and `unite()` functions.

```r
library(tidyverse)
# maximum number of "-"-separated fields in any string
len <- max(lengths(strsplit(df$num, '-')))
# create the variable names
nam <- paste0('l', seq_len(len))
# select the last 2 names
nam2 <- nam[(len - 1):len]
# fill = 'left' pads shorter strings with NA on the left, so the last
# two numbers always land in the last two columns
df %>%
  separate(num, into = nam, sep = '-', remove = FALSE, fill = 'left') %>%
  mutate(across(all_of(nam2), ~ LETTERS[as.numeric(.x)])) %>%
  unite(num_new, all_of(nam), sep = '-', na.rm = TRUE)
```

<sup>Created on 2023-07-10 with reprex v2.0.2</sup>

```
        num   num_new
1     1-2-3     1-B-C
2   1-2-3-4   1-2-C-D
3 1-2-3-4-5 1-2-3-D-E
```
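
One caveat with the assumption that `replacement_df` matches `LETTERS`: indexing `LETTERS` with a number above 26 gives `NA`, and an index of 0 is silently dropped, so a field could disappear from the result. A small guard, sketched here with a hypothetical helper `to_letter`, leaves anything outside 1 to 26 unchanged. Whether that is the right fallback depends on your data.

```r
# The pitfall: out-of-range indices do not fail loudly
LETTERS[27]           # NA
LETTERS[c(0L, 2L)]    # "B" only, the 0 is dropped

# Hypothetical helper: map "1".."26" to "A".."Z", keep everything else as is
to_letter <- function(x) {
  n <- suppressWarnings(as.integer(x))  # non-numeric input becomes NA
  ok <- !is.na(n) & n >= 1L & n <= 26L
  out <- as.character(x)
  out[ok] <- LETTERS[n[ok]]
  out
}

to_letter(c("1", "26", "0", "27", "x"))
# [1] "A"  "Z"  "0"  "27" "x"
```

`to_letter()` can stand in for `LETTERS[as.numeric(.x)]` inside `across()`, since it returns a character vector of the same length as its input.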
