April 19, 2023 22:34:29

Conditional replacement of optional group with gsub

Question

A user asked me how to do this in https://stackoverflow.com/questions/76054997/how-to-italicize-select-words-in-a-ggplot-legend/76055093?noredirect=1#comment134133550_76055093, and I'm not happy with my workaround.

The aim is to add enclosing * around all character vector elements except for given strings. Let's assume for this example that those would always be found at the beginning. I am using an optional capture for the first group and then include the second group with the asterisks. The problem arises when the searched word stands alone and there is no following string.

I've included the desired output and some attempts in the code.

v <- head(rownames(mtcars))
## does also not work with (.*)?, nor with (.+) nor (.+)?
gsub("(Hornet |Valiant)?(.*)", "\\1*\\2*", v)
#> [1] "*Mazda RX4*"         "*Mazda RX4 Wag*"     "*Datsun 710*"
#> [4] "Hornet *4 Drive*"    "Hornet *Sportabout*" "Valiant**"
## desired output
ifelse(grepl("Valiant", v), v, gsub("(Hornet )?(.*)", "\\1*\\2*", v))
#> [1] "*Mazda RX4*"         "*Mazda RX4 Wag*"     "*Datsun 710*"
#> [4] "Hornet *4 Drive*"    "Hornet *Sportabout*" "Valiant"
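
The stray ** on the last element seems to follow from how the two groups split the input: the optional first group consumes all of "Valiant", and (.*) is still allowed to match the empty string, so the replacement wraps nothing in asterisks. A minimal sketch of that, using base R's regexec and regmatches to inspect the capture groups (the names pattern, x and m are only illustrative):

```r
pattern <- "(Hornet |Valiant)?(.*)"
x <- c("Hornet 4 Drive", "Valiant")

# Each list element holds: full match, group 1, group 2.
m <- regmatches(x, regexec(pattern, x))

m[[1]]  # group 2 carries the text after "Hornet "
m[[2]]  # group 2 is the empty string, hence "Valiant" followed by "**"
```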

Answer 1

Score: 3

Neither of the regex engines that can be used with gsub (TRE by default, PCRE with perl=TRUE) supports a conditional replacement pattern.

You can use

v <- c("Mazda RX4","Mazda RX4 Wag","Datsun 710","Hornet 4 Drive","Hornet Sportabout","Valiant")
gsub("^(?:Hornet|Valiant)\\s*(*SKIP)(*F)|(.+)", "*\\1*", v, perl=TRUE)

See the regex demo and the R demo online.

Output:

[1] "*Mazda RX4*"         "*Mazda RX4 Wag*"     "*Datsun 710*"
[4] "Hornet *4 Drive*"    "Hornet *Sportabout*" "Valiant"

To make sure the first words are matched as whole words, add \b: "^(?:Hornet|Valiant)\\b\\s*(*SKIP)(*F)|(.+)".

Make sure to use perl=TRUE: (*SKIP) and (*F) are PCRE backtracking control verbs.

Regex details:

^(?:Hornet|Valiant)\s*(*SKIP)(*F) - matches Hornet or Valiant at the start of the string, then zero or more whitespace characters; once that is matched, the engine discards the matched text, fails the match, and proceeds to look for the next match from the failure position
| - or
(.+) - matches one or more characters other than line break characters, as many as possible (the rest of the string).
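
To illustrate what the \b variant changes, here is a small sketch comparing the two patterns. The element "Hornets Nest" is made up for this example (it is not a row name in mtcars), and the names v2, loose and strict are hypothetical.

```r
# "Hornets Nest" is an invented element used only to show the difference.
v2 <- c("Hornet 4 Drive", "Hornets Nest", "Valiant")

loose  <- "^(?:Hornet|Valiant)\\s*(*SKIP)(*F)|(.+)"
strict <- "^(?:Hornet|Valiant)\\b\\s*(*SKIP)(*F)|(.+)"

# Without \b, the prefix "Hornet" of "Hornets" is skipped and only the
# remainder is wrapped.
res_loose <- gsub(loose, "*\\1*", v2, perl = TRUE)

# With \b, "Hornets" is not treated as the protected word, so the whole
# element is wrapped.
res_strict <- gsub(strict, "*\\1*", v2, perl = TRUE)

print(res_loose)
#> [1] "Hornet *4 Drive*" "Hornet*s Nest*"   "Valiant"
print(res_strict)
#> [1] "Hornet *4 Drive*" "*Hornets Nest*"   "Valiant"
```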

Answer 2

Score: 3

One more solution is to use a possessive quantifier for the first group and one-or-more inside the second:

^(Hornet ?|Valiant ?)?+(.+)

This way, if Hornet or Valiant is matched at the beginning of the string, no backtracking into that group will occur, and the string will be matched (and subsequently substituted) only if there is something after it.

Demo here.
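
The answer gives only the pattern, so the following is a sketch of how it might be plugged into gsub. It assumes perl=TRUE (possessive quantifiers are a PCRE feature) and reuses the "\\1*\\2*" replacement from the question; the name res is illustrative.

```r
v <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710",
       "Hornet 4 Drive", "Hornet Sportabout", "Valiant")

# ?+ makes the optional first group possessive: once it has consumed
# "Valiant", the engine cannot give that text back to (.+), so the bare
# word does not match at all and is left unchanged.
res <- gsub("^(Hornet ?|Valiant ?)?+(.+)", "\\1*\\2*", v, perl = TRUE)
print(res)
#> [1] "*Mazda RX4*"         "*Mazda RX4 Wag*"     "*Datsun 710*"
#> [4] "Hornet *4 Drive*"    "Hornet *Sportabout*" "Valiant"
```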

Answer 3

Score: 2

A less in-depth and more hacky answer, but an easier one to understand.

gsub performs a substitution only where the string matches the provided regex. So to stop the * from appearing, you can make the regex fail to match that input.

For the example provided in the question, you can do this with a negative lookahead. The result would look like this:

^(?!(?:Hornet|Valiant)$)(Hornet|Valiant)?(.*)$

Demo here.
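
A sketch of the lookahead idea in R, under two assumptions that are not in the answer: perl=TRUE is passed (lookaheads need PCRE), and the trailing space is moved into the first group so that the question's "\\1*\\2*" replacement yields the desired output. With the pattern exactly as written above, the space would end up in the second group, inside the asterisks.

```r
v <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710",
       "Hornet 4 Drive", "Hornet Sportabout", "Valiant")

# The negative lookahead rejects strings that consist of the bare word only,
# so "Valiant" never matches and is returned untouched.
# Adapted pattern: the space after the word sits inside group 1.
pattern <- "^(?!(?:Hornet|Valiant)$)(Hornet |Valiant )?(.*)$"

res <- gsub(pattern, "\\1*\\2*", v, perl = TRUE)
print(res)
#> [1] "*Mazda RX4*"         "*Mazda RX4 Wag*"     "*Datsun 710*"
#> [4] "Hornet *4 Drive*"    "Hornet *Sportabout*" "Valiant"
```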
