Replace every character of the word after a delimiter using Python re

Question

I would like to replace every character of the word that follows - with *.

For example:

asd-wqe ffvrf    =>    asd-*** ffvrf

In TypeScript regex this can be done with the pattern (?<=-\w*)\w and the replacement *. However, Python's default regex engine requires lookbehinds to be of fixed width.

The best I can think of is to use

(?:(?<=-)|(?<=-\w)|(?<=-\w{2}))\w

and repeat the lookbehind some predetermined large number of times, but that seems neither sustainable nor elegant.
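To make the limitation concrete, here is a minimal sketch of that workaround. The helper name `fixed_width_pattern` and its `max_len` parameter are made up for illustration and are not part of the original post. It generates one fixed-width lookbehind per offset, so any word longer than `max_len` is only partially masked.

```python
import re

def fixed_width_pattern(max_len):
    """Build the workaround pattern (hypothetical helper, not from the post).

    Covers words of at most max_len characters after the hyphen by
    emitting one fixed-width lookbehind per possible offset.
    """
    lookbehinds = "|".join("(?<=-%s)" % (r"\w" * n) for n in range(max_len))
    return "(?:%s)" % lookbehinds + r"\w"

# The variable-width form is expected to be rejected by the standard re module.
try:
    re.compile(r"(?<=-\w*)\w")
except re.error as exc:
    print("rejected:", exc)

print(re.sub(fixed_width_pattern(8), "*", "asd-wqe ffvrf"))  # asd-*** ffvrf
print(re.sub(fixed_width_pattern(3), "*", "a-word"))         # a-***d
```

The second call shows why the approach does not scale: with only three lookbehinds, the fourth character of the word is left untouched.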

Is it possible to do this with the default re module using a more elegant pattern?

P.S. I'm aware that alternative regex engines supporting variable-length lookbehinds exist, but I would like to stick with the default one for now if possible.

Answer 1

Score: 2

I don't think you can do that with Python re, because you want to match a single character knowing that to its left is - followed by optional word characters, and that requires a variable-width lookbehind.

I would write it with a callback instead: match the whole word after -, then use the length of the match to build the replacement string of * characters.

import re

strings = [
    "asd-wqe ffvrf",
    "asd-ss sd",
    "a-word",
    "a-verylongword",
    "an-extremelyverylongword"
]
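# Match the whole run of word characters that directly follows a hyphen.
pattern = r"(?<=-)\w+"
for s in strings:
    # Replace each match with as many asterisks as it has characters.
    print(re.sub(pattern, lambda x: len(x.group()) * "*", s))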

Output

asd-*** ffvrf
asd-** sd
a-****
a-************
an-*********************

An alternative to a quantifier in a lookbehind assertion is the \G anchor combined with \K (neither of which is supported by Python re):

(?:-|\G(?!^))\K\w
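Since \G and \K are unavailable in re, one rough stand-in is to repeat a substitution until nothing changes, so that each pass continues where the previous one stopped. The sketch below is an assumption-laden illustration, not part of the original answer; the names `STEP` and `mask_after_hyphen` are made up. It also treats asterisks already present right after a hyphen as part of the masked run, which the callback version does not.

```python
import re

# A hyphen, any asterisks already written, then the next unmasked word character.
STEP = re.compile(r"(-\**)\w")

def mask_after_hyphen(text):
    """Mask one more character per pass until a pass makes no replacement."""
    count = 1
    while count:
        # subn returns the new string and the number of replacements made.
        text, count = STEP.subn(r"\g<1>*", text)
    return text

print(mask_after_hyphen("asd-wqe ffvrf"))  # asd-*** ffvrf
```

This needs one pass per masked character, so the single callback-based re.sub above is likely the better choice in practice.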


Answer 2

Score: 1

You can match all the word characters after - and pass a callback to re.sub that replaces the match with a string of asterisks of the same length.

import re

s = 'asd-wqe ffvrf'
res = re.sub(r'(?<=-)\w+', lambda m: '*' * len(m.group()), s)
print(res)  # asd-*** ffvrf
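If the delimiter or the mask character may vary, the same idea can be wrapped in a small helper. This is only a sketch: the function name `mask_after` and its parameters are hypothetical, and it assumes a non-empty literal delimiter, which keeps the lookbehind fixed-width.

```python
import re

def mask_after(text, delimiter="-", mask="*"):
    """Mask the run of word characters that directly follows each delimiter."""
    # re.escape keeps the delimiter literal, so the lookbehind stays fixed-width.
    pattern = r"(?<=%s)\w+" % re.escape(delimiter)
    return re.sub(pattern, lambda m: mask * len(m.group()), text)

print(mask_after("asd-wqe ffvrf"))              # asd-*** ffvrf
print(mask_after("key::value rest", "::", "#"))  # key::##### rest
```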

huangapple
  • Posted on April 10, 2023 at 23:35:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75978476.html