Lookahead/lookbehind every in word boundary

Question

First time posting here. I've always been lurking around and able to find answers, but not this time.

I'm writing a regex with lookbehind and lookahead to find the position where a token switches from only letters to only digits, or from only digits to only letters.

Some examples:

| marks the pointer (the position I want to match):

AAA111 -> AAA|111
222BBB -> 222|BBB
AAA111AAA -> No pointer, because there are letters after the number sequence
AAA111AAA111 -> No pointer, because the second number sequence has letters behind it

I tried a lot to no avail: every time, I end up with two matches when there should be none, or no matches when there should be one.

If it were possible to express something like "if anything behind the pointer is a letter, then no match"
with a lookbehind, or likewise with a lookahead, I think I could do it. But I couldn't find an example or a way to do it.

EDIT: I'm using this regex in Java, inside a replaceAll() call, to add a space within a full sentence. Real examples:

'NEW CITY - TCM34759' = 'NEW CITY - TCM 34759'
'TOWER CBM25432' = 'TOWER CBM 25432'

Answer 1

Score: 0

This seems difficult to handle with a single regex pattern using lookarounds, so I suggest a two-step approach. First, find the matching tokens using the following regex pattern:

^(?:[A-Za-z]+[0-9]+|[0-9]+[A-Za-z]+)$

Then, use this pattern to find the position that separates letters from digits, or vice versa:

(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])
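A minimal Java sketch of this two-step idea, applied to a full sentence as in the question's EDIT. The class name `TwoStepSplitter`, the method name `addSpaces`, and the choice to split the sentence on single spaces are assumptions made for illustration; they are not part of the answer above.

```java
import java.util.regex.Pattern;

final class TwoStepSplitter {
    // Step 1: the whole token is letters followed by digits, or digits followed by letters.
    private static final Pattern WHOLE_TOKEN =
            Pattern.compile("^(?:[A-Za-z]+[0-9]+|[0-9]+[A-Za-z]+)$");

    // Step 2: zero-width position between a letter and a digit, in either order.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])");

    // Assumption: tokens are separated by single spaces.
    static String addSpaces(String sentence) {
        String[] tokens = sentence.split(" ", -1); // -1 keeps trailing empty tokens
        for (int i = 0; i < tokens.length; i++) {
            // Only touch tokens that pass step 1, then insert a space at the boundary.
            if (WHOLE_TOKEN.matcher(tokens[i]).matches()) {
                tokens[i] = BOUNDARY.matcher(tokens[i]).replaceAll(" ");
            }
        }
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("NEW CITY - TCM34759")); // NEW CITY - TCM 34759
        System.out.println(addSpaces("AAA111AAA"));           // AAA111AAA (unchanged)
    }
}
```

Because step 1 rejects tokens such as AAA111AAA before step 2 runs, the boundary pattern itself can stay simple.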

Answer 2

Score: 0

The Java regex engine allows lookbehinds of limited variable length. With them, you can check back to the start of the word from the replacement position:

(?=\d+\b)(?<=\b[A-Z]{1,100})|(?=[A-Z]+\b)(?<=\b\d{1,100})
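As a rough sketch, the pattern can be passed straight to `String.replaceAll` in Java, with every backslash doubled inside the string literal. The class name `LookbehindSplitter` and the method name `addSpaces` are hypothetical. Note that the pattern as written only covers uppercase letters and runs of up to 100 characters.

```java
final class LookbehindSplitter {
    // Same pattern as above, escaped for a Java string literal.
    private static final String BOUNDARY =
            "(?=\\d+\\b)(?<=\\b[A-Z]{1,100})|(?=[A-Z]+\\b)(?<=\\b\\d{1,100})";

    static String addSpaces(String sentence) {
        // The match is zero-width, so replaceAll only inserts a space; no characters are consumed.
        return sentence.replaceAll(BOUNDARY, " ");
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("NEW CITY - TCM34759")); // NEW CITY - TCM 34759
        System.out.println(addSpaces("AAA111AAA111"));        // AAA111AAA111 (unchanged)
    }
}
```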

Answer 3

Score: 0

This is not the kind of problem that calls for assertions.
If you want to use assertions as an exercise, feel free, but they do slow down the search.

The easiest way is to match both halves inside an alternation, where the whole block is surrounded by word boundaries.

The 4 groups are then simply concatenated in order to form a straightforward substitution (the groups from the alternative that did not match are empty).

Find (?i)\b(?:([a-z]+)(\d+)|(\d+)([a-z]+))\b
Replace $1$3 $2$4

https://regex101.com/r/2X9pyr/1

 (?i)
 \b 
 (?:
    ( [a-z]+ )         # (1)
    ( \d+ )            # (2)
  | 
    ( \d+ )            # (3)
    ( [a-z]+ )         # (4)
 )
 \b
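In Java, this find/replace pair could be applied roughly as follows. The class name `GroupSplitter` and the method name `addSpaces` are hypothetical. The sketch relies on Java's `replaceAll` substituting nothing for a group that did not participate in the match, so `$1$3 $2$4` yields the left half, a space, and the right half for either alternative.

```java
final class GroupSplitter {
    // Groups 1 and 2 capture letters then digits; groups 3 and 4 capture digits then letters.
    private static final String FIND =
            "(?i)\\b(?:([a-z]+)(\\d+)|(\\d+)([a-z]+))\\b";

    // Only one alternative matches at a time, so either $1/$2 or $3/$4 is empty.
    private static final String REPLACE = "$1$3 $2$4";

    static String addSpaces(String sentence) {
        return sentence.replaceAll(FIND, REPLACE);
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("TOWER CBM25432")); // TOWER CBM 25432
        System.out.println(addSpaces("AAA111AAA"));      // AAA111AAA (unchanged)
    }
}
```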

huangapple
  • Posted on March 20, 2023, 23:18:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75792109.html