Rewrite regex without negation

Question

I wrote this regex to help me extract links from some text files:

https?:\/\/(?:.(?!https?:\/\/))+$

Because I am using Go's regexp package, I can't use it: the pattern relies on a negative lookahead, (?!...), which that package does not support.

What I want is to select all the text from the last occurrence of http/https to the end of the input.

sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2

=> Output: http://websites.com/path/subpath/#query2

Can anyone help me with a solution? I've spent several hours trying different ways to reproduce the same result, with no success.

Answer 1

Score: 3

Try this regex:

https?:[^:]*$

Regex live here.
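
To see how this pattern behaves in Go, here is a minimal sketch. The helper name tailWithoutColon and the second input (a URL with a port) are made up for illustration; only the pattern and the first sample string come from the thread. Because [^:]* stops at any colon, the pattern only works when nothing after the final scheme contains a ":".

```go
package main

import (
	"fmt"
	"regexp"
)

// The pattern from this answer, unchanged. It contains no lookahead,
// so Go's regexp package compiles it.
var lastSchemeNoColon = regexp.MustCompile(`https?:[^:]*$`)

// tailWithoutColon is a hypothetical helper: it returns the match,
// or "" when the pattern does not match.
func tailWithoutColon(s string) string {
	return lastSchemeNoColon.FindString(s)
}

func main() {
	sample := "sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2"
	fmt.Println(tailWithoutColon(sample)) // http://websites.com/path/subpath/#query2

	// Assumed extra input: the last URL has a port, so a ":" follows the scheme.
	fmt.Printf("%q\n", tailWithoutColon("ahttp://a.com/1bhttp://host:8080/x")) // ""
}
```

The second call returns an empty string, so this answer fits the sample in the question but probably not URLs that carry a port or any other colon.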

Answer 2

Score: 2

The lookaheads exist for a reason.

However, if you insist on a supposedly equivalent alternative, a general strategy you can use is:

(?!xyz)

is somewhat equivalent to:

$|[^x]|x(?:[^y]|$)|xy(?:[^z]|$)

With that said, and hoping I didn't make any mistakes, the pattern from the question becomes:

https?:\/\/(?:$|(?:[^h]|$)|(?:h(?:[^t]|$))|(?:ht(?:[^t]|$))|(?:htt(?:[^p]|$))|(?:http(?:[^s:]|$))|(?:https?(?:[^:]|$))|(?:https?:(?:[^\/]|$))|(?:https?:\/(?:[^\/]|$)))*$
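
If a single pattern is not a hard requirement, the same goal can be reached in Go without emulating the lookahead at all. The sketch below is not this answer's method: it finds every occurrence of the scheme and slices the input from the last one. The names schemeRe and lastURL are made up for illustration, and the behavior only roughly matches the original regex (for example, it also returns a bare trailing "http://").

```go
package main

import (
	"fmt"
	"regexp"
)

// schemeRe matches just the scheme prefix; no lookahead is needed.
var schemeRe = regexp.MustCompile(`https?://`)

// lastURL is a hypothetical helper: it returns everything from the last
// "http://" or "https://" to the end of s, or "" if there is none.
func lastURL(s string) string {
	locs := schemeRe.FindAllStringIndex(s, -1) // nil when nothing matches
	if len(locs) == 0 {
		return ""
	}
	start := locs[len(locs)-1][0] // start offset of the last match
	return s[start:]
}

func main() {
	sample := "sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2"
	fmt.Println(lastURL(sample)) // http://websites.com/path/subpath/#query2
}
```

Taking the last match keeps the regex trivial and moves the "last occurrence" logic into ordinary code, which is usually easier to read and test than a long alternation.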

huangapple
  • Posted on August 5, 2015, 20:40:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/31832852.html