Rewrite regex without negation
Question
I wrote this regex to extract links from some text files:
https?:\/\/(?:.(?!https?:\/\/))+$
Because I am using Go's regexp package, I can't use it: the pattern relies on a negative lookahead, (?!...), which that package does not support.
What I want is to select all the text from the last occurrence of http/https to the end of the string:
sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2
=> Output: http://websites.com/path/subpath/#query2
Can anyone help me find a solution? I've spent several hours trying different ways to reproduce the same result, with no success.
Answer 1
Score: 3
Try this regex:
https?:[^:]*$
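A minimal sketch of how this pattern could be used from Go, assuming the whole input is held in one string. The helper name extractLast is hypothetical. Because the pattern forbids any colon after the scheme, it appears to find nothing when the last link itself contains a colon (for example a port number), which the second call illustrates.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern from the answer: "http:" or "https:" followed by colon-free text
// running to the end of the input.
var lastLinkRe = regexp.MustCompile(`https?:[^:]*$`)

// extractLast is a hypothetical helper: it returns the matched link,
// or "" when the pattern does not match.
func extractLast(s string) string {
	return lastLinkRe.FindString(s)
}

func main() {
	sample := "sometextsometexhttp://websites.com/path/subpath/#query1" +
		"sometexthttp://websites.com/path/subpath/#query2"
	fmt.Println(extractLast(sample))

	// Assumed edge case: the port's colon prevents any match.
	fmt.Printf("%q\n", extractLast("xhttp://a.com/http://b.com:8080/p"))
}
```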
Answer 2
Score: 2
The lookaheads exist for a reason.
However, if you insist on a supposedly equivalent alternative, a general strategy you can use is:
(?!xyz)
is somewhat equivalent to:
$|[^x]|x(?:[^y]|$)|xy(?:[^z]|$)
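To make the strategy concrete, here is a small Go sketch that applies the expansion after a literal "a", emulating a(?!xyz). The leading "a", the variable name notXYZ, and the test strings are assumptions chosen for illustration. Note that, unlike a real lookahead, the expansion consumes the characters it inspects, which is one reason it is only roughly equivalent.

```go
package main

import (
	"fmt"
	"regexp"
)

// Emulates a(?!xyz): an "a" that is not immediately followed by "xyz".
// Each alternative covers one way the text after "a" can fail to be "xyz".
var notXYZ = regexp.MustCompile(`a(?:$|[^x]|x(?:[^y]|$)|xy(?:[^z]|$))`)

func main() {
	for _, s := range []string{"a", "ab", "ax", "axy", "axyw", "axyz"} {
		fmt.Printf("%-7q %v\n", s, notXYZ.MatchString(s))
	}
}
```

Only "axyz" should be rejected; every other input in the list matches.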
With that said, hopefully I didn't make any mistakes:
https?:\/\/(?:$|(?:[^h]|$)|(?:h(?:[^t]|$))|(?:ht(?:[^t]|$))|(?:htt(?:[^p]|$))|(?:http(?:[^s:]|$))|(?:https?(?:[^:]|$))|(?:https?:(?:[^\/]|$))|(?:https?:\/(?:[^\/]|$)))*$
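If a single pattern is not a hard requirement, a different approach avoids the expansion entirely: find every scheme prefix and slice the input from the last one. This is a sketch of that alternative, not the method from the answer above; the helper name lastURL is hypothetical, and it assumes the input fits in one string.

```go
package main

import (
	"fmt"
	"regexp"
)

// Matches only the scheme prefix; no lookahead is needed.
var schemeRe = regexp.MustCompile(`https?://`)

// lastURL is a hypothetical helper: it returns the text from the last
// "http://" or "https://" to the end of s, or "" if s contains neither.
func lastURL(s string) string {
	locs := schemeRe.FindAllStringIndex(s, -1)
	if len(locs) == 0 {
		return ""
	}
	// Each entry is a [start, end] pair; slice from the start of the last one.
	return s[locs[len(locs)-1][0]:]
}

func main() {
	sample := "sometextsometexhttp://websites.com/path/subpath/#query1" +
		"sometexthttp://websites.com/path/subpath/#query2"
	fmt.Println(lastURL(sample))
}
```

Because the slice starts at the last prefix, colons or repeated letters later in the link do not affect the result.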