Using negative lookahead in RegExp to correctly match image URLs wrapped in [img] tags?

Question

Negative lookahead to match itself in RegExp

I want to use a RegExp to match an image URL. There are two cases:

In the first case, it is a normal image URL that ends with .png and may contain a query string. Here the RegExp should match the URL in its entirety.

https://test.com/cat.png?pkey=abcd

In the second case, the URL is wrapped in an [img] tag, and the RegExp is expected not to match anything.

I have written this regular expression, which works well for the first case:

(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?!\[)

However, it does not work for the second case: it still matches the URL up to its penultimate character. How can I modify my RegExp to achieve my goal?

Here's the RegExp link: https://regexr.com/7eohf
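The behaviour described above can be reproduced with a short script. This is only a sketch: it uses Python's re module rather than a JavaScript RegExp (an assumption made for illustration; the pattern itself is unchanged), and the exact [img]...[/img] form of the wrapped URL is assumed from the title.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?!\[)")

plain = "https://test.com/cat.png?pkey=abcd"
# Assumed form of the wrapped URL (hypothetical sample input).
wrapped = "[img]https://test.com/cat.png?pkey=abcd[/img]"


def first_match(pattern, text):
    """Return the first match as a string, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_match(ORIGINAL, plain))    # the whole URL
print(first_match(ORIGINAL, wrapped))  # the URL minus its last character
```

The second result comes from backtracking: when the lookahead sees `[` after the full query string, the `[\w&=_-]*` quantifier gives back one character, so the lookahead is tested in front of `d` instead and succeeds.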

Answer 1

Score: 3

Try moving the lookahead forward so that it comes immediately after .png:

(https:\/\/.*?\.png(?!\S*\[)(?:\?[\w&=_-]*)?)

Here \S matches any non-whitespace character (you can replace it with a different class if you need to).
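A quick way to check this pattern is sketched below. It assumes Python's re module instead of a JavaScript RegExp, and the sample strings (including the [img]...[/img] wrapper) are made up for illustration.

```python
import re

# First pattern from the answer: the lookahead sits right after ".png".
EARLY_LOOKAHEAD = re.compile(r"(https:\/\/.*?\.png(?!\S*\[)(?:\?[\w&=_-]*)?)")


def first_match(text):
    """Return the first matched URL in text, or None if nothing matches."""
    m = EARLY_LOOKAHEAD.search(text)
    return m.group(0) if m else None


# Hypothetical sample inputs; the wrapped form is an assumption.
plain = "https://test.com/cat.png?pkey=abcd"
wrapped = "[img]https://test.com/cat.png?pkey=abcd[/img]"
in_sentence = "see https://test.com/cat.png?pkey=abcd for details"

for text in (plain, wrapped, in_sentence):
    print(repr(text), "->", first_match(text))
```

Because the lookahead is checked before the query string is consumed, backtracking inside the query string can no longer sidestep it. Note that `\S*\[` rejects the URL whenever a `[` appears anywhere before the next whitespace, not only when it opens a closing [/img] tag.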


Alternatively, you can require that the match is not followed by any character from [?\w&=_-], nor by [, so that the engine cannot backtrack to a shorter match:

(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?![?\w&=_-]|\[)
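The same kind of check can be run against this second pattern. Again this is a sketch, assuming Python's re module and hypothetical sample inputs rather than the asker's real data.

```python
import re

# Second pattern from the answer: the lookahead stays at the end, but it
# also forbids any character that could continue the URL or query string.
TRAILING_LOOKAHEAD = re.compile(
    r"(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?![?\w&=_-]|\[)"
)


def first_match(text):
    """Return the first matched URL in text, or None if nothing matches."""
    m = TRAILING_LOOKAHEAD.search(text)
    return m.group(0) if m else None


# Hypothetical sample inputs; the wrapped forms are assumptions.
plain = "https://test.com/cat.png?pkey=abcd"
wrapped = "[img]https://test.com/cat.png?pkey=abcd[/img]"
no_query_wrapped = "[img]https://test.com/cat.png[/img]"

for text in (plain, wrapped, no_query_wrapped):
    print(repr(text), "->", first_match(text))
```

Every shorter candidate that backtracking could produce ends just before a query-string character or the `?` itself, and all of those are excluded by the lookahead, so the wrapped URLs yield no match at all.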