August 6, 2015, 04:56:00 · go

Write regex without negations

Question

In a previous post I asked for help rewriting a regex without negation.

Starting regex:

https?:\/\/(?:.(?!https?:\/\/))+$

Ended up with:

https?:[^:]*$

This works fine, but I've noticed that if my URL contains a : besides the one after http or https, it will not match.

Here is a string for which it does not work:

sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2

Note the :query2 at the end.

How can I modify the second regex listed here so that it also selects URLs that contain a :?

Expected output:

http://websites.com/path/subpath/cc:query2

I would also like to select everything up to the first occurrence of ?=param.

Input:
sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param

Output:

http://websites.com/path/subpath/cc:query2/text/

Answer 1

Score: 4

Unfortunately, Go's regexp package does not support lookarounds.
However, you can get the last link with a trick: greedily match all possible links and other characters, then capture the last link with a capturing group:

^(?:https?://|.)*(https?://\S+?)(?:\?=|$)

Because \S+? matches non-whitespace characters lazily, this also captures the link only up to the ?=.

var r = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2", -1)[0][1])
fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param", -1)[0][1])

Results:

"http://websites.com/path/subpath/:query2"
"http://websites.com/path/subpath/cc:query2/text/"
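
For reference, the snippet can be wrapped into a complete program. This is only a sketch: the helper name lastLink and the variable lastLinkRe are made up for illustration, and it uses FindStringSubmatch with a nil check so that an input without any link does not cause an index-out-of-range panic.

```go
package main

import (
	"fmt"
	"regexp"
)

// lastLinkRe uses the pattern from the answer: the greedy prefix
// (?:https?://|.)* swallows everything up to the last link, and the
// capturing group then lazily takes non-whitespace characters until
// either "?=" or the end of the input.
var lastLinkRe = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)

// lastLink is a hypothetical helper: it returns the captured link and
// false when the input contains no matching link.
func lastLink(s string) (string, bool) {
	m := lastLinkRe.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	inputs := []string{
		"sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2",
		"sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param",
		"no link in this string",
	}
	for _, in := range inputs {
		if link, ok := lastLink(in); ok {
			fmt.Printf("%q\n", link)
		} else {
			fmt.Println("no match")
		}
	}
}
```

Since only one match per input is possible with the ^ anchor, FindStringSubmatch should be sufficient here in place of FindAllStringSubmatch.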

If the last link may contain spaces, use .+? instead:

^(?:https?://|.)*(https?://.+?)(?:\?=|$)
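
To illustrate the difference between the two patterns, here is a small sketch that runs both on an input whose last link contains a space. The input string, the helper name capture, and the variable names are invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// noSpaceRe stops the captured link at the first whitespace character.
	noSpaceRe = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
	// withSpaceRe lets the captured link contain any character except a newline.
	withSpaceRe = regexp.MustCompile(`^(?:https?://|.)*(https?://.+?)(?:\?=|$)`)
)

// capture is a hypothetical helper returning the first capturing group,
// or false when the pattern does not match at all.
func capture(re *regexp.Regexp, s string) (string, bool) {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Assumed sample input: the last link contains a space before "?=".
	in := "see http://a.com/x y http://b.com/my page/?=q"

	if link, ok := capture(withSpaceRe, in); ok {
		fmt.Printf(".+?  -> %q\n", link)
	}
	// With \S+? neither link can reach "?=" or the end of the input
	// without crossing a space, so the pattern does not match here.
	if _, ok := capture(noSpaceRe, in); !ok {
		fmt.Println(`\S+? -> no match`)
	}
}
```

Note that . does not match a newline by default in Go, so this variant still assumes the whole input is on a single line.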
