Pattern to extract all matches of URLs within a string
Question
I have some URLs like http://app14.co.ad.local:90/ATT\me1111419.org within a string.
The number of dots (.), backslashes (\), and forward slashes (/) is not constant.
The URLs start with "http" and end with a dot followed by "org", "com", or "net".
String | Expected result |
---|---|
Read me http://app14.co.ad.local:90/ATT\me1111419.org blasss test wwww http://app14.co.ad.local:90/ATT\me1.111.com xxxxbb aaa<br>qwer fff http://app14.co.ad.local:90/ATT\bbb1419.net www | http://app14.co.ad.local:90/ATT\me1111419.org<br>http://app14.co.ad.local:90/ATT\me1.111.com<br>http://app14.co.ad.local:90/ATT\b.bb1.419.net |
The code below works correctly only if the string contains nothing but URLs and no other words.
My problem is with the pattern itself.
    Option Explicit
    Option Compare Text
    Function RegexMatches(strInput As String) As String
        Dim re As New RegExp
        Dim rMatch As Object, s As String, arrayMatches(), i As Long
        With re
            .Pattern = "(http:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?"
            .Global = True
            .MultiLine = True
            .IgnoreCase = True
        End With
        If re.test(strInput) Then
            For Each rMatch In re.Execute(strInput)
                ReDim Preserve arrayMatches(i)
                arrayMatches(i) = rMatch.Value
                i = i + 1
            Next
        End If
        RegexMatches = Join(arrayMatches, vbLf)
    End Function
Answer 1
Score: 1
You can use:
http:// # Match 'http://'
\S+ # followed by one or more non-whitespace characters
\.(?:net|com|org) # then either '.net', '.com' or '.org'.
Try it on regex101.com.
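For reference, here is a minimal sketch of how that pattern might be dropped into a function like the one in the question. The name ExtractUrls is hypothetical, and the sketch assumes the project already references "Microsoft VBScript Regular Expressions 5.5", as the question's use of New RegExp suggests.

```vba
Option Explicit

' Hypothetical helper: returns every URL that starts with "http://" and
' ends with ".net", ".com" or ".org", joined by line feeds.
' Assumes a reference to "Microsoft VBScript Regular Expressions 5.5".
Function ExtractUrls(strInput As String) As String
    Dim re As New RegExp
    Dim matches As MatchCollection
    Dim parts() As String
    Dim i As Long

    With re
        ' "http://", then a run of non-whitespace, then one of the suffixes
        .Pattern = "http://\S+\.(?:net|com|org)"
        .Global = True          ' find all matches, not just the first
        .IgnoreCase = True
    End With

    Set matches = re.Execute(strInput)

    ' No match: leave the return value as an empty string
    If matches.Count = 0 Then Exit Function

    ' Size the array once instead of ReDim Preserve inside the loop
    ReDim parts(matches.Count - 1)
    For i = 0 To matches.Count - 1
        parts(i) = matches(i).Value
    Next i

    ExtractUrls = Join(parts, vbLf)
End Function
```

Because \S+ is greedy, each match runs to the last ".net", ".com" or ".org" inside a run of non-whitespace characters, so this relies on the URLs being separated from the surrounding words by whitespace, as they are in the sample string. The pattern would also stop after ".com" in a longer word such as ".community"; appending \b to the pattern is one possible way to guard against that if such input is expected.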