Determining the functional length of a pattern in a "Like" comparison
Question
Is there a way to determine the functional length of a pattern in a "Like" comparison?
For example, "[A-Z]" has a functional length of 1, but Len("[A-Z]") gives the answer 5 (because the string itself is five characters long).
I think being able to determine the functional length of the pattern would enable a more flexible way of looking through the string.
I have working code but it relies on the string being searched having spaces, and the substring to be found having a space on either side.
This is the code I tried. It doesn't work because Len(Pattern) is longer than the text I'm trying to find. In my specific example the pattern is #####[A-Z][A-Z], to find (for example) 11111AA or 22222BB: five digits followed by two upper-case letters.
Function ExtractMatch(Source As String, Pattern As String) As String
' extracts a substring matching the given pattern
Dim i As Long, lenP As Long, Test As String
lenP = Len(Pattern)
For i = 1 To Len(Source) - lenP
Test = Mid(Source, i, lenP)
Debug.Print Test
If Test Like Pattern Then
ExtractMatch = Test
Exit For
End If
Next
End Function
For context, I am trying to extract a quote number from somewhere in a hand-typed entry. In the sample data and desired output, line 5 is currently not captured because there's a typo and no space before the quote number.
Answer 1
Score: 3
Assuming that there is no * in the pattern, we can get the length of the pattern using this function:
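Function getPatterLength(ByVal Pattern As String) As Long
    ' "*" matches any number of characters, so no fixed length exists
    If InStr(Pattern, "*") > 0 Then
        getPatterLength = -1
        Exit Function
    End If

    ' Split at each "[" so that every bracket group starts its own part
    Dim Parts() As String
    Parts = Split(Pattern, "[")

    Dim i As Long
    For i = 0 To UBound(Parts)
        ' Reduce "A-Z]..." to "]...": the whole group now counts as one character
        If InStr(Parts(i), "]") > 0 Then
            Parts(i) = Mid(Parts(i), InStr(Parts(i), "]"))
        End If
    Next

    getPatterLength = Len(Join(Parts, ""))
End Function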
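The Split-based function above rejects any pattern containing *, even a literal asterisk written as [*]. As a rough alternative sketch, the pattern can be walked token by token, counting one character for each literal, ?, # or bracket group. The name LikePatternLength and the choice to return -1 for an unclosed bracket are assumptions made for this illustration, not part of the original answer.

```vba
Option Explicit

' Hypothetical helper: counts how many characters a Like pattern matches.
' Returns -1 when the pattern has a "*" outside brackets or an unclosed "[".
Public Function LikePatternLength(ByVal Pattern As String) As Long
    Dim pos As Long, closePos As Long, total As Long
    pos = 1
    Do While pos <= Len(Pattern)
        Select Case Mid$(Pattern, pos, 1)
            Case "*"
                ' Variable-length wildcard: no fixed length
                LikePatternLength = -1
                Exit Function
            Case "["
                closePos = InStr(pos + 1, Pattern, "]")
                If closePos = 0 Then
                    LikePatternLength = -1
                    Exit Function
                End If
                ' "[]" matches a zero-length string; any other group matches one character
                If closePos > pos + 1 Then total = total + 1
                pos = closePos + 1
            Case Else
                ' Literal character, "?" or "#"
                total = total + 1
                pos = pos + 1
        End Select
    Loop
    LikePatternLength = total
End Function

Public Sub DemoLikePatternLength()
    Debug.Print LikePatternLength("#####[A-Z][A-Z]")  ' 7
    Debug.Print LikePatternLength("[*]x")             ' 2
    Debug.Print LikePatternLength("a*b")              ' -1
End Sub
```

Walking the pattern keeps the bracket handling explicit, which makes it easier to adjust if other edge cases turn up.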
It would be easier to add a parameter to the extract function for the pattern length.
Function ExtractMatch(Source As String, Pattern As String, PatterLength As Long) As String
    ' extracts a substring matching the given pattern
    Dim i As Long, Test As String
    ' + 1 so that a match at the very end of Source is also tested
    For i = 1 To Len(Source) - PatterLength + 1
        Test = Mid(Source, i, PatterLength)
        Debug.Print Test
        If Test Like Pattern Then
            ExtractMatch = Test
            Exit Function
        End If
    Next
End Function
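If working out the functional length up front is inconvenient, another option is to try every candidate length at each start position. This relies on the observation that, without *, a Like pattern can never match more characters than the pattern text itself contains. The sketch below is self-contained; the name ExtractMatchAnyLength and the sample strings are made up for illustration, and it assumes the module uses the default Option Compare Binary so that [A-Z] matches upper-case letters only.

```vba
Option Explicit

' Hypothetical helper: returns the first substring of Source that matches
' Pattern under Like, without knowing the pattern's functional length.
Public Function ExtractMatchAnyLength(ByVal Source As String, ByVal Pattern As String) As String
    Dim i As Long, n As Long, maxLen As Long
    For i = 1 To Len(Source)
        ' A candidate cannot run past the end of Source
        ' and need not be longer than the pattern text
        maxLen = Len(Source) - i + 1
        If maxLen > Len(Pattern) Then maxLen = Len(Pattern)
        For n = 1 To maxLen
            If Mid$(Source, i, n) Like Pattern Then
                ExtractMatchAnyLength = Mid$(Source, i, n)
                Exit Function
            End If
        Next n
    Next i
    ' No match: the function returns an empty string
End Function

Public Sub DemoExtractMatchAnyLength()
    ' No space before the quote number, as in the typo case from the question
    Debug.Print ExtractMatchAnyLength("Quote11111AA refers", "#####[A-Z][A-Z]")  ' 11111AA
    Debug.Print ExtractMatchAnyLength("see 22222BB", "#####[A-Z][A-Z]")          ' 22222BB
End Sub
```

This does more comparisons than the fixed-length loop, which should not matter for short hand-typed entries but may for long strings.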
Answer 2
Score: 3
Thought it would be fun (for some at least) to see if this can be done through FILTERXML():
Formula in B2:
=MAP(A2:A5,LAMBDA(x,FILTERXML("<t><s>"&TEXTJOIN("</s><s>",,MID(x,SEQUENCE(LEN(x)),7))&"</s></t>","//s[string-length(normalize-space())=7][substring(.,1,5)*0=0][translate(substring(.,6),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''][1]")))
The XPath predicates, which are ANDed together, deconstructed:
- [string-length(normalize-space())=7] - Test that the node's length equals 7 characters after normalization. This is important because of how the 3rd predicate works. In regex terms: ^.{7}$
- [substring(.,1,5)*0=0] - Test that the first 5 characters of the node are numeric. In regex terms: ^\d{5}
- [translate(substring(.,6),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''] - Assert that the last two characters are uppercase alpha chars. In regex terms: [A-Z]{2}$
- [1] - Retrieve the 1st node from the resulting filtered node-list.
Answer 3
Score: 2
You can use Regular Expressions to extract a submatch like below:
Public Sub Example()
    Debug.Print ExtractSubstring("Quote number 11111AA refers", "[0-9]{5}[A-Z]{2}")
End Sub

Public Function ExtractSubstring(ByVal InputString As String, ByVal Pattern As String) As String
    Dim RetVal As String
    Dim AllMatches As Object
    Dim RegEx As Object

    ' Late binding: no project reference is needed
    Set RegEx = CreateObject("vbscript.regexp")
    With RegEx
        .Pattern = Pattern
        .Global = True
        .IgnoreCase = False ' case sensitive; set to True to also match lower-case letters
    End With

    Set AllMatches = RegEx.Execute(InputString)
    If AllMatches.Count <> 0 Then
        RetVal = AllMatches.Item(0).Value ' first match only
    End If
    ExtractSubstring = RetVal
End Function
Or even use it directly as a UDF (User Defined Function):
=ExtractSubstring(A2,"[0-9]{5}[A-Z]{2}")
Note this will only return the first match if there are more matches with this pattern.
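If every match is wanted instead of only the first, the match collection can be looped over. The following is a minimal sketch along the same lines as the function above; the name ExtractAllSubstrings and the Delimiter parameter are assumptions added for illustration. It relies on the VBScript.RegExp object, so it assumes Windows.

```vba
Option Explicit

' Hypothetical variant: returns all matches, joined with Delimiter.
Public Function ExtractAllSubstrings(ByVal InputString As String, _
                                     ByVal Pattern As String, _
                                     Optional ByVal Delimiter As String = ", ") As String
    Dim RegEx As Object, AllMatches As Object
    Dim i As Long, Result As String

    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Pattern = Pattern
        .Global = True        ' required to find more than one match
        .IgnoreCase = False
    End With

    Set AllMatches = RegEx.Execute(InputString)
    For i = 0 To AllMatches.Count - 1
        If i > 0 Then Result = Result & Delimiter
        Result = Result & AllMatches.Item(i).Value
    Next i
    ExtractAllSubstrings = Result
End Function

Public Sub DemoExtractAllSubstrings()
    Debug.Print ExtractAllSubstrings("11111AA and 22222BB", "[0-9]{5}[A-Z]{2}")  ' 11111AA, 22222BB
End Sub
```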
Answer 4
Score: 0
Why don't you use regular expressions?
Public Sub test()
    Dim pattern As String
    pattern = "\d{5}[A-Z]{2}"
    Debug.Print regex_PatternMatch(pattern, "11111AA") 'will return true
    Debug.Print regex_PatternMatch(pattern, "111AA") 'will return false
End Sub

Public Function regex_PatternMatch(pattern As String, textToCheck As String) As Boolean
    ' Early binding: requires the project reference mentioned below
    Dim regEx As New RegExp
    With regEx
        .Pattern = pattern
        regex_PatternMatch = .Test(textToCheck)
    End With
End Function
You have to add a reference to "Microsoft VBScript Regular Expressions 5.5" (Tools > References in the VBA editor).
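If adding a reference is not practical, the same check can probably be done with late binding. The sketch below uses a made-up name, RegexIsMatch, and assumes Windows, where the VBScript.RegExp object is available. It also illustrates that Test looks for the pattern anywhere in the text unless the pattern is anchored with ^ and $.

```vba
Option Explicit

' Hypothetical late-bound variant: no project reference required.
Public Function RegexIsMatch(ByVal Pattern As String, ByVal TextToCheck As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = Pattern
    RegexIsMatch = re.Test(TextToCheck)
End Function

Public Sub DemoRegexIsMatch()
    ' Unanchored: the pattern may occur anywhere in the text
    Debug.Print RegexIsMatch("\d{5}[A-Z]{2}", "Quote11111AA")    ' True
    ' Anchored: the whole text must consist of the pattern
    Debug.Print RegexIsMatch("^\d{5}[A-Z]{2}$", "Quote11111AA")  ' False
    Debug.Print RegexIsMatch("^\d{5}[A-Z]{2}$", "11111AA")       ' True
End Sub
```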