Determining the functional length of a pattern in a "Like" comparison
Question
Is there a way to determine the functional length of a pattern in a "Like" comparison?
For example, "[A-Z]" has a functional length of 1, but Len("[A-Z]") gives the answer 5 (because the string itself is five characters long).
I think being able to determine the functional length of the pattern would enable a more flexible way of looking through the string.
I have working code, but it relies on the string being searched containing spaces, and on the substring to be found having a space on either side.
This is the code I tried. It doesn't work because Len(Pattern) is longer than the text the pattern actually matches. In my specific example the pattern is #####[A-Z][A-Z], which should find (for example) 11111AA or 22222BB: five digits followed by two uppercase letters.
Function ExtractMatch(Source As String, Pattern As String) As String
    ' extracts a substring matching the given pattern
    Dim i As Long, lenP As Long, Test As String
    lenP = Len(Pattern)
    For i = 1 To Len(Source) - lenP
        Test = Mid(Source, i, lenP)
        Debug.Print Test
        If Test Like Pattern Then
            ExtractMatch = Test
            Exit For
        End If
    Next
End Function
For context I am trying to extract a quote number from somewhere in a hand-typed entry. Sample data and desired output - line 5 is currently not captured as there's a typo, and no space before the quote number.
Answer 1
Score: 3
Assuming the pattern contains no *, we can get the functional length of the pattern with this function:
Function getPatterLength(ByVal Pattern As String) As Long
    ' A "*" matches any number of characters, so the length is undefined
    If InStr(Pattern, "*") > 0 Then
        getPatterLength = -1
        Exit Function
    End If
    Dim Parts() As String
    Parts = Split(Pattern, "[")
    Dim i As Long
    For i = 0 To UBound(Parts)
        ' Reduce each bracket group to its closing "]" so it counts as one character
        If InStr(Parts(i), "]") > 0 Then
            Parts(i) = Mid(Parts(i), InStr(Parts(i), "]"))
        End If
    Next
    getPatterLength = Len(Join(Parts, ""))
End Function
It is then easiest to pass the pattern length to the extract function as an extra parameter:
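Function ExtractMatch(Source As String, Pattern As String, PatterLength As Long) As String
    ' extracts the first substring of Source matching the given pattern
    Dim i As Long, Test As String
    ' A length below 1 (e.g. -1 for a pattern containing "*") cannot be searched this way
    If PatterLength < 1 Then Exit Function
    ' + 1 so that a match at the very end of Source is also tested
    For i = 1 To Len(Source) - PatterLength + 1
        Test = Mid(Source, i, PatterLength)
        Debug.Print Test
        If Test Like Pattern Then
            ExtractMatch = Test
            Exit Function
        End If
    Next
End Function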
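As a rough, self-contained sketch of the same idea, the functional length can also be counted by walking the pattern one character at a time. The names LikePatternLength and ExtractFirstLike are hypothetical and not part of the answer above. The sketch assumes the default Option Compare Binary, and it returns -1 both for a "*" outside brackets and for an unclosed "[", which is a choice made here for illustration only.

```vba
Option Explicit

' Hypothetical helper: number of characters a Like pattern consumes.
' Returns -1 if the pattern contains "*" outside a bracket group,
' or an unclosed "[" (assumption: treat both as "length unknown").
Function LikePatternLength(ByVal Pattern As String) As Long
    Dim i As Long, n As Long, closePos As Long
    i = 1
    Do While i <= Len(Pattern)
        Select Case Mid$(Pattern, i, 1)
            Case "*"
                LikePatternLength = -1
                Exit Function
            Case "["
                closePos = InStr(i + 1, Pattern, "]")
                If closePos = 0 Then
                    LikePatternLength = -1
                    Exit Function
                End If
                ' An empty group "[]" matches nothing; any other group matches one character
                If closePos > i + 1 Then n = n + 1
                i = closePos
            Case Else
                ' "?", "#" and literal characters each match exactly one character
                n = n + 1
        End Select
        i = i + 1
    Loop
    LikePatternLength = n
End Function

' Hypothetical helper: first substring of Source that matches Pattern,
' or an empty string if there is none.
Function ExtractFirstLike(ByVal Source As String, ByVal Pattern As String) As String
    Dim i As Long, n As Long
    n = LikePatternLength(Pattern)
    If n < 1 Then Exit Function
    For i = 1 To Len(Source) - n + 1
        If Mid$(Source, i, n) Like Pattern Then
            ExtractFirstLike = Mid$(Source, i, n)
            Exit Function
        End If
    Next i
End Function
```

With this sketch, LikePatternLength("#####[A-Z][A-Z]") evaluates to 7, and ExtractFirstLike("Quote11111AA refers", "#####[A-Z][A-Z]") returns 11111AA even though there is no space before the quote number.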
Answer 2
Score: 3
I thought it would be fun (for some at least) to see whether this can be done through FILTERXML():
Formula in B2:
=MAP(A2:A5,LAMBDA(x,FILTERXML("<t><s>"&TEXTJOIN("</s><s>",,MID(x,SEQUENCE(LEN(x)),7))&"</s></t>","//s[string-length(normalize-space())=7][substring(.,1,5)*0=0][translate(substring(.,6),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''][1]")))
The XPath consists of four predicates, which are ANDed together:
- [string-length(normalize-space())=7] - We need to test that the node's length equals 7 characters (after normalization; this is important because of how the 3rd predicate works). In regex terms: ^.{7}$
- [substring(.,1,5)*0=0] - Test that the first 5 characters of the node are numeric. In regex terms: ^\d{5}
- [translate(substring(.,6),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''] - Assert that the last two characters are uppercase alpha chars. In regex terms: [A-Z]{2}$
- [1] - Retrieve the 1st node from the resulting filtered node-list.
Answer 3
Score: 2
You can use Regular Expressions to extract a submatch like below:
Public Sub Example()
    Debug.Print ExtractSubstring("Quote number 11111AA refers", "[0-9]{5}[A-Z]{2}")
End Sub
Public Function ExtractSubstring(ByVal InputString As String, ByVal Pattern As String) As String
    Dim RetVal As String
    Dim AllMatches As Object
    Dim RegEx As Object
    Set RegEx = CreateObject("vbscript.regexp")
    
    With RegEx
        .Pattern = Pattern
        .Global = True
        .IgnoreCase = True  ' set to False for case sensitivity, e.g. so that [A-Z] matches uppercase letters only
    End With
    Set AllMatches = RegEx.Execute(InputString)
    
    If AllMatches.Count <> 0 Then
        RetVal = AllMatches.Item(0)
    End If
    
    ExtractSubstring = RetVal
End Function
Or even use it directly in a worksheet as a UDF (User Defined Function):
=ExtractSubstring(A2,"[0-9]{5}[A-Z]{2}")
Note that this returns only the first match, even if the pattern matches more than once in the input.
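If every match is wanted rather than just the first, one possible variation is to loop over the collection that Execute returns. This is only a sketch: the function name ExtractAllSubstrings and the Delimiter parameter are assumptions made for illustration, and IgnoreCase is set to False here so that [A-Z] matches uppercase letters only.

```vba
Option Explicit

' Hypothetical variant: returns every match of Pattern in InputString,
' joined with Delimiter, or an empty string if nothing matches.
Public Function ExtractAllSubstrings(ByVal InputString As String, _
                                     ByVal Pattern As String, _
                                     Optional ByVal Delimiter As String = ", ") As String
    Dim RegEx As Object
    Dim AllMatches As Object
    Dim m As Object
    Dim Result As String

    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Pattern = Pattern
        .Global = True       ' required to find more than one match
        .IgnoreCase = False
    End With

    Set AllMatches = RegEx.Execute(InputString)
    For Each m In AllMatches
        If Len(Result) > 0 Then Result = Result & Delimiter
        Result = Result & m.Value
    Next m

    ExtractAllSubstrings = Result
End Function
```

For example, ExtractAllSubstrings("11111AA and 22222BB", "[0-9]{5}[A-Z]{2}") returns "11111AA, 22222BB".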
Answer 4
Score: 0
Why not use regular expressions?
Public Sub test()
    Dim pattern As String
    pattern = "\d{5}[A-Z]{2}"
    Debug.Print regex_PatternMatch(pattern, "11111AA")  'will return true
    Debug.Print regex_PatternMatch(pattern, "111AA")    'will return false
End Sub
Public Function regex_PatternMatch(pattern As String, textToCheck As String) As Boolean
    Dim regEx As New RegExp
    With regEx
        .pattern = pattern
        regex_PatternMatch = .test(textToCheck)
    End With
End Function
You have to add a reference to "Microsoft VBScript Regular Expressions 5.5" (Tools > References in the VBA editor) for the RegExp type to be available.