Excel Regex VBA

Question

I need to extract part of a string from an Excel cell if it matches the following format:
"DDD/L-DDD-L/DDD-L-DD/DD", where D represents any digit and L represents any letter, upper or lower case.

Here are some examples:

Cell content: "BHL: 200/b-003-A/094-G-08/02"
Extracted string: "200/b-003-A/094-G-08/02"

Cell content: "UWI: 200/c-098-C/094-G-01/00"
Extracted string: "200/c-098-C/094-G-01/00"

Cell content: "TOTAL SULPHUR (mass%) = 0.013, BHL: 203/a-044-C/094-G-01/00"
Extracted string: "203/a-044-C/094-G-01/00"

NOTE: The data formatting is very inconsistent, so looking for keywords like "BHL" or "UWI" is not an option.

I've written the following VBA function, but it's not working as expected:

Function simpleCellRegex(Myrange As Range) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String
    
    
    strPattern = "[0-9]{3}/[A-Za-z]-[0-9]{3}-[A-Za-z]/[0-9]{3}-[A-Za-z]-[0-9]{2}/[0-9]{2}"
    
    If strPattern <> "" Then
        strInput = Myrange.Value
        strReplace = ""
        
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .pattern = strPattern
        End With
        
        If regEx.Test(strInput) Then
            simpleCellRegex = regEx.Replace(strInput, strReplace)
        Else
            simpleCellRegex = "Not matched"
        End If
    End If
End Function

Answer 1

Score: 3

Just to offer a non-regex, non-VBA answer: what are the odds that any other part of the cell's content would follow this pattern of forward slashes and hyphens? In other words, would the following work?

Formula in B1:

=MID(A1:A3,SEARCH("???/?-???-?/???-?-??/??",A1:A3),23)

If that check isn't strict enough, you can still do this without regular expressions or VBA:

=LET(x,MID(A1,SEQUENCE(LEN(A1)),23),FILTER(x,MAP(x,LAMBDA(y,LET(z,MID(y,{1,5,7,11,13,17,19,22},{3,1,3,1,3,1,2,2}),AND(AND(ISNUMBER(--CHOOSECOLS(z,{1,3,5,7,8}))),AND(ISNUMBER(SEARCH(CHOOSECOLS(z,{2,4,6}),"abcdefghijklmnopqrstuvwxyz"))),ISNUMBER(SEARCH("???/?-???-?/???-?-??/??",y)))))),"No Matches"))

The breakdown, in short:

  • MID(A1,SEQUENCE(LEN(A1)),23) - Get every possible 23-character substring of the input, one starting at each position;

  • MID(y,{1,5,7,11,13,17,19,22},{3,1,3,1,3,1,2,2}) - Split a candidate substring into an array of its eight alphanumeric parts (start positions and lengths);

  • AND(ISNUMBER(--CHOOSECOLS(z,{1,3,5,7,8}))) - Test that the listed columns are all numeric;

  • AND(ISNUMBER(SEARCH(CHOOSECOLS(z,{2,4,6}),"abcdefghijklmnopqrstuvwxyz"))) - Test that all listed columns are found in the alphabet (SEARCH is case-insensitive);

  • ISNUMBER(SEARCH("???/?-???-?/???-?-??/??",y)) - Test that all forward slashes and hyphens are in the right place;

  • FILTER() then uses the outer AND() to return all matching substrings (not just the first one).
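For readers who do want VBA but not a regex library, the same sliding-window idea can be sketched with VBA's built-in Like operator, where # matches a single digit and [A-Za-z] a single letter. This is only an illustration, not part of the original answer: the function name ExtractByLike and both constants are hypothetical, and it returns just the first match rather than all of them.

```vba
' Hypothetical helper: scans every 23-character window of the input
' and returns the first one that fits the DDD/L-DDD-L/DDD-L-DD/DD shape.
Function ExtractByLike(ByVal cellText As String) As String
    ' In a Like pattern, # is one digit and [A-Za-z] is one letter;
    ' slashes and hyphens outside brackets are literal characters.
    Const LIKE_PATTERN As String = "###/[A-Za-z]-###-[A-Za-z]/###-[A-Za-z]-##/##"
    Const WINDOW_LEN As Long = 23
    Dim i As Long

    ExtractByLike = "No Matches"
    ' The loop body never runs when the input is shorter than the window.
    For i = 1 To Len(cellText) - WINDOW_LEN + 1
        If Mid$(cellText, i, WINDOW_LEN) Like LIKE_PATTERN Then
            ExtractByLike = Mid$(cellText, i, WINDOW_LEN)
            Exit Function
        End If
    Next i
End Function
```

Placed in a standard module, it could be called from a worksheet as =ExtractByLike(A1).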

Answer 2

Score: 2

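You should use regEx.Execute(strInput) (not Replace) and look at the returned Matches collection. Replace with an empty replacement string deletes the matched text and returns whatever is left, which is the opposite of extracting it.

This works for me: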

Function simpleCellRegex(v) As String
    Const strPattern = "\d{3}/[A-Z]-\d{3}-[A-Z]/\d{3}-[A-Z]-\d{2}/\d{2}"
    
    Dim regEx As New RegExp, matches As Object, match As Object, sep As String
    
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = strPattern
    End With
    
    Set matches = regEx.Execute(v)
    If matches.Count > 0 Then
        For Each match In matches
            simpleCellRegex = simpleCellRegex & sep & match.Value
            sep = "; "
        Next match
    Else
        simpleCellRegex = "Not matched"
    End If
    End If
End Function
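The function above declares regEx As New RegExp, which compiles only when the project references "Microsoft VBScript Regular Expressions 5.5" (Tools > References). If setting that reference is not convenient, a late-bound variant is one possible alternative. The sketch below is an assumption-laden illustration rather than the answerer's code: the name ExtractFirstMatch is hypothetical, and it returns only the first match instead of joining all of them.

```vba
' Hypothetical late-bound variant: needs no project reference because
' the RegExp object is created at run time via CreateObject.
Function ExtractFirstMatch(ByVal v As Variant) As String
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    With re
        .Global = False        ' stop after the first match
        .IgnoreCase = True     ' lets [A-Z] match lower-case letters too
        .Pattern = "\d{3}/[A-Z]-\d{3}-[A-Z]/\d{3}-[A-Z]-\d{2}/\d{2}"
    End With

    ' CStr also accepts a single-cell Range and reads its value.
    Set matches = re.Execute(CStr(v))
    If matches.Count > 0 Then
        ExtractFirstMatch = matches(0).Value   ' Matches is zero-based
    Else
        ExtractFirstMatch = "Not matched"
    End If
End Function
```

As a worksheet formula this would be used as =ExtractFirstMatch(A1).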

Answer 3

Score: 0

Here is a reference.

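You can match the format you described with the pattern below.

The (?i) prefix turns on case-insensitive mode, and \d matches any digit, 0 through 9. Note that the VBScript RegExp object used from VBA does not support the inline (?i) modifier; there, drop the prefix and set .IgnoreCase = True instead.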

(?i)\d{3}/[a-z]-\d{3}-[a-z]/\d{3}-[a-z]-\d\d/\d\d

Output

200/b-003-A/094-G-08/02
200/c-098-C/094-G-01/00
203/a-044-C/094-G-01/00
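To see what the case-insensitive switch contributes when the pattern is run from VBA, a small sketch like the following may help. It assumes the pattern is executed through VBScript.RegExp with the (?i) prefix removed and the IgnoreCase property used in its place; the function name MatchWithCase and its caseInsensitive parameter are hypothetical.

```vba
' Hypothetical helper: returns the first match, or an empty string,
' with case-insensitivity controlled by the second argument.
Function MatchWithCase(ByVal s As String, ByVal caseInsensitive As Boolean) As String
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = caseInsensitive
    ' Same pattern as above, minus the (?i) prefix.
    re.Pattern = "\d{3}/[a-z]-\d{3}-[a-z]/\d{3}-[a-z]-\d\d/\d\d"

    Set matches = re.Execute(s)
    If matches.Count > 0 Then MatchWithCase = matches(0).Value
End Function
```

With caseInsensitive set to False, the lower-case class [a-z] no longer accepts the upper-case letters in the samples, so MatchWithCase("BHL: 200/b-003-A/094-G-08/02", False) returns an empty string, while passing True returns the identifier.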

huangapple
  • Posted on June 6, 2023, 03:10:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76409368.html