Excel Regex VBA
Question
I need to extract part of a string from an Excel cell if it matches the following format:
"DDD/L-DDD-L/DDD-L-DD/DD", where D represents any digit and L represents any letter, upper or lower case.
Here are some examples:
Cell content: "BHL: 200/b-003-A/094-G-08/02"
Extracted string: "200/b-003-A/094-G-08/02"
Cell content: "UWI: 200/c-098-C/094-G-01/00"
Extracted string: "200/c-098-C/094-G-01/00"
Cell content: "TOTAL SULPHUR (mass%) = 0.013, BHL: 203/a-044-C/094-G-01/00"
Extracted string: "203/a-044-C/094-G-01/00"
NOTE: The data formatting is very inconsistent, so looking for keywords like "BHL" or "UWI" is not an option.
I've written the following VBA function, but it's not working as expected:
Function simpleCellRegex(Myrange As Range) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String
    strPattern = "[0-9]{3}/[A-Za-z]-[0-9]{3}-[A-Za-z]/[0-9]{3}-[A-Za-z]-[0-9]{2}/[0-9]{2}"
    If strPattern <> "" Then
        strInput = Myrange.Value
        strReplace = ""
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With
        If regEx.Test(strInput) Then
            simpleCellRegex = regEx.Replace(strInput, strReplace)
        Else
            simpleCellRegex = "Not matched"
        End If
    End If
End Function
Answer 1
Score: 3
Just to give you an answer that uses neither regex nor VBA: what are the odds that any other part of the cell's content would follow this pattern of forward slashes and hyphens? In other words, would the following work?
Formula in B1:
=MID(A1:A3,SEARCH("???/?-???-?/???-?-??/??",A1:A3),23)
If that check isn't strict enough, you can still do this without regular expressions or VBA:
=LET(x,MID(A1,SEQUENCE(LEN(A1)),23),FILTER(x,MAP(x,LAMBDA(y,LET(z,MID(y,{1,5,7,11,13,17,19,22},{3,1,3,1,3,1,2,2}),AND(AND(ISNUMBER(--CHOOSECOLS(z,{1,3,5,7,8}))),AND(ISNUMBER(SEARCH(CHOOSECOLS(z,{2,4,6}),"abcdefghijklmnopqrstuvwxyz"))),ISNUMBER(SEARCH("???/?-???-?/???-?-??/??",y)))))),"No Matches"))
The breakdown in short:
- MID(A1,SEQUENCE(LEN(A1)),23) - Get every possible 23-character substring of the input;
- MID(y,{1,5,7,11,13,17,19,22},{3,1,3,1,3,1,2,2}) - Split each substring into an array of consecutive parts. This grabs all the alphanumeric parts;
- AND(ISNUMBER(--CHOOSECOLS(z,{1,3,5,7,8}))) - Test that the listed columns are all numeric;
- AND(ISNUMBER(SEARCH(CHOOSECOLS(z,{2,4,6}),"abcdefghijklmnopqrstuvwxyz"))) - Test that all listed columns are found in the alphabet (case insensitive);
- ISNUMBER(SEARCH("???/?-???-?/???-?-??/??",y)) - Test that all forward slashes and hyphens are in the right place;
- FILTER() will now use the outer AND() to return all possible matches (not just one).
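The same sliding-window idea can be tried outside Excel. The following is only a rough Python sketch of the logic described in the breakdown, not a translation of the formula itself; the names `is_candidate` and `find_matches` are made up for illustration, and the position tables are the 0-based equivalents of the formula's 1-based MID offsets.

```python
WIDTH = 23  # length of "DDD/L-DDD-L/DDD-L-DD/DD"

# 0-based (start, end) slices that must contain only digits.
DIGIT_SPANS = [(0, 3), (6, 9), (12, 15), (18, 20), (21, 23)]
# 0-based positions that must contain a single letter.
LETTER_POSITIONS = [4, 10, 16]
# 0-based positions of the fixed separators.
SEPARATORS = {3: "/", 5: "-", 9: "-", 11: "/", 15: "-", 17: "-", 20: "/"}


def is_candidate(chunk: str) -> bool:
    """Return True if a 23-character chunk has the expected layout."""
    if len(chunk) != WIDTH:
        return False
    digits_ok = all(
        chunk[a:b].isascii() and chunk[a:b].isdigit() for a, b in DIGIT_SPANS
    )
    letters_ok = all(
        chunk[i].isascii() and chunk[i].isalpha() for i in LETTER_POSITIONS
    )
    separators_ok = all(chunk[i] == sep for i, sep in SEPARATORS.items())
    return digits_ok and letters_ok and separators_ok


def find_matches(text: str) -> list[str]:
    """Slide a 23-character window over the text and keep the valid chunks."""
    windows = (text[i:i + WIDTH] for i in range(len(text) - WIDTH + 1))
    return [w for w in windows if is_candidate(w)]


print(find_matches("TOTAL SULPHUR (mass%) = 0.013, BHL: 203/a-044-C/094-G-01/00"))
```

Like FILTER() in the formula, `find_matches` returns every valid window rather than only the first one, and an empty list when nothing fits.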
Answer 2
Score: 2
You should use regEx.Execute(strInput) (not Replace) and look at the returned Matches collection.
This works for me:
Function simpleCellRegex(v) As String
    Const strPattern = "\d{3}/[A-Z]-\d{3}-[A-Z]/\d{3}-[A-Z]-\d{2}/\d{2}"
    Dim regEx As New RegExp, matches As Object, match As Object, sep As String
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = strPattern
    End With
    Set matches = regEx.Execute(v)
    If matches.Count > 0 Then
        For Each match In matches
            simpleCellRegex = simpleCellRegex & sep & match.Value
            sep = "; "
        Next match
    Else
        simpleCellRegex = "Not matched"
    End If
End Function
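To see why Replace gives the wrong result, it may help to compare the two operations side by side. This is a small Python sketch rather than VBA, chosen only because it is easy to run outside Excel; `re.sub` stands in for Replace and `re.findall` for Execute, and the variable names are illustrative.

```python
import re

# Same pattern as in the VBA function; IGNORECASE plays the role of .IgnoreCase = True.
PATTERN = re.compile(
    r"\d{3}/[A-Z]-\d{3}-[A-Z]/\d{3}-[A-Z]-\d{2}/\d{2}",
    re.IGNORECASE | re.ASCII,
)

cell = "BHL: 200/b-003-A/094-G-08/02"

# Replacing the match with "" deletes the part we wanted and keeps the rest.
after_replace = PATTERN.sub("", cell)

# Collecting the matches returns the part we wanted.
extracted = PATTERN.findall(cell)

print(repr(after_replace))  # 'BHL: '
print(extracted)            # ['200/b-003-A/094-G-08/02']
```

Replacing with an empty string leaves everything except the match, which is the opposite of extraction; collecting the matches is what the question actually asks for.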
Answer 3
Score: 0
Here is a reference.
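You can match the format you described with the pattern below.
The (?i) turns case-insensitive mode on, and \d matches any digit, 0 through 9.
Note that the VBScript RegExp object used from VBA does not support the inline (?i) flag; there, drop it and set .IgnoreCase = True instead.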
(?i)\d{3}/[a-z]-\d{3}-[a-z]/\d{3}-[a-z]-\d\d/\d\d
Output
200/b-003-A/094-G-08/02
200/c-098-C/094-G-01/00
203/a-044-C/094-G-01/00
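The pattern can be checked against the three sample cells from the question in any engine that accepts the inline (?i) flag. Below is a minimal sketch using Python's `re` module; the helper name `extract_id` is an assumption made for this example.

```python
import re
from typing import Optional

# Pattern copied from the answer; (?i) enables case-insensitive matching.
ID_PATTERN = re.compile(r"(?i)\d{3}/[a-z]-\d{3}-[a-z]/\d{3}-[a-z]-\d\d/\d\d")

SAMPLES = [
    "BHL: 200/b-003-A/094-G-08/02",
    "UWI: 200/c-098-C/094-G-01/00",
    "TOTAL SULPHUR (mass%) = 0.013, BHL: 203/a-044-C/094-G-01/00",
]


def extract_id(cell: str) -> Optional[str]:
    """Return the first substring matching the pattern, or None if absent."""
    found = ID_PATTERN.search(cell)
    return found.group(0) if found else None


for sample in SAMPLES:
    print(extract_id(sample))
```

Note that "0.013" in the third sample is not picked up, because the pattern requires the full slash-and-hyphen layout rather than just a run of digits.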