英文:
Extract one numbers from string with condition
问题
I will provide the translation of the code portion as requested:
Public Function GetNumeric(CellRef As String)
Dim StringLength As Long, i As Long, Result As Variant
StringLength = Len(CellRef)
For i = 1 To StringLength
If IsNumeric(Mid(CellRef, i, 1)) Then
Result = Result & Mid(CellRef, i, 1)
End If
Next i
GetNumeric = Result
End Function
Please note that I won't provide translations for the non-code content.
英文:
the below code extract all numbers from string and even combine them.
But I need to extract only one whole number with rules:
1- the number is one or two digits (plus the decimal part if it exsists).
2- if the number is followed by "
or inch
or in
, then extract it and ignore rest of numbers in string.
3- if the above condition (2) is not found, then extract the first numbers and ignore rest of numbers in string.
Current String | Expected Result |
---|---|
INSPECT-8'' Water 12 Pipe | 8 |
INSPECT- 8.5" Water 12 | 8.5 |
INSPECT- 4 Water 5.5" | 5.5 |
PM- 6.5 inch From H44 | 6.5 |
PM-36in Pipe M1T | 36 |
PM-36 Pipe From M2T | 36 |
PM-18"*12" Pipe From M1T | 18 |
PM-36 From 5" M1T | 5 |
PM-123 Pipe From MT |
Public Function GetNumeric(CellRef As String)
Dim StringLength As Long, i As Long, Result As Variant
StringLength = Len(CellRef)
For i = 1 To StringLength
If IsNumeric(Mid(CellRef, i, 1)) Then
Result = Result & Mid(CellRef, i, 1)
End If
Next i
GetNumeric = Result
End Function
答案1
得分: 6
以下是您要翻译的代码部分:
Public Function RegexExtract(str, pat, Optional gFlag As Boolean = False, Optional pos As Integer = 0, Optional cse as Boolean = True) As String
Static RE As Object: If RE Is Nothing Then Set RE = CreateObject("vbscript.regexp")
RE.Pattern = pat
RE.Global = gFlag
RE.IgnoreCase = cse
If RE.Test(str) Then
RegexExtract = RE.Execute(str)(pos)
Else
RegexExtract = vbNullString
End If
End Function
=IFERROR(--RegexExtract(A1,"\b\d\d?(?!\d)(?=\s*(?:""|''|in(?:ch)?\b)?)),"")
\b\d\d? - 一个字边界,后面跟着1个数字和一个可选的第二个数字;
(?!\d) - 一个负向先行断言,用于确认后面没有更多的数字;
(?=\s*(?:""|''|in(?:ch)?\b)?) - 一个正向先行断言,用于确认位置后面跟着0个或多个(贪婪)空白字符和:
* "" - 一个双引号,或;
* '' - 两个单引号,或;
* in(?:ch)?\b - 字面上的'in',后面跟着可选的'ch'和一个字边界,以确保这些字母不是较大子字符串的一部分,以防误报。
如果您需要更多翻译,请告诉我。
英文:
Maybe create your own UDF making use of a regular expression. Perhaps something like:
Public Function RegexExtract(str, pat, Optional gFlag As Boolean = False, Optional pos As Integer = 0, Optional cse as Boolean = True) As String
Static RE As Object: If RE Is Nothing Then Set RE = CreateObject("vbscript.regexp")
RE.Pattern = pat
RE.Global = gFlag
RE.IgnoreCase = cse
If RE.Test(str) Then
RegexExtract = RE.Execute(str)(pos)
Else
RegexExtract = vbNullString
End If
End Function
Note that I created an optional global flag that is false by default which should just pull the very 1st hit in a cell. The optional pos variable is there to return a certain match in case you wish the somehow return other numbers when you set the global flag to true. Also note the use of a case flag set to true to match case-insensitive by default.
You can call the above like:
=IFERROR(--RegexExtract(A1,"\b\d\d?(?!\d)(?=\s*(?:""|''|in(?:ch)?\b)?)"),"")
The pattern used stands for:
\b\d\d?
- A word-boundary with 1 digit and a 2nd optional one;(?!\d)
- A negative lookahead to assert no more digits;(?=\s*(?:"|''|in(?:ch)?\b)?)
- A positive lookahead to assert position is followed by 0+ (greedy) whitespace characters and:"
- A double quote, or;''
- Two single quotes, or;in(?:ch)?\b
- Literally 'in' followed by optional 'ch' and a word-boundary to confirm the letters are not part of a larger substring to prevent false positives.
EDIT1:
As per OP's comments below; there are case where there could be a number of interest that is not at the 1st position. Since OP allows for a number without inches to also be matched, the addition here is to include a negative lookahead that will assert that there is no 2nd occurence of the valid pattern:
\b\d\d?(?!\d|.*\b\d+\s*(?:""|''|in(?:ch)?\b))(?=\s*(?:""|''|in(?:ch)?\b)?)
I suppose this is implicitly the same as:
\b\d\d?(?!\d|.*\b\d+\s*(?:""|''|in(?:ch)?\b))
EDIT2:
To allow for decimals you could include an optional non-capture group:
=IFERROR(--RegexExtract(A2,"\b\d\d?(?:\.\d+)?(?!\d|.*\b\d+\s*(?:""|''|in(?:ch)?\b))"),"")
答案2
得分: 2
我认为答案已经包含在您提出的要求中。
您需要添加if语句作为检查来执行此操作,或者使用case语句。我建议在将其添加到结果之前定义一个额外的变量进行检查。将其定义为您正在循环遍历的字符串的字符。示例:
Dim Check as String
Check = Mid(CellRef, i, 1)
If Check = Chr(34) then GetNumeric = Result
处理多个数字而没有进一步定义将产生不可解决的情况,例如,您倒数第二个示例没有指示1或36是否是正确的一个。
英文:
I think the answer already lays in your proposed requirements
You need to added if statement as checks to do this or a case statement, I recommend defining an extra variable to check before you added to results. Define this as the character of your string you are looping through. Example:
Dim Check as string
Check = Mid(CellRef, i, 1)
If Check = Chr(34) then GetNumeric = Result
Dealing with multiple numbers without further definition will yield situations that are not solvable e.g you second to last example has no indication if the 1 or 36 is at the correct one.
答案3
得分: 2
以下是翻译好的部分:
这里有两种不需要使用正则表达式的解决方案。
第一种解决方案对字符串应用一系列转换,以便我们可以使用Split函数获取一系列字符串,其中一些将是数字。问题在于选择正确的转换,以便可以应用Split函数来分离数字。有时这可能是不可能的。
第二种解决方案只是解析字符串,直到提取了一系列数字字符,然后返回该数字字符串以供进一步处理。这可能是您的情况中最好的解决方案。
请注意,这两种解决方案都未经过边缘情况的测试。
考虑到您试图解析的似乎是自由文本,可能会有许多边缘情况。
英文:
Here are two solutions which do not require Regex.
The first solution applies a series of transforms to the string so that we can use Split to get a sequence of strings, some of which will be numbers. The issue here is choosing the correct transforms so that Split can be applied to isolate numbers. Sometime this may not be possible.
The second solution just parses the string until it has extracted a sequence of characters that are numeric and then returns that numeric string for further processing. This is likely the best solution in your case.
Note that neither solution has been tested for edge cases.
Given that you are trying to parse what appears to be freeform text there could be lots of edge cases.
Sub Test()
Dim myC As Collection
Set myC = New Collection
With myC
.Add "INSPECT - 8" & Chr$(34) & " Water 12 Pipe 8"
.Add "INSPECT- 18" & Chr$(34) & " Water 12 18"
.Add "PM-6in Pipe From M37 st 6"
.Add "PM- 6 inch Pipe From H44 6"
.Add "PM-36 Pipe From M-1T 36"
.Add "PM-123 Pipe From MT"
End With
Dim myItem As Variant
Dim myNumber As Long
For Each myItem In myC
'Option 1
' If TryGetFirstNumber(myItem, myNumber) Then
' Debug.Print myNumber
' End If
' option 2
' Debug.Print ParseFirstNumber(VBA.CStr(myItem))
Next
End Sub
' Pass ByVal so we don't alter the original string
Public Function ApplyTransforms(ByVal ipString As String) As String
ipString = VBA.LCase(ipString)
ipString = VBA.Replace(ipString, "-", " ")
ipString = VBA.Replace(ipString, VBA.Chr(34), " ")
ipString = VBA.Replace(ipString, "in ", " ")
ipString = VBA.Replace(ipString, "inch ", " ")
ApplyTransforms = ipString
End Function
'The try function indicates success by the returned boolean value, the result of the success is returned Byref in parameter opNumber
Public Function TryGetFirstNumber(ByRef ipString As Variant, ByRef opNumber As Long, Optional ipLength As Long = 2) As Boolean
Dim myArray As Variant
myArray = VBA.Split(ApplyTransforms(ipString))
Dim myItem As Variant
For Each myItem In myArray
If VBA.IsNumeric(myItem) Then
If VBA.Len(myItem) <= ipLength Then
opNumber = VBA.CLng(myItem)
TryGetFirstNumber = True
Exit Function
End If
End If
Next
TryGetFirstNumber = False
End Function
Public Function ParseFirstNumber(ByRef ipString As String) As String
Dim myIndex As Long
myIndex = 1
Dim myLen As Long
myLen = VBA.Len(ipString)
Dim myNumber As String
myNumber = vbNullString
Do While myIndex <= myLen
If VBA.InStr("0123456789", VBA.Mid$(ipString, myIndex, 1)) > 0 Then
Exit Do
End If
myIndex = myIndex + 1
Loop
If myIndex > myLen Then
ParseFirstNumber = myNumber
Exit Function
End If
Do While VBA.InStr("0123456789", VBA.Mid$(ipString, myIndex, 1)) > 0
myNumber = myNumber & VBA.Mid$(ipString, myIndex, 1)
myIndex = myIndex + 1
If myIndex > myLen Then
ParseFirstNumber = myNumber
Exit Function
End If
Loop
ParseFirstNumber = myNumber
End Function
答案4
得分: 2
请,(还要)测试下一种方法。它使用标准的VBA和一个数组来进行处理。处理后的数组内容将在代码末尾删除,因此即使对于大范围,它应该非常快。它假定要处理的范围从“A1”开始,工作表的第一行包含标题:
但是上面的代码只会处理您在问题中显示的模式字符串。例如,如果有更多不同目的的双引号字符(在前面有一个数字),代码只会处理找到的第一个。如果搜索的字符串在字符串中是第一个等等,可能会出现问题。可以根据需要进行调整以处理更多情况,但是我们无法猜测未显示的情况...
英文:
Please, (also) test the next way. It uses standard VBA and an array to be processed. The processed array content will be dropped at the end of the code, so, it should be very fast even for large ranges. It assumes that the range to be processed starts from "A1", headers existing on the sheet first row:
Sub extractInchesNoFromAllRange()
Dim sh As Worksheet, lastR As Long, arr, i As Long
Dim dblQ As Long, sQ As Long, strIn As Long, No As String
Set sh = ActiveSheet
lastR = sh.Range("A" & sh.rows.count).End(xlUp).row
arr = sh.Range("A2:B" & lastR).Value2
For i = 1 To UBound(arr)
dblQ = InStr(arr(i, 1), """") 'check if a double quote caracters exists and return its postition if it does
sQ = InStr(arr(i, 1), "''") 'check if two simple quote caracters exists and return its postition if it does
strIn = InStr(arr(i, 1), "in") 'the same as above for "in" string
No = "" 'reinitialize the variable to keep the extracted number (as string...)
If dblQ > 0 Or sQ > 0 Then 'if doble quote exists:
If IsNumeric(Mid(arr(i, 1), IIf(dblQ > 0, dblQ, sQ) - 1, 1)) Then 'if a number exists before the quote ch
No = Mid(arr(i, 1), IIf(dblQ > 0, dblQ, sQ) - 1, 1) 'extract first digit
arr(i, 2) = extractNo(IIf(dblQ > 0, dblQ, sQ) - 2, CStr(arr(i, 1)), No, True) 'call the function which make extraction by (backward) iteration
End If
ElseIf strIn > 0 Then 'if "in" exists:
If Mid(arr(i, 1), strIn + 2, 1) = " " Or Mid(arr(i, 1), strIn + 2, 2) = "ch" Or strIn + 1 = Len(arr(i, 1)) Then
If Mid(arr(i, 1), strIn - 1, 1) = " " Then
arr(i, 2) = extractNo(strIn - 2, CStr(arr(i, 1)), No, True)
Else
arr(i, 2) = extractNo(strIn - 1, CStr(arr(i, 1)), No, True)
End If
End If
Else
arr(i, 2) = extractNo(0, CStr(arr(i, 1)), "")
End If
Next i
'drop the processed arran content back in its range:
sh.Range("A2").Resize(UBound(arr), UBound(arr, 2)).Value2 = arr
End Sub
Function extractNo(pos As Long, str As String, No As String, Optional boolChar = False) As Variant
Dim i As Long, boolNo As Boolean
On Error GoTo WrongPatt
If boolChar Then 'if one of the searched characters has been found:
For i = pos To 1 Step -1
If IsNumeric(Mid(str, i, 1)) Or Mid(str, i, 1) = "." Then
No = CStr(Mid(str, i, 1)) & No
Else
extractNo = CDbl(No): Exit For
End If
Next i
Else 'if no searched string has been found:
For i = 1 To Len(str)
If IsNumeric(Mid(str, i, 1)) Then
boolNo = True
No = No & Mid(str, i, 1)
Else
If boolNo Then Exit For
End If
Next i
If Len(No) <= 2 And No <> "" Then
extractNo = CLng(No)
Else
extractNo = ""
End If
End If
Exit Function
WrongPatt:
extractNo = "Wrong pttern"
End Function
But the above code will process only the pattern string you show in your question. If, for instance, there will be more double quotes characters, with a different purpose **before the one having a number in front of it), the code will process only the first found. It may have problems if the searched strings are the first in the string and so on... It can be adapted to deal with more conditions, but we here are not the mind readers to cover such not shown cases...
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论