Match a float number inside a string with regex in Java
Question
I am trying to find a float number that follows a specific word, using regex in Java. My pattern only matches when nothing stands between the word and the number, but I want it to match even when whitespace, other characters, or line breaks come in between.
Here is the regex I wrote:
    (?<=TOTAL)([+-]?([0-9]*[.])?[0-9]+)
Example:
> 69003 LYON 03 ejuodnid 04 72.84.75.20 affm groa TICKET FACTURE 361203-
> SEPHORA EYE PALET LIG PALE 29991 14.99 Sephora Collection -Prix 392729
> SEPHORA CINESCOPE
> 16.501 328451- SEPHORA THE MASC BIG MASC goe( 6.99193.49 P Sephora Co11ection Prix 347597SEPHORA LING GRENADE NG 25 i5.99 1) 2.99 Sephora
> Collect1o0 PriX adoy (30 00 1o)o 6.00 oniop20% achats Black Mars 2019
> 451087 OFFRE 20%ACHATS
> 15.00 MASC
> 16.50
> 3.50
> 3.00 N'2
> 24.00 VPBLA
> 0.00 tnoe 0001* 1eepom TOTAL EUR
> 62.00
Answer 1
Score: 2
[1]: https://regex101.com/r/0ryQ4h/1
Use
\bTOTAL\b[\s\S]*?([+-]?\d*\.?\d+)
See [proof][1].
Explanation
    --------------------------------------------------------------------------------
      \b                       the boundary between a word char (\w) and
                               something that is not a word char
    --------------------------------------------------------------------------------
      TOTAL                    'TOTAL'
    --------------------------------------------------------------------------------
      \b                       the boundary between a word char (\w) and
                               something that is not a word char
    --------------------------------------------------------------------------------
      [\s\S]*?                 any character of: whitespace (\n, \r, \t,
                               \f, and " "), non-whitespace (all but \n,
                               \r, \t, \f, and " ") (0 or more times
                               (matching the least amount possible))
    --------------------------------------------------------------------------------
      (                        group and capture to \1:
    --------------------------------------------------------------------------------
        [+-]?                    any character of: '+', '-' (optional
                                 (matching the most amount possible))
    --------------------------------------------------------------------------------
        \d*                      digits (0-9) (0 or more times (matching
                                 the most amount possible))
    --------------------------------------------------------------------------------
        \.?                      '.' (optional (matching the most amount
                                 possible))
    --------------------------------------------------------------------------------
        \d+                      digits (0-9) (1 or more times (matching
                                 the most amount possible))
    --------------------------------------------------------------------------------
      )                        end of \1
Java code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String regex = "\\bTOTAL\\b[\\s\\S]*?([+-]?\\d*\\.?\\d+)";
    String string = "69003 LYON 03 ejuodnid 04 72.84.75.20 affm groa TICKET FACTURE 361203- SEPHORA EYE PALET LIG PALE 29991 14.99 Sephora Collection -Prix 392729 SEPHORA CINESCOPE 16.501 328451- SEPHORA THE MASC BIG MASC goe( 6.99193.49 P Sephora Co11ection Prix 347597SEPHORA LING GRENADE NG 25 i5.99 1) 2.99 Sephora Collect1o0 PriX adoy (30 00 1o)o 6.00 oniop20% achats Black Mars 2019 451087 OFFRE 20%ACHATS 15.00 MASC 16.50 3.50 3.00 N'2 24.00 VPBLA 0.00 tnoe 0001* 1eepom TOTAL EUR 62.00";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(string);
    // group(1) holds the first number found after the word TOTAL
    if (matcher.find()) {
        System.out.println(matcher.group(1));
    }

Result: `62.00`
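
If the same lookup is needed for more than one keyword, the snippet above could be wrapped in a small helper along these lines. This is only a sketch under a few assumptions: the class name `TotalExtractor` and the method `firstNumberAfter` are hypothetical names chosen for illustration, `Pattern.quote` is used on the assumption that the keyword should be matched literally, and the result is parsed as a `BigDecimal` on the assumption that the value is a monetary amount.

```java
import java.math.BigDecimal;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
public class TotalExtractor {

    // Returns the first number that follows `keyword`, skipping any characters
    // (including line breaks) in between; empty if no such number is found.
    public static Optional<BigDecimal> firstNumberAfter(String keyword, String text) {
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(keyword) + "\\b[\\s\\S]*?([+-]?\\d*\\.?\\d+)");
        Matcher matcher = pattern.matcher(text);
        return matcher.find()
                ? Optional.of(new BigDecimal(matcher.group(1)))
                : Optional.empty();
    }

    public static void main(String[] args) {
        // Tail of the receipt from the question, with the line break preserved.
        String receipt = "0.00 tnoe 0001* 1eepom TOTAL EUR\n62.00";
        System.out.println(firstNumberAfter("TOTAL", receipt).orElse(null)); // prints 62.00
    }
}
```

Returning an `Optional` makes the "keyword not found" case explicit instead of relying on a `null` check at every call site.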
Answer 2
Score: 1
I interpret the question as asking how to extract the first float number after a certain word, no matter what lies in between.
A non-greedy wildcard does that for you:
(?<=TOTAL).*?([+-]?([0-9]*[.])?[0-9]+)
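
One caveat, offered as a hedged note rather than as part of the original answer: in Java, `.` does not match line terminators by default, so this pattern only crosses line breaks when it is compiled with `Pattern.DOTALL` (or prefixed with `(?s)`). Because the wildcard is part of the overall match, the number itself has to be read from capturing group 1. The sketch below illustrates both points; the class name `LookbehindDemo` and the method `firstFloat` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex itself comes from the answer.
public class LookbehindDemo {
    static final String REGEX = "(?<=TOTAL).*?([+-]?([0-9]*[.])?[0-9]+)";

    // Returns capturing group 1 of the first match, or null if nothing matches.
    public static String firstFloat(String text, int flags) {
        Matcher m = Pattern.compile(REGEX, flags).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "1eepom TOTAL EUR\n62.00";
        // Without DOTALL, '.' stops at the line break, so no match is found.
        System.out.println(firstFloat(text, 0));              // prints null
        // With DOTALL, '.' also matches '\n' and the number is reached.
        System.out.println(firstFloat(text, Pattern.DOTALL)); // prints 62.00
    }
}
```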