2020-09-15 03:18:58

Match a float number inside a string with regex in Java

Question

I am trying to find a float number after a specific word with a regex in Java. My pattern only matches when nothing stands between the word and the number, but I want it to match even when whitespace, other characters, or newlines come in between.

Here is the regex I wrote:

(?<=TOTAL)([+-]?([0-9]*[.])?[0-9]+)

Example:

> 69003 LYON 03 ejuodnid 04 72.84.75.20 affm groa TICKET FACTURE 361203-
> SEPHORA EYE PALET LIG PALE 29991 14.99 Sephora Collection -Prix 392729
> SEPHORA CINESCOPE
> 16.501 328451- SEPHORA THE MASC BIG MASC goe( 6.99193.49 P Sephora Co11ection Prix 347597SEPHORA LING GRENADE NG 25 i5.99 1) 2.99 Sephora
> Collect1o0 PriX adoy (30 00 1o)o 6.00 oniop20% achats Black Mars 2019
> 451087 OFFRE 20%ACHATS
> 15.00 MASC
> 16.50
> 3.50
> 3.00 N'2
> 24.00 VPBLA
> 0.00 tnoe 0001* 1eepom TOTAL EUR
> 62.00

Answer 1

Score: 2

Use

\bTOTAL\b[\s\S]*?([+-]?\d*\.?\d+)

See the [proof][1].

Explanation

--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  TOTAL                    'TOTAL'
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  [\s\S]*?                 any character of: whitespace (\n, \r, \t,
                           \f, and " "), non-whitespace (all but \n,
                           \r, \t, \f, and " ") (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [+-]?                    any character of: '+', '-' (optional
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \d*                      digits (0-9) (0 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
    \.?                      '.' (optional (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1

Java code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Java-escaped form of the pattern above; group 1 captures the number.
String regex = "\\bTOTAL\\b[\\s\\S]*?([+-]?\\d*\\.?\\d+)";
String string = "69003 LYON 03 ejuodnid 04 72.84.75.20 affm groa TICKET FACTURE 361203- SEPHORA EYE PALET LIG PALE 29991 14.99 Sephora Collection -Prix 392729 SEPHORA CINESCOPE 16.501 328451- SEPHORA THE MASC BIG MASC goe( 6.99193.49 P Sephora Co11ection Prix 347597SEPHORA LING GRENADE NG 25 i5.99 1) 2.99 Sephora Collect1o0 PriX adoy (30 00 1o)o 6.00 oniop20% achats Black Mars 2019 451087 OFFRE 20%ACHATS 15.00 MASC 16.50 3.50 3.00 N'2 24.00 VPBLA 0.00 tnoe 0001* 1eepom TOTAL EUR 62.00";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
// find() stops at the first TOTAL that is followed by a number.
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
```

Result: `62.00`

[1]: https://regex101.com/r/0ryQ4h/1

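As a possible extension of this answer, here is a minimal sketch that wraps the same pattern in a reusable helper and runs it on input that contains a real line break between `TOTAL` and the amount. The class name `TotalExtractor` and the method `findTotal` are hypothetical, and converting the capture to a `BigDecimal` is an assumption about how the amount would be used afterwards.

```java
import java.math.BigDecimal;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TotalExtractor {
    // Same pattern as in the answer; compiled once, group 1 is the number.
    private static final Pattern TOTAL_PATTERN =
            Pattern.compile("\\bTOTAL\\b[\\s\\S]*?([+-]?\\d*\\.?\\d+)");

    // Hypothetical helper: the first number after the word TOTAL, if any.
    static Optional<BigDecimal> findTotal(String text) {
        Matcher matcher = TOTAL_PATTERN.matcher(text);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(new BigDecimal(matcher.group(1)));
    }

    public static void main(String[] args) {
        // Tail of the receipt from the question, with its line break kept.
        String receipt = "0.00 tnoe 0001* 1eepom TOTAL       EUR \n62.00";
        System.out.println(findTotal(receipt));          // Optional[62.00]
        System.out.println(findTotal("no amount here")); // Optional.empty
    }
}
```

Returning an `Optional` makes the no-match case explicit instead of silently printing nothing, and the `\b` anchors keep words such as `SUBTOTAL` from being treated as `TOTAL`.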

Answer 2

Score: 1

I interpret the question as asking for the first float number after a certain word, no matter what lies in between.
A non-greedy wildcard does that:

(?<=TOTAL).*?([+-]?([0-9]*[.])?[0-9]+)

In Java, `.` does not match line terminators by default, so compile the pattern with `Pattern.DOTALL` (or prefix it with `(?s)`) when the number may sit on a later line. The number is captured in group 1.
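The following minimal sketch shows how this lookbehind pattern could be used from Java, and how it behaves with and without `Pattern.DOTALL` when a newline separates the word from the number. The class name `LookbehindDemo` and the helper `firstFloat` are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookbehindDemo {
    // Pattern from this answer; group 1 is the whole number.
    static final String REGEX = "(?<=TOTAL).*?([+-]?([0-9]*[.])?[0-9]+)";

    // Hypothetical helper: group 1 of the first match, or null if none.
    static String firstFloat(String text, int flags) {
        Matcher matcher = Pattern.compile(REGEX, flags).matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "TOTAL EUR\n62.00";
        // Without DOTALL, '.' stops at the newline, so nothing is found.
        System.out.println(firstFloat(text, 0));              // null
        // With DOTALL, '.' also matches the newline.
        System.out.println(firstFloat(text, Pattern.DOTALL)); // 62.00
    }
}
```

When the word and the number share one line, as in `"TOTAL EUR 62.00"`, the pattern matches with either flag setting.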
