Regex Extract number between multiple words combination

Question

I need to extract phone numbers from the input below by scanning for the keywords (LandLine|Mobile). The number can appear either before or after the keyword, and I am not able to extract all 3 numbers. Please assist.

Words: (LandLine|Mobile)

    String line = "i'm Joe my LandLine number is 987654321, another number 123456789 is my Mobile and wife Mobile number is 776655881";

    String pattern = "(Mobile|LandLine)([^\\d]*)(\\d{9})|"   // Forward read: keyword, then number
                   + "(\\d{9})([^\\d]*)(Mobile|LandLine)";   // Backward read: number, then keyword

    Pattern r = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);

    Matcher matcher = r.matcher(line);
    while (matcher.find()) {
        System.out.println(line.substring(matcher.start(), matcher.end()));
    }
Code Output:
LandLine number is 987654321
123456789 is my Mobile and wife Mobile
Expected Output:
LandLine number is 987654321
123456789 is my Mobile
Mobile number is 776655881
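
To see where the second and third expected matches merge, it can help to print each match together with the capture groups that took part in it. The following is a minimal diagnostic sketch using the same pattern and input as above; the class name MatchDebug and the helper describeMatches are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchDebug {
    // Same pattern and input as in the question.
    static final String PATTERN = "(Mobile|LandLine)([^\\d]*)(\\d{9})|"
                                + "(\\d{9})([^\\d]*)(Mobile|LandLine)";
    static final String LINE = "i'm Joe my LandLine number is 987654321, another number "
                             + "123456789 is my Mobile and wife Mobile number is 776655881";

    // Hypothetical helper: returns one description per match, listing only
    // the capture groups that actually participated in that match.
    static List<String> describeMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            StringBuilder sb = new StringBuilder("'" + m.group() + "'");
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) {
                    sb.append(" g").append(g).append("='").append(m.group(g)).append("'");
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        describeMatches(PATTERN, LINE).forEach(System.out::println);
    }
}
```

For the second match, group 5 (the [^\\d]* part of the backward branch) comes out as ' is my Mobile and wife ', so the greedy non-digit run has already swallowed the first "Mobile" and stops only at the last keyword before the next digit. That leaves no keyword in front of 776655881 for the forward branch to use.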

Answer 1

Score: 2

The pattern "(LandLine|Mobile)\\D*\\d{9}|\\d{9}.*?(LandLine|Mobile)" seems to fit the bill:

import java.util.Arrays;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class Main {
    public static void main(String[] args) {
        var line = "i'm Joe my LandLine number is 987654321, another number 123456789 is my Mobile and wife Mobile number is 776655881";
        var pattern = "(LandLine|Mobile)\\D*\\d{9}|\\d{9}.*?(LandLine|Mobile)";
        var res = Pattern
            .compile(pattern)
            .matcher(line)
            .results()
            .map(MatchResult::group)
            .toArray(String[]::new);
        System.out.println(Arrays.toString(res));
    }
}

Output:

[LandLine number is 987654321, 123456789 is my Mobile, Mobile number is 776655881]

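The key change is the lazy quantifier .*? in the second branch: it stops at the first keyword after the number, whereas the greedy [^\\d]* in the original ran on to the last keyword before the next digit. The pattern also uses \\D as shorthand for [^\\d]. Note that . also matches digits, so \\D*? would be a stricter choice if another number could appear between the number and its keyword.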
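
If only the digits are wanted rather than the whole matched phrase, the same idea can be combined with named capture groups. The sketch below is one possible variant, not the answer's own code: the class name PhoneNumberExtractor, the method extractNumbers, and the group names after and before are all assumptions made for illustration. It also uses the stricter \\D*? in the backward branch and restores the CASE_INSENSITIVE flag from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhoneNumberExtractor {
    // Forward branch: keyword, non-digits, then the number (group "after").
    // Backward branch: the number (group "before"), a lazy run of non-digits,
    // then the first keyword that follows.
    private static final Pattern PHONE = Pattern.compile(
        "(?:LandLine|Mobile)\\D*(?<after>\\d{9})"
        + "|(?<before>\\d{9})\\D*?(?:LandLine|Mobile)",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns just the 9-digit numbers, in order of appearance.
    static List<String> extractNumbers(String text) {
        List<String> numbers = new ArrayList<>();
        Matcher m = PHONE.matcher(text);
        while (m.find()) {
            // Exactly one of the two named groups is non-null for each match.
            String after = m.group("after");
            numbers.add(after != null ? after : m.group("before"));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String line = "i'm Joe my LandLine number is 987654321, another number "
                    + "123456789 is my Mobile and wife Mobile number is 776655881";
        System.out.println(extractNumbers(line)); // [987654321, 123456789, 776655881]
    }
}
```

Like the original patterns, this assumes every number is exactly nine digits long and that each keyword belongs to the nearest number; input that breaks either assumption would need a different pattern.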

huangapple
  • Posted on September 28, 2020, 11:19:51
  • Source link: https://go.coder-hub.com/64095521.html