Correct usage of String.matches() and regex


Question

I'm trying to check whether my input contains anything other than A-Z a-z , . ' - and whitespace.
I guess it's a simple mistake, because I'm quite a rookie when it comes to regex.

public class Test {
  public static void main(String args[]) {
	  doesMatch(0,"Hello ', . - ");
	  doesMatch(1,"1Hello1");
	  doesMatch(2,"23123");
	  doesMatch(3,"§!$'##");
	  doesMatch(4,"pe33teramjd");
	  doesMatch(5,"3pe33teramjd");
	  doesMatch(6,"pe33teramjd3");
	  doesMatch(7,"yup py");
    }
  
  static void doesMatch(int number,String input){
	  System.out.println("Number: "+number+" | "+input.matches("[^A-Za-z,.'\\s-]"));
  }
}

Output:

Number: 0 | false
Number: 1 | false
Number: 2 | false
Number: 3 | false
Number: 4 | false
Number: 5 | false
Number: 6 | false
Number: 7 | false

Desired output:

Number: 0 | false
Number: 1 | true
Number: 2 | true
Number: 3 | true
Number: 4 | true
Number: 5 | true
Number: 6 | true
Number: 7 | false

Answer 1

Score: 1

Explanation

> I'm trying to check whether my input contains anything other than A-Z a-z , . ' - and whitespace.

Or, the reverse logic: you are trying to verify whether the text consists only of A-Z a-z , . ' - and whitespace.

String.matches() returns true only if the entire input matches the pattern, and a bare [...] matches exactly one character. You likely intended to repeat the character class, so use [...]+ instead of just [...].

Then, get rid of the ^, which negates the character class. Escaping the . as \\. is optional here: inside a character class an unescaped . already means a literal dot, not "any character", so the escape is harmless but not required.
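To make the full-match behaviour concrete, here is a minimal sketch. The class name FullMatchDemo and the helper onlyAllowed are hypothetical names chosen for illustration, not part of the original code.

```java
class FullMatchDemo {
    // Hypothetical helper: true if the whole input consists only of allowed characters.
    static boolean onlyAllowed(String input) {
        return input.matches("[A-Za-z,.'\\s-]+");
    }

    public static void main(String[] args) {
        // matches() must consume the entire string, so a bare character class
        // can only ever match a string that is exactly one character long.
        System.out.println("1".matches("[^A-Za-z,.'\\s-]"));       // true
        System.out.println("1Hello1".matches("[^A-Za-z,.'\\s-]")); // false

        // Inside [...] an unescaped dot is a literal dot, not "any character".
        System.out.println(".".matches("[.]")); // true
        System.out.println("x".matches("[.]")); // false

        // With the + quantifier the class may repeat over the whole input.
        System.out.println(onlyAllowed("yup py")); // true
        System.out.println(onlyAllowed("23123"));  // false
    }
}
```

Note that the + quantifier requires at least one character, so an empty string is reported as not matching.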

The regex pattern is now "[A-Za-z,\\.'\\s-]+". With it, you get the following output:

Number: 0 | true
Number: 1 | false
Number: 2 | false
Number: 3 | false
Number: 4 | false
Number: 5 | false
Number: 6 | false
Number: 7 | true

Just negate the boolean with ! and you are done.


For reference, the full code:

public class Test {
  public static void main(String args[]) {
      doesMatch(0,"Hello ', . - ");
      doesMatch(1,"1Hello1");
      doesMatch(2,"23123");
      doesMatch(3,"§!$'##");
      doesMatch(4,"pe33teramjd");
      doesMatch(5,"3pe33teramjd");
      doesMatch(6,"pe33teramjd3");
      doesMatch(7,"yup py");
    }
  
  static void doesMatch(int number,String input){
      System.out.println("Number: "+number+" | "+ !input.matches("[A-Za-z,\\.'\\s-]+"));
  }
}

Without negation

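If you do not want to negate the final result with !, you have to negate the regex pattern correctly. Informally, the pattern should say: "if any character is not A-Z a-z , . ' - or whitespace, then it matches".

A pattern that checks this could be:

".*[^A-Za-z,\\.'\\s-].*"

The .* matches any sequence of characters (by default, excluding line terminators), so the pattern as a whole matches any input that contains at least one disallowed character.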
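As a rough sketch of how this pattern behaves with matches(), the example below runs it over some of the inputs from the question. The class and method names are hypothetical. The second helper adds the (?s) flag (DOTALL), which is my own addition and not part of the original answer; it lets . also match line terminators, which only matters if the input can span several lines.

```java
class ContainsViaMatches {
    // The pattern from the answer: something, one disallowed character, something.
    static boolean containsDisallowed(String input) {
        return input.matches(".*[^A-Za-z,.'\\s-].*");
    }

    // Assumption: variant with (?s) so that .* can also cross line breaks.
    static boolean containsDisallowedMultiline(String input) {
        return input.matches("(?s).*[^A-Za-z,.'\\s-].*");
    }

    public static void main(String[] args) {
        String[] inputs = {"Hello ', . - ", "1Hello1", "23123", "pe33teramjd", "yup py"};
        for (String input : inputs) {
            System.out.println("\"" + input + "\" | " + containsDisallowed(input));
        }
        // A digit after a line break is missed without (?s) and found with it.
        System.out.println(containsDisallowed("a\n1"));          // false
        System.out.println(containsDisallowedMultiline("a\n1")); // true
    }
}
```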


Search instead of full match

Instead of attempting a full match, this is likely better done as a regex search using find():

Pattern pattern = Pattern.compile("[^A-Za-z,\\.'\\s-]");
...
boolean result = pattern.matcher(input).find();

This will tell you whether the search found anything matching the regex, that is, any character that is not A-Z a-z , . ' - or whitespace.
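For completeness, a self-contained sketch of the find() approach follows. The class name FindDemo, the constant DISALLOWED and the helper containsDisallowed are hypothetical names; compiling the pattern once into a static field is a design choice of this sketch, not something the answer prescribes.

```java
import java.util.regex.Pattern;

class FindDemo {
    // Compiled once and reused; matches a single character outside the allowed set.
    private static final Pattern DISALLOWED = Pattern.compile("[^A-Za-z,.'\\s-]");

    // True if the input contains at least one disallowed character anywhere.
    static boolean containsDisallowed(String input) {
        return DISALLOWED.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = {"Hello ', . - ", "1Hello1", "23123", "pe33teramjd", "yup py"};
        for (int i = 0; i < inputs.length; i++) {
            System.out.println("Number: " + i + " | " + containsDisallowed(inputs[i]));
        }
    }
}
```

Because find() searches for the pattern anywhere in the input, no surrounding .* is needed and line breaks in the input do not affect the result.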

huangapple
  • Posted on August 25, 2020 at 19:23:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63577777.html