regex - Uppercase letters followed by any utf-8 lowercase characters

Question

I want to write a regex which is valid if it starts with IJ followed by any lowercase letters in utf-8.

private static String pattern = "^\\u0049\\u004A(\\p{Ll})*";
System.out.println(Pattern.compile(pattern).matcher("IJP").find()); // true

I am using this regex but it doesn't seem to work. For "IJP" it should not match as P is uppercase.

Answer 1

Score: 1

Your pattern should be:

final String pattern = "^\\u0049\\u004A\\p{Ll}*$";

Note the $ at the end: it anchors the match to the end of the input, so only 0 or more lowercase characters may follow IJ. I have also removed the unnecessary group around \p{Ll}.

Code Demo:

jshell> String pattern = "^\\u0049\\u004A\\p{Ll}*$";
pattern ==> "^\\u0049\\u004A\\p{Ll}*$"

jshell> Pattern.compile(pattern).matcher("IJP").find();
$6 ==> false
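
Since the question is about lowercase letters beyond ASCII, here is a small sketch of the same anchored pattern run against a few more inputs. The class name AnchoredLowercaseDemo and the helper isValid are made up for this illustration, and I and J are written literally instead of as \\u0049 and \\u004A. It relies on \p{Ll} being the Unicode "lowercase letter" category, so accented letters such as é and ñ should be accepted too.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class AnchoredLowercaseDemo {
    // Same pattern as in the answer, with I and J written literally.
    static final Pattern PATTERN = Pattern.compile("^IJ\\p{Ll}*$");

    // True if the input is "IJ" followed only by lowercase letters.
    static boolean isValid(String input) {
        return PATTERN.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("IJ"));             // true: zero lowercase letters is allowed
        System.out.println(isValid("IJabc"));          // true: ASCII lowercase
        System.out.println(isValid("IJ\u00e9\u00f1")); // true: é and ñ are in category Ll
        System.out.println(isValid("IJP"));            // false: P is uppercase
        System.out.println(isValid("IJaB"));           // false: B is uppercase
    }
}
```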

Answer 2

Score: 0

\\u0049 is a rather obscure way of writing I, don't you think? Why not just write pattern = "^IJ\\p{Ll}*"?

IJP does match though. Think about it. You're asking for an I, then a J, then 0 or more lowercase letters. Which is right there: An I, a J, and 0 lowercase letters.
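
To make that concrete, a quick sketch that prints what find() actually matched. FindPrefixDemo and matchedPart are names invented for this example. It uses Matcher.group() to return the matched substring, or null when nothing matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class FindPrefixDemo {
    // Returns the substring that find() matched, or null if nothing matched.
    static String matchedPart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Without $, the match simply stops before the uppercase letter.
        System.out.println(matchedPart("^IJ\\p{Ll}*", "IJP"));   // IJ
        System.out.println(matchedPart("^IJ\\p{Ll}*", "IJabP")); // IJab
        // With $, the match has to reach the end of the input, so it fails.
        System.out.println(matchedPart("^IJ\\p{Ll}*$", "IJP"));  // null
    }
}
```

So find() returning true for "IJP" is not a regex bug: the match is just the prefix "IJ".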

Either use matches() instead of find() (which asks: Does the ENTIRE string match the regexp, vs. 'is there some substring that does') or, as you've already thrown the ^ in there, toss a $ at the end to match.
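
A rough side-by-side of the two options, in case it helps. MatchesVsFindDemo and its two helper methods are made-up names. With matches() the ^ and $ are not needed, because the whole input has to match anyway.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class MatchesVsFindDemo {
    static final Pattern UNANCHORED = Pattern.compile("IJ\\p{Ll}*");
    static final Pattern ANCHORED = Pattern.compile("^IJ\\p{Ll}*$");

    // Option 1: matches() requires the entire input to match the pattern.
    static boolean wholeStringMatches(String input) {
        return UNANCHORED.matcher(input).matches();
    }

    // Option 2: find() with explicit ^ and $ anchors.
    static boolean anchoredFind(String input) {
        return ANCHORED.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeStringMatches("IJP"));    // false
        System.out.println(wholeStringMatches("IJabc"));  // true
        System.out.println(wholeStringMatches("xIJabc")); // false
        System.out.println(anchoredFind("IJP"));          // false
        System.out.println(anchoredFind("IJabc"));        // true
    }
}
```

One difference to be aware of: by default, $ in Java also matches just before a final line terminator, so anchoredFind("IJabc\n") returns true while wholeStringMatches("IJabc\n") returns false. If trailing newlines matter for your input, matches() is the stricter choice.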

huangapple
  • Posted on September 4, 2020 at 02:54:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63729991.html