How to match special characters such as ^ and \ without using indexOf()?

Question

I am trying to check whether a string is a valid "unit" in my program.

A valid unit consists of:

At least one letter, optionally followed by / or -; if the / or - is present, it must be followed by at least one more letter.

Current attempt: str.matches("[a-zA-z]*[-\\/]?[a-zA-z]*")

Example:

Valid -> abc/abc
Valid -> abc
Invalid -> abc^abc
Invalid -> abc\abc

The invalid cases are showing as valid.

Answer 1

Score: 1

Try this:

str.matches("(?i)[a-z]+([-/][a-z]+)?")

Notes:

  • (?i) switches on case-insensitive matching, so you can write [a-z] instead of [a-zA-Z], which is easier to read
  • + means "one or more"
  • you never need to escape / in a Java regex. It has no special regex meaning; it is just an ordinary character
  • (...)? means everything inside the parentheses is optional. That's how you make the "dash/slash plus letter(s)" expression optional as a whole
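
As a rough check of the notes above, here is a minimal sketch that runs the pattern from this answer against a few inputs. The class name UnitPatternDemo, the helper isValidUnit, and the extra inputs ("ABC-def", "abc/", "/abc") are illustrative assumptions, not part of the original answer.

```java
class UnitPatternDemo {
    // Pattern from this answer: letters, then optionally one '-' or '/' followed by more letters.
    static final String UNIT = "(?i)[a-z]+([-/][a-z]+)?";

    // Hypothetical helper: String.matches() must match the whole string,
    // so no ^ or $ anchors are needed in the pattern.
    static boolean isValidUnit(String s) {
        return s.matches(UNIT);
    }

    public static void main(String[] args) {
        String[] inputs = {"abc", "ABC-def", "abc/abc", "abc/", "/abc", "abc^abc", "abc\\abc"};
        for (String input : inputs) {
            System.out.println((isValidUnit(input) ? "Valid" : "Invalid") + " -> " + input);
        }
    }
}
```

Only the first three inputs should be reported as valid: a trailing or leading separator fails because the optional group requires letters on both sides, and ^ and \ are not in the [-/] class.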

Answer 2

Score: 0

Try the following Regex:

"[a-zA-Z]+[-/][a-zA-Z]+"

Answer 3

Score: 0

Try this to require exactly one '-' or '/' character with at least one letter on each side of it:

str.matches("[a-zA-Z]{1,}[-\\/]{1}[a-zA-Z]{1,}")

Answer 4

Score: 0

Answer to updated question (version 3):

str.matches("[a-zA-Z]+(?:[-/][a-zA-Z]+)?")

Explanation

> At least one letter

[a-zA-Z]+

> optional ... if you did the optional / or -

(?:X)? or (X)?

Both work, but the first is non-capturing, and since you don't need capturing, it is the more appropriate one to use.

> / or -

[-/]

> at least one more letter

[a-zA-Z]+
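
To illustrate the difference between (?:X)? and (X)? mentioned above, here is a small sketch using java.util.regex.Pattern and Matcher. The class name GroupDemo, the field names, and the input "abc/def" are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Same rule written twice: once with a capturing group, once with a non-capturing group.
    static final Pattern CAPTURING = Pattern.compile("[a-zA-Z]+([-/][a-zA-Z]+)?");
    static final Pattern NON_CAPTURING = Pattern.compile("[a-zA-Z]+(?:[-/][a-zA-Z]+)?");

    public static void main(String[] args) {
        Matcher c = CAPTURING.matcher("abc/def");
        Matcher n = NON_CAPTURING.matcher("abc/def");

        // Both patterns accept exactly the same strings.
        System.out.println(c.matches() + " " + n.matches()); // true true

        // Only the capturing version records the optional part as group 1.
        System.out.println(c.groupCount()); // 1
        System.out.println(c.group(1));     // /def
        System.out.println(n.groupCount()); // 0
    }
}
```

Since String.matches() only returns a boolean, the captured text is never used here, which is why the non-capturing form is the better fit.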

Test

public static void main(String[] args) {
    test("abc/abc", "abc", "abc^abc", "abc\\abc");
}
static void test(String... inputs) {
    for (String input : inputs)
        System.out.println((input.matches("[a-zA-Z]+(?:[-/][a-zA-Z]+)?") ? "Valid" : "Invalid") + " -> " + input);
}

Output

Valid -> abc/abc
Valid -> abc
Invalid -> abc^abc
Invalid -> abc\abc
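
For contrast, the attempt in the question accepts abc^abc and abc\abc. A sketch of one reason: its character class is written [a-zA-z], and the range A-z (lowercase z) also covers the six ASCII characters between Z and a, which are [ \ ] ^ _ and `. The class name RangePitfallDemo and the helper classMatches are hypothetical names used only for this illustration.

```java
class RangePitfallDemo {
    // The attempt from the question, copied as written (note the range A-z).
    static final String ATTEMPT = "[a-zA-z]*[-\\/]?[a-zA-z]*";

    // Hypothetical helper: does the single character c match the given character class?
    static boolean classMatches(String charClass, char c) {
        return String.valueOf(c).matches(charClass);
    }

    public static void main(String[] args) {
        // The six ASCII characters that sit between 'Z' (90) and 'a' (97).
        for (char c : new char[] {'[', '\\', ']', '^', '_', '`'}) {
            System.out.println(c + " in [A-z]: " + classMatches("[A-z]", c)
                    + ", in [A-Z]: " + classMatches("[A-Z]", c));
        }
        // Because ^ and \ count as "letters" for [a-zA-z], the attempt accepts both strings.
        System.out.println("abc^abc".matches(ATTEMPT));  // true
        System.out.println("abc\\abc".matches(ATTEMPT)); // true
    }
}
```

Writing the range as A-Z (uppercase Z) removes those six characters from the class.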

From comment:

> That's better, but it's failing when I try: abc\abc or abc^abc as the string. I think it has to do with those characters being used by regex as well.

Well, those are supposed to fail according to the listed rule.

If you meant that you want to modify the rule to also allow \ and ^, then you need to add them inside the [ ].

But, you need to be aware of special characters inside [ ]:

  • - is special, indicating a range, e.g. a-z, but it is not special when it is the first or last character in the class. Anywhere else, escape it with \, written as \\ in a Java string literal.

  • ^ is special only when it is the first character, where it negates the class, e.g. [^a-z] means "not a lowercase letter".

  • \ is always special, since it is the escape character, so you need to escape it as \\, which doubles again to \\\\ in a Java string literal.

Which means:

str.matches("[a-zA-Z]+(?:[-/^\\\\][a-zA-Z]+)?")

Test Output

Valid -> abc/abc
Valid -> abc
Valid -> abc^abc
Valid -> abc\abc
Invalid -> abc#abc
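
The following sketch reruns the extended pattern on the inputs listed in the test output and then exercises the character-class rules described above. The class name CharClassDemo, the helper isValidUnit, and the small classes [/-^], [/\-^] and [^a-z] are assumptions chosen for illustration.

```java
class CharClassDemo {
    // Extended pattern from this answer: '-', '/', '^' and '\' are all allowed as the separator.
    static final String EXTENDED = "[a-zA-Z]+(?:[-/^\\\\][a-zA-Z]+)?";

    // Hypothetical helper wrapping the whole-string match.
    static boolean isValidUnit(String s) {
        return s.matches(EXTENDED);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"abc/abc", "abc", "abc^abc", "abc\\abc", "abc#abc"}) {
            System.out.println((isValidUnit(input) ? "Valid" : "Invalid") + " -> " + input);
        }

        // An unescaped '-' in the middle forms a range: [/-^] spans '/' (47) to '^' (94),
        // which includes the digits.
        System.out.println("5".matches("[/-^]"));    // true
        // Escaped, the '-' is a literal dash and the class holds only '/', '-' and '^'.
        System.out.println("5".matches("[/\\-^]"));  // false
        System.out.println("-".matches("[/\\-^]"));  // true

        // '^' in first position negates the class.
        System.out.println("^".matches("[^a-z]"));   // true: '^' is not a lowercase letter
        System.out.println("a".matches("[^a-z]"));   // false
    }
}
```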

huangapple
  • Posted on April 8, 2020 at 05:57:08
  • If you repost this article, please keep the link: https://go.coder-hub.com/61090023.html