Improve a regex to include coupled-dashes matching

Question

I have this regex: [1-9]([A-Z&&[^SLOIBZ]][0-9A-Z&&[^SLOIBZ]]\d){2}[A-Z&&[^SLOIBZ]]{2}\d{2}, which I got from this site and that works as described here.

The problem is that this regex does not support dashes, and I want to modify it accordingly. As of now, this regex matches "1EG4TE5MK72" but not "1EG4-TE5-MK72".

I improved the original regex to: [1-9]([A-Z&&[^SLOIBZ]][0-9A-Z&&[^SLOIBZ]]\\d(-)?){2}[A-Z&&[^SLOIBZ]]{2}\\d{2} (added (-)?) and now both strings are matched. But also "1EG4TE5-MK72" gets matched, whereas it should be invalid (not matched).

In other words: only 0 or 2 dashes are allowed.

Can someone help me with that?

Answer 1

Score: 2

You can unroll the repeated group and capture the first optional dash, then use a backreference so that the second dash is required exactly when the first one is present (and forbidden when it is absent).

[1-9][A-Z&&[^SLOIBZ]][0-9A-Z&&[^SLOIBZ]]\d(-?)[A-Z&&[^SLOIBZ]][0-9A-Z&&[^SLOIBZ]]\d\1[A-Z&&[^SLOIBZ]]{2}\d{2}

Regex Explanation:

  • [1-9]: a digit in the range 1-9
  • [A-Z&&[^SLOIBZ]][0-9A-Z&&[^SLOIBZ]]\d: the next three characters (positions 2-4)
  • (-?): optional dash, in capturing group 1
  • [A-Z&&[^SLOIBZ]][0-9A-Z&&[^SLOIBZ]]\d: the middle three characters (positions 5-7)
  • \1: backreference to capturing group 1, so it matches a dash only if the first dash was present
  • [A-Z&&[^SLOIBZ]]{2}\d{2}: the last four characters (positions 8-11)

Check the demo here.
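
As a quick check of the backreference approach, here is a minimal Java sketch. The class name MbiRegexDemo, the method isValid, and the constants LETTER and ALNUM are made up for this illustration; only the pattern itself comes from the answer above.

```java
import java.util.regex.Pattern;

public class MbiRegexDemo {
    // Building blocks, named here only for readability (hypothetical names).
    private static final String LETTER = "[A-Z&&[^SLOIBZ]]";
    private static final String ALNUM = "[0-9A-Z&&[^SLOIBZ]]";

    // Group 1 captures the optional first dash; \1 must then repeat
    // whatever group 1 matched (a dash or the empty string).
    private static final Pattern MBI = Pattern.compile(
            "[1-9]" + LETTER + ALNUM + "\\d"
            + "(-?)" + LETTER + ALNUM + "\\d"
            + "\\1" + LETTER + "{2}\\d{2}");

    static boolean isValid(String s) {
        // matches() requires the whole input to match the pattern.
        return MBI.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1EG4TE5MK72", "1EG4-TE5-MK72", "1EG4TE5-MK72"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

With this pattern, the first two sample strings from the question match and "1EG4TE5-MK72" does not, because the backreference has to repeat the empty first group.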

Answer 2

Score: 0

If you're not looking for a regular-expression pattern, you can try the following method.

This follows the pattern, as specified in the PDF file provided.

Pos.    1   2   3   4   5   6   7   8   9  10  11
Type    C   A  AN   N   A  AN   N   A   A   N   N

C  – Numeric 1 thru 9
N  – Numeric 0 thru 9 
AN – Either A or N 
A  – Alphabetic Character (A...Z); Excluding (S, L, O, I, B, Z)

Additionally, the following conditions are evaluated.

- The MBI has 11 characters.
- The MBI’s 2nd, 5th, 8th, and 9th characters are always letters.
- Characters 1, 4, 7, 10, and 11 are always numbers.
- The 3rd and 6th characters are letters or numbers.
- We don’t use dashes in the MBI.

I was unsure how to handle lowercase input, so I settled on requiring uppercase.

Dashes are optional. If any are present, there must be exactly two, and their positions are not checked.

I did not run any extensive tests, although "1EG4-TE5-MK73" will return true, and "ABCD-EFG-HIJK" will return false.

The evaluation is procedural: it returns false as soon as one of the checks fails, and returns true only if every check passes.

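/**
 * Validates an MBI, with or without dash separators.
 * @param string case-sensitive; letters must be uppercase A-Z
 */
boolean match(String string) {
    // 11 characters, plus at most 2 dashes.
    if (string.length() < 11 || string.length() > 13) return false;
    if (string.contains("-")) {
        // The limit -1 keeps trailing empty parts, so 3 parts means exactly 2 dashes.
        if (string.split("-", -1).length != 3) return false;
        string = string.replace("-", "");
    }
    // Without this check, short input such as "1EG4-TE5-MK" or
    // over-long input such as "1EG4TE5MK7211" would be accepted.
    if (string.length() != 11) return false;
    String value;
    int position = 1;
    for (char character : string.toCharArray()) {
        value = String.valueOf(character);
        switch (position) {
            // Type A: letters, excluding S, L, O, I, B, Z.
            case 2, 5, 8, 9 -> {
                if (!value.matches("[A-Z&&[^SLOIBZ]]")) return false;
            }
            // Type C (position 1, digits 1-9) and type N (digits 0-9).
            case 1, 4, 7, 10, 11 -> {
                if (position == 1 && !value.matches("[1-9]")) return false;
                else if (!value.matches("\\d")) return false;
            }
            // Type AN: an allowed letter or a digit.
            case 3, 6 -> {
                if (!value.matches("[A-Z\\d&&[^SLOIBZ]]")) return false;
            }
        }
        position++;
    }
    return true;
}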
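
For comparison, the same rules (zero or exactly two dashes, positions not checked) can be sketched more compactly by counting the dashes and reusing the original regex from the question. This is an alternative illustration, not the method above; the class name MbiDashCheck and the method matchCompact are hypothetical.

```java
import java.util.regex.Pattern;

public class MbiDashCheck {
    // The dash-free pattern from the question, as a Java string literal.
    private static final Pattern PLAIN = Pattern.compile(
            "[1-9]([A-Z&&[^SLOIBZ]][0-9A-Z&&[^SLOIBZ]]\\d){2}[A-Z&&[^SLOIBZ]]{2}\\d{2}");

    static boolean matchCompact(String s) {
        // Count the dashes; only 0 or 2 are allowed.
        long dashes = s.chars().filter(c -> c == '-').count();
        if (dashes != 0 && dashes != 2) return false;
        // Validate the remaining 11 characters against the plain pattern.
        return PLAIN.matcher(s.replace("-", "")).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchCompact("1EG4-TE5-MK73"));
        System.out.println(matchCompact("ABCD-EFG-HIJK"));
    }
}
```

Like the procedural version, this sketch accepts two dashes anywhere in the string. If the dashes must sit after the 4th and 7th characters, a position-aware pattern is needed instead.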

huangapple
  • Posted on June 1, 2023, 00:15:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76375497.html