How can one make a conditional regex replacement in Matcher.replaceAll()?

Question

I have a file path that I break into groups using a regex. The groups are separated by underscores (_), and many of them are optional. The code looks like this:

Pattern pattern = Pattern.compile(regEx);
Matcher matcher = pattern.matcher(path.toLowerCase());

if (matcher.find())
{
    String type = matcher.replaceAll("$2_$3");
    return type.replaceAll("-","_");
}

So, let's say I have a path: hello_world_20190221.xml.json.
Let us also say this is my regex:

(.*?)_(ce)?_?(.*)_([0-9]{8})\.([A-Za-z]{1,20})?\.([A-Za-z]{1,20})?

Currently, the matcher.replaceAll("$2_$3") statement returns type=_world. I need the replacement string to carry conditional logic: if $2 is empty, return only $3; otherwise, return $2_$3. Does replaceAll() support conditional logic in the replacement string? I have not had any success with it.
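For reference, the behaviour described above can be reproduced with a minimal sketch like the one below. The class name ReplaceAllRepro and the method extractType are hypothetical, and returning null when nothing matches is an assumption, since the snippet in the question does not show that branch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceAllRepro {
    // Regex from the question; group 2 is the optional "ce" token.
    static final String REGEX =
        "(.*?)_(ce)?_?(.*)_([0-9]{8})\\.([A-Za-z]{1,20})?\\.([A-Za-z]{1,20})?";

    // Hypothetical wrapper around the snippet in the question.
    static String extractType(String path) {
        Matcher matcher = Pattern.compile(REGEX).matcher(path.toLowerCase());
        if (matcher.find()) {
            // A group that did not participate in the match contributes
            // nothing to the replacement, so only the literal "_" is left.
            String type = matcher.replaceAll("$2_$3");
            return type.replaceAll("-", "_");
        }
        return null; // assumption: the no-match branch is not shown in the question
    }

    public static void main(String[] args) {
        System.out.println(extractType("hello_world_20190221.xml.json"));    // _world
        System.out.println(extractType("hello_ce_world_20190221.xml.json")); // ce_world
    }
}
```

The leading underscore appears because the literal "_" in the replacement string is always emitted, whether or not group 2 matched.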

The purpose of this particular piece of code is to consume a path, a regex, and a replacement expression for the replaceAll() function. It returns a group or some concatenation of groups. I cannot change this code to include if statements and the like, but I think I can change it in any other way. So, if using a different function altogether will do what I want, please suggest one. This piece of code has many different scenarios running through it, so I cannot plan for just one scenario (such as adding an if statement to check whether group 2 is empty). All conditional logic must come through one of those three strings: the path, the regex, or the replacement string. I can change the function altogether, but I cannot make it explicitly (in the source code) check whether this or that condition is met.

Am I making any sense?

Answer 1

Score: 2

You can modify your regex to combine the optional $2 and $3 into a single capture group, like this:

^([^_]*)_((?:ce_)?[^_]*)_([0-9]{8})\.([A-Za-z]{1,20})?\.([A-Za-z]{1,20})?

Then you just need to use:

if (matcher.find()) {
    String type = matcher.replaceAll("$2");
    return type.replaceAll("-","_");
}
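A self-contained sketch of this first approach follows, so it can be run against both sample paths. The class name MergedGroupExample and the method extractType are hypothetical, and returning null when nothing matches is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MergedGroupExample {
    // "ce_" is folded into group 2 as an optional, non-capturing prefix,
    // so group 2 is either "ce_<name>" or just "<name>".
    static final Pattern PATTERN = Pattern.compile(
        "^([^_]*)_((?:ce_)?[^_]*)_([0-9]{8})\\.([A-Za-z]{1,20})?\\.([A-Za-z]{1,20})?");

    // Hypothetical helper; returns null when the path does not match (assumption).
    static String extractType(String path) {
        Matcher matcher = PATTERN.matcher(path.toLowerCase());
        if (!matcher.find()) {
            return null;
        }
        return matcher.replaceAll("$2").replaceAll("-", "_");
    }

    public static void main(String[] args) {
        System.out.println(extractType("hello_world_20190221.xml.json"));    // world
        System.out.println(extractType("hello_ce_world_20190221.xml.json")); // ce_world
    }
}
```

One trade-off to be aware of: [^_]* cannot match an underscore, so unlike the (.*) in the question's regex, this pattern does not match paths whose name part contains extra underscores, such as hello_big_world_20190221.xml.json.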


Java code based solution:

> need the replacement string to have condition logic. If $2 is empty, return only $3. Otherwise, return $2_$3

Since you already have a matcher, you can do this:

if (matcher.find()) {
    // group(2) is null when the optional "ce" did not participate in the match
    String type = (matcher.group(2) == null || matcher.group(2).isEmpty() ?
       matcher.replaceAll("$3") :
       matcher.replaceAll("$2_$3"));
    return type.replaceAll("-","_");
}
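On Java 9 or later, the same condition can also live inside a replacer function passed to Matcher.replaceAll(Function<MatchResult, String>). This is an alternative sketch, not part of the original answer. The class name FunctionalReplaceExample and the method extractType are hypothetical, and returning null when nothing matches is an assumption.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FunctionalReplaceExample {
    // Regex from the question, unchanged.
    static final Pattern PATTERN = Pattern.compile(
        "(.*?)_(ce)?_?(.*)_([0-9]{8})\\.([A-Za-z]{1,20})?\\.([A-Za-z]{1,20})?");

    // Hypothetical helper; returns null when the path does not match (assumption).
    static String extractType(String path) {
        Matcher matcher = PATTERN.matcher(path.toLowerCase());
        if (!matcher.find()) {
            return null;
        }
        // The replacer runs once per match and decides what to emit.
        String type = matcher.replaceAll((MatchResult m) -> {
            String prefix = m.group(2); // null when "ce" is absent
            String joined = (prefix == null || prefix.isEmpty())
                ? m.group(3)
                : prefix + "_" + m.group(3);
            // The returned string is still parsed for "$" and "\", so quote it.
            return Matcher.quoteReplacement(joined);
        });
        return type.replaceAll("-", "_");
    }

    public static void main(String[] args) {
        System.out.println(extractType("hello_world_20190221.xml.json"));    // world
        System.out.println(extractType("hello_ce_world_20190221.xml.json")); // ce_world
    }
}
```

Like the ternary version, this keeps the condition in Java source, so it only helps where the code itself may contain such a check.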

Answer 2

Score: 2

Simply use (ce_?)?(.*) and replace with $2$3. If $2 is empty, $2$3 yields just $3; otherwise $2 includes its own trailing underscore, so the result looks like the old $2_$3:

(.*?)_(ce_?)?(.*)_([0-9]{8})\.([A-Za-z]{1,20})?\.([A-Za-z]{1,20})?

hello_ce_world_20190221.xml.json // ce_world
hello_world_20190221.xml.json    // world
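The two results above can be checked with a small sketch that keeps the shape described in the question: the function receives only a path, a regex, and a replacement string, and contains no group-specific condition. The class name OptionalSeparatorExample and the method extractType are hypothetical, and returning null when nothing matches is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalSeparatorExample {
    // Group 2 now owns the separator: "ce_" when present, unmatched otherwise.
    static final String REGEX =
        "(.*?)_(ce_?)?(.*)_([0-9]{8})\\.([A-Za-z]{1,20})?\\.([A-Za-z]{1,20})?";

    // Hypothetical generic helper driven entirely by its three string arguments.
    static String extractType(String path, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(path.toLowerCase());
        if (!matcher.find()) {
            return null; // assumption: the no-match branch is not shown in the question
        }
        return matcher.replaceAll(replacement).replaceAll("-", "_");
    }

    public static void main(String[] args) {
        System.out.println(extractType("hello_ce_world_20190221.xml.json", REGEX, "$2$3")); // ce_world
        System.out.println(extractType("hello_world_20190221.xml.json", REGEX, "$2$3"));    // world
    }
}
```

Because the underscore is captured together with "ce", the replacement string needs no literal separator, and an unmatched group 2 simply contributes nothing.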

huangapple
  • Posted on October 26, 2020 at 15:56:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64533259.html