Splitting string by new line with a condition

Question


I am trying to split a String by \n only when it's not in my "action block".
Here is an example text: message\n [testing](hover: actions!\nnew line!) more\nmessage. I want to split whenever the \n is not inside [](this \n should be ignored). I made a regex for it, which you can see here: https://regex101.com/r/RpaQ2h/1/. In that example it seems to work correctly, so I followed up with an implementation in Java:

final List<String> lines = new ArrayList<>();
final Matcher matcher = NEW_LINE_ACTION.matcher(message);

String rest = message;
int start = 0;
while (matcher.find()) {
    if (matcher.group("action") != null) continue;

    final String before = message.substring(start, matcher.start());
    if (!before.isEmpty()) lines.add(before.trim());

    start = matcher.end();
    rest = message.substring(start);
}

if (!rest.isEmpty()) lines.add(rest.trim());

return lines;

This should ignore any \n inside the pattern shown above. However, it never matches the "action" group: once the pattern is used in Java and a \n is present, the group never matches. I am a bit confused as to why, since it worked perfectly on regex101.

Answer 1

Score: 2


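Instead of checking whether the match is in the action group, you can simply use a regex replacement with $1 (the first capture group): an action block is replaced by itself, while a matched break leaves group 1 empty and is therefore replaced by nothing.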

I also changed your regex to (?<action>\[[^\]]*]\([^)]*\))|(?<break>\\n), because [^\]]* consumes everything up to the closing bracket in one pass, whereas .*? advances one character at a time through backtracking and takes more steps. I did the same with [^)]*.
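
A side effect of the negated character classes that may matter for the question: in Java, `.` does not match line terminators unless `Pattern.DOTALL` is set, while `[^)]` does match them. The sketch below illustrates this under two assumptions: that the message contains real line feed characters, and that the original pattern used lazy dots (the `LAZY_DOT` pattern is a guess, since the regex101 pattern is not reproduced here). The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class DotVersusNegatedClass {

    // Assumed reconstruction of a lazy-dot variant; the original regex101 pattern is not shown here.
    static final Pattern LAZY_DOT = Pattern.compile("\\[.*?\\]\\(.*?\\)");

    // The action part of the pattern from this answer.
    static final Pattern NEGATED = Pattern.compile("\\[[^\\]]*\\]\\([^)]*\\)");

    // True if the pattern finds an action block anywhere in the text.
    static boolean findsAction(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        // "\n" in Java source is a real line feed character, not backslash + n.
        String text = "[testing](hover: actions!\nnew line!)";

        // '.' stops at the line feed, so the lazy-dot variant cannot reach ')'.
        System.out.println(findsAction(LAZY_DOT, text));

        // A negated class matches any character except the excluded one, line feeds included.
        System.out.println(findsAction(NEGATED, text));

        // With DOTALL, '.' matches line terminators as well.
        Pattern dotAll = Pattern.compile(LAZY_DOT.pattern(), Pattern.DOTALL);
        System.out.println(findsAction(dotAll, text));
    }
}
```

If the message really does contain line feeds, this would explain why the "action" group stopped matching in Java even though the pattern worked on regex101 against text with a literal backslash and n.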

Working code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {

    public static void main(String[] args) {

        // Either an action block [text](action), or a literal backslash followed by n.
        final String regex = "(?<action>\\[[^\\]]*\\]\\([^)]*\\))|(?<break>\\\\n)";
        // "\\n" in Java source is the two characters \ and n, not a line feed.
        final String string = "message\\n [testing test](hover: actions!\\nnew line!) more\\nmessage";

        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);

        // $1 is the action group: actions are kept as-is, breaks are replaced by nothing.
        final String result = matcher.replaceAll("$1");

        System.out.println(result);

    }

}
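
The replacement above removes the breaks but does not produce a list of lines. If a list is what is needed, the loop from the question can be kept and combined with the negated-class pattern. This is only a sketch: it assumes the message contains real line feeds (so the break group uses the regex \n rather than a literal backslash and n), it reuses the NEW_LINE_ACTION name from the question, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ActionAwareSplitter {

    // Same alternation as in the answer, except that the break group matches a
    // real line feed (regex \n) instead of a literal backslash followed by n.
    private static final Pattern NEW_LINE_ACTION =
            Pattern.compile("(?<action>\\[[^\\]]*\\]\\([^)]*\\))|(?<break>\\n)");

    static List<String> splitLines(String message) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = NEW_LINE_ACTION.matcher(message);
        int start = 0;
        while (matcher.find()) {
            // An action block is consumed whole, so line feeds inside it are never seen as breaks.
            if (matcher.group("action") != null) continue;

            // Everything since the previous break forms one line.
            String before = message.substring(start, matcher.start()).trim();
            if (!before.isEmpty()) lines.add(before);
            start = matcher.end();
        }
        // Whatever follows the last break is the final line.
        String rest = message.substring(start).trim();
        if (!rest.isEmpty()) lines.add(rest);
        return lines;
    }

    public static void main(String[] args) {
        String message = "message\n [testing](hover: actions!\nnew line!) more\nmessage";
        for (String line : splitLines(message)) {
            System.out.println("<" + line + ">");
        }
    }
}
```

For the example message this yields three entries, and the second one still contains the line feed that sits inside the action block.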

huangapple
  • Published on August 21, 2020, 01:03:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63509942.html