How to split a String with two delimiters and keep only one of them?

Question


I want to split a String at punctuation marks and whitespace, but keep the punctuation marks. For example:

String example = "How are you? I am fine!";

I want to get as a result

["How","are","you","?","I","am","fine","!"]

but instead I get

["how"," ","are"," ","you"," ","?"," ","i"," ","am"," ","fine"," ","!"].

What I used was example.toLowerCase().trim().split("(?<=\\b|[^\\p{L}])");

Answer 1

Score: 2


Why are you doing toLowerCase()? This already messes up your expected result. And why the trim() on the full string?

Doing this with a single split call is probably not straightforward.

An alternative would be to just filter out the unwanted entries:

import java.util.Arrays;
import java.util.function.Predicate;
import java.util.regex.Pattern;

String example = "How are you? I am fine!";

// Split at every word boundary, strip each piece, then drop the empty ones
Pattern pattern = Pattern.compile("\\b");
String[] result = pattern.splitAsStream(example)
    .map(String::strip)   // "? " becomes "?", " " becomes ""
    .filter(Predicate.not(String::isBlank))
    .toArray(String[]::new);

System.out.println(Arrays.toString(result));

Output:

[How, are, you, ?, I, am, fine, !]

Reacting to your comment about wanting [How,are,you,?,I,am,fine,!] as output: simply don't print with Arrays.toString; build the string yourself. After stripping, the array elements contain no whitespace; the spaces in the output above come from the ", " separator that Arrays.toString inserts.

System.out.println("[" + String.join(",", result) + "]");
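For reference, here is a minimal self-contained sketch of the stream-based approach, assuming Java 11 or later (for Predicate.not and String.strip). The class name Main and the helper tokenize are illustrative names, not part of the answer. Each piece is stripped before filtering, so a piece such as "? " does not carry its trailing space into the result.

```java
import java.util.Arrays;
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class Main {
    // Zero-width pattern: matches at every word boundary.
    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\b");

    // Hypothetical helper: split at word boundaries, strip each piece,
    // and keep only the pieces that are non-empty after stripping.
    static String[] tokenize(String text) {
        return WORD_BOUNDARY.splitAsStream(text)
                .map(String::strip)
                .filter(Predicate.not(String::isEmpty))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] result = tokenize("How are you? I am fine!");
        // Arrays.toString separates elements with ", "
        System.out.println(Arrays.toString(result));
        // String.join gives full control over the separator
        System.out.println("[" + String.join(",", result) + "]");
    }
}
```

The first line printed is [How, are, you, ?, I, am, fine, !] and the second is [How,are,you,?,I,am,fine,!].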

Answer 2

Score: 1


You can do it as follows:

import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        String example = "How are you? I am fine!";
        String[] arr = example.split("\\s+|\\b(?=\\p{Punct})");
        System.out.println(Arrays.toString(arr));
    }
}

Output:

[How, are, you, ?, I, am, fine, !]

Explanation of the regex:

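  1. \\s+ matches one or more whitespace characters; the matched whitespace is discarded by the split.
  2. \\b matches a word boundary (a zero-width position).
  3. (?=\\p{Punct}) is a positive lookahead asserting that the next character is punctuation, without consuming it.
  4. | is the alternation (OR) between the two cases.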
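To see how the two alternatives cooperate, the sketch below applies the same pattern to the original sentence and to a second, made-up input containing a comma. The helper name tokenize is illustrative only.

```java
import java.util.Arrays;

public class Main {
    // Split on runs of whitespace (consumed), or at the zero-width position
    // between a word character and a following punctuation character.
    static String[] tokenize(String text) {
        return text.split("\\s+|\\b(?=\\p{Punct})");
    }

    public static void main(String[] args) {
        // prints [How, are, you, ?, I, am, fine, !]
        System.out.println(Arrays.toString(tokenize("How are you? I am fine!")));
        // prints [Hello, ,, world, !]
        System.out.println(Arrays.toString(tokenize("Hello, world!")));
    }
}
```

Note that \\b requires a word character on one side, so punctuation that directly follows other punctuation is not separated: "Really?!" yields the tokens "Really" and "?!".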

huangapple
  • Posted on September 8, 2020, 15:49:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63789403.html