How to split a string (based on a variety of delimiters) but without keeping whitespace?
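To tokenize the string without keeping any whitespace, you can keep the lookaround split, then trim each token and drop the ones that end up empty:

    List<String> tokens = Arrays.stream("((!A1)&(B2|C3))".split("((?<=[!&()|])|(?=[!&()|]))"))
                              .map(String::trim)
                              .filter(token -> !token.isEmpty())
                              .collect(Collectors.toList());

Here split() cuts the string before and after every operator, map(String::trim) strips the padding around each token, and filter() discards the tokens that are empty after trimming. The result is a list of tokens with no whitespace-only entries and no padding around the variable names.
Question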
I have Java strings which are boolean expressions with parentheses, &, |, and ! as operators, and I want to split them into tokens.  For example:
((!A1)&(B2|C3)) should become "(","(","!","A1",")","&","(","B2","|","C3",")",")"
Following this answer I found that I can use Java's String.split() with a regex that includes lookahead and lookbehind clauses:
String[] tokens = "((!A1)&(B2|C3))".split("((?<=[!&()|])|(?=[!&()|]))");
My only problem is that whitespace will be included in the list of tokens.  For example if I were to write the expression as ( ( !A1 ) & ( B2 | C3 ) ) then my split() would produce at least four strings like " " and there'd be padding around my variables (e.g. " A1 ").
How can I modify this split expression and regex to tokenize the string but not keep any of the whitespace?
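To make the problem concrete, here is a minimal sketch that runs the split from the question on the spaced-out expression and prints each token in brackets. The class and method names (`SplitWhitespaceDemo`, `rawSplit`) are only illustrative and do not come from the question.

```java
import java.util.Arrays;
import java.util.List;

class SplitWhitespaceDemo {
    // Same lookaround idea as in the question: split before and after every operator.
    static final String BOUNDARY = "(?<=[!&()|])|(?=[!&()|])";

    // Hypothetical helper: returns the raw tokens, whitespace included.
    static List<String> rawSplit(String expression) {
        return Arrays.asList(expression.split(BOUNDARY));
    }

    public static void main(String[] args) {
        // Brackets make whitespace-only and padded tokens visible.
        for (String token : rawSplit("( ( !A1 ) & ( B2 | C3 ) )")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Whitespace-only tokens such as `[ ]` and padded ones such as `[ B2 ]` show up in the output, which is exactly what the question wants to avoid.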
Answer 1
Score: 1
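Instead of splitting, you can use this regex to match the tokens you want:

    [!&()|]|[^!&()|\h]+

Regex details:
- `[!&()|]`: Match a single `!`, `&`, `(`, `)`, or `|`
- `|`: OR
- `[^!&()|\h]+`: Match one or more characters that are not `!`, `&`, `(`, `)`, `|`, or horizontal whitespace

The `|` operator has to appear in both character classes; otherwise `B2|C3` written without spaces would come back as a single token.
Code:
```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Either one operator/parenthesis, or a run of non-operator, non-whitespace characters.
final String regex = "[!&()|]|[^!&()|\\h]+";
final String string = "((!A1)&( B2 | C3 ))";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
List<String> result = new ArrayList<>();
while (matcher.find()) {
    result.add(matcher.group(0)); // whitespace is never matched, so it never becomes a token
}
System.out.println(result); // [(, (, !, A1, ), &, (, B2, |, C3, ), )]
```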
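If you are on Java 9 or later, the same match-based idea can also be written as a stream. This is only a sketch under a few assumptions: the class and method names (`BooleanTokenizer`, `tokenize`) are made up for illustration, `|` is included in both character classes so that `B2|C3` separates even without spaces, and `\s` is used in place of `\h` so that newlines and other vertical whitespace are skipped as well.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class BooleanTokenizer {
    // One operator or parenthesis, or a run of characters that are neither operators nor whitespace.
    private static final Pattern TOKEN = Pattern.compile("[!&()|]|[^!&()|\\s]+");

    // Hypothetical helper: returns the tokens of a boolean expression, ignoring whitespace.
    static List<String> tokenize(String expression) {
        return TOKEN.matcher(expression)
                .results()               // Java 9+: stream of successive matches
                .map(MatchResult::group) // the matched text of each token
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(tokenize("((!A1)&(B2|C3))"));
        System.out.println(tokenize("( ( !A1 ) & ( B2 | C3 ) )"));
    }
}
```

Both calls print the same twelve tokens, so the spacing of the input no longer matters.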