August 6, 2020, 16:24:39

Split String with semicolon token in JAVA

Question

I'm having trouble splitting a String on semicolons.

The String is:

dsnSalarie;e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4; ;S21.G00.30.008;e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4;;;

The bolded semicolon (the middle one of the three trailing semicolons) is a token and must not be treated as a delimiter, so I tried changing the delimiter to a String like "<;>":

dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>S21.G00.30.008<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;>;<;>

With StringUtils.split or with StringTokenizer I can't get that semicolon back, even when using StringUtils.splitPreserveAllTokens.

The only workaround I have found is to surround the semicolon with spaces and trim it after splitting:

dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>S21.G00.30.008<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> ; <;>

Thanks for your ideas.

Answer 1

Score: 1

I am not quite sure I understand, but the following code:

public class Test {
    public static void main(String[] args) {
        String test = "dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>S21.G00.30.008<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;>;<;>";
        // String.split takes a regex; "<;>" contains no regex metacharacters,
        // so it is matched literally as one three-character delimiter.
        String[] split = test.split("<;>");
        for (String string : split) {
            System.out.println(string);
        }
    }
}

yields:

dsnSalarie
e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4
 
S21.G00.30.008
e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4
;

A tokenizer cannot tell identical characters apart, such as one semicolon from another. If the ; carries semantic meaning, you need a proper parser, such as one built with ANTLR, to define your language so that higher-level structure can be inferred from the tokens.
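
A possible explanation for what the question describes, offered as a sketch rather than a confirmed diagnosis: StringTokenizer treats its delimiter argument as a set of individual characters, so with "<;>" each of '<', ';' and '>' is a delimiter on its own and the lone ; token disappears. StringUtils.split(String, String) in Apache Commons Lang documents its second argument the same way (separatorChars). The example below uses only the JDK to contrast the two behaviours; the names SplitDemo, tokenizeByCharSet and splitOnWholeDelimiter are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

public class SplitDemo {

    // StringTokenizer treats every character of delimiterChars as its own
    // delimiter, so "<;>" means "split on '<' or ';' or '>'".
    static List<String> tokenizeByCharSet(String input, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiterChars);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    // String.split matches the delimiter as a whole. Pattern.quote makes it
    // literal, and the limit -1 keeps trailing empty fields.
    static List<String> splitOnWholeDelimiter(String input, String delimiter) {
        return Arrays.asList(input.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        String input = "dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>"
                + "S21.G00.30.008<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;>;<;>";

        // The lone ";" field is swallowed here.
        System.out.println(tokenizeByCharSet(input, "<;>"));

        // The lone ";" field survives here, followed by a trailing empty field.
        System.out.println(splitOnWholeDelimiter(input, "<;>"));
    }
}
```

If you would rather stay with Commons Lang, StringUtils.splitByWholeSeparator treats the separator as one string instead of a character set; I have not run it against this exact input, so treat that as a pointer rather than a tested fix.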
