August 6, 2020, 22:32:05

Java StringTokenizer - Problems with nextToken() usage with substring

Question

I have a text file that I must iterate through, and I want to move certain elements of each line into an ArrayList. Each line of the file has the format: number. String number. decimal decimal
Because the two numbers end with a full stop (.), I need to read them as Strings, remove the . using substring, and then convert them to a primitive data type (int or short).

Example line from the file:
294. ABC123 66. .00 .00

I get a string range error if I try this (temp is a String):

while(fileLine.hasMoreTokens())
{
    oneNumber = Integer.valueOf(fileLine.nextToken().substring(0,
                          fileLine.nextToken().indexOf('.')));
    twoString = fileLine.nextToken();
    threeNumber = Short.valueOf(fileLine.nextToken().substring(0,
                          fileLine.nextToken().indexOf('.')));
    temp = fileLine.nextToken();    //Handle attributes not required
    temp = fileLine.nextToken();    //Handle attributes not required
}

I believe this happens because the nextToken() call inside substring's parameters is confusing the StringTokenizer. So I fixed it like this:

while(fileLine.hasMoreTokens())
{
    temp = fileLine.nextToken();
    oneNumber = Integer.valueOf(temp.substring(0, temp.indexOf('.')));
    twoString = fileLine.nextToken();
    temp = fileLine.nextToken();
    threeNumber = Short.valueOf(temp.substring(0, temp.indexOf('.')));
    temp = fileLine.nextToken();
    temp = fileLine.nextToken();
}

While this works, it feels a bit redundant. Is there something I can try to make this cleaner, while retaining use of the StringTokenizer?

Answer 1

Score: 1

This is the intended behavior of .nextToken(): it returns the token and moves past the current token. When you use Integer.valueOf(fileLine.nextToken().substring(0, fileLine.nextToken().indexOf('.'))), you are calling .nextToken() twice, which means you are dealing with two distinct tokens. It has nothing to do with how String#substring works. You need to store the token in a variable if you need to perform additional operations on it. This exact same problem can also be caused by using BufferedReader#readLine twice when one should be storing the value.
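A minimal sketch of that behavior, using the sample line from the question. The class and method names (NextTokenDemo, firstTwoTokens, nestedCallsThrow, parseStoredToken) are made up for illustration; only StringTokenizer and the String methods are real API.

```java
import java.util.StringTokenizer;

class NextTokenDemo {
    // Two nextToken() calls in one expression return two different tokens.
    static String[] firstTwoTokens(String line) {
        StringTokenizer st = new StringTokenizer(line);
        return new String[] { st.nextToken(), st.nextToken() };
    }

    // Reproduces the expression from the question: substring() is called on the
    // first token, but indexOf('.') is evaluated on the second token.
    static boolean nestedCallsThrow(String line) {
        StringTokenizer st = new StringTokenizer(line);
        try {
            Integer.valueOf(st.nextToken().substring(0, st.nextToken().indexOf('.')));
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            // "ABC123" has no '.', so indexOf returns -1 and substring(0, -1) fails.
            return true;
        }
    }

    // Storing the token first means substring() and indexOf() see the same string.
    static int parseStoredToken(String line) {
        StringTokenizer st = new StringTokenizer(line);
        String token = st.nextToken();
        return Integer.parseInt(token.substring(0, token.indexOf('.')));
    }

    public static void main(String[] args) {
        String line = "294. ABC123 66. .00 .00";
        String[] tokens = firstTwoTokens(line);
        System.out.println(tokens[0] + " then " + tokens[1]);
        System.out.println("nested calls throw: " + nestedCallsThrow(line));
        System.out.println("stored token parses to: " + parseStoredToken(line));
    }
}
```

With the sample line, the nested version ends up calling substring(0, -1) on "294.", which is where the range error comes from.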

Answer 2

Score: 1

Yup. nextToken() is stateful: each call advances the tokenizer, so using it twice in a single expression consumes two tokens.

Your second snippet seems much easier to read to me, so I'm not sure what the problem is. Presumably you want your code to be more readable.

An easy fix is to make helper methods:

while (fileLine.hasMoreTokens()) {
    oneNumber = fetchHeadingNumber(fileLine);
    twoString = fileLine.nextToken();
    threeNumber = fetchHeadingNumber(fileLine);
    fileLine.nextToken(); // no need to assign it.
    fileLine.nextToken();
}

with this method:

int fetchHeadingNumber(StringTokenizer t) {
    String token = t.nextToken();
    return Integer.parseInt(token.substring(0, token.indexOf('.')));
}
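For reference, here is a self-contained sketch of that helper run against the sample line from the question. Two things are assumptions rather than part of the answer: the wrapper class name TokenHelpers, and the fetchHeadingShort variant. The variant would only be needed if threeNumber is declared as short (as the Short.valueOf call in the question suggests), because an int result cannot be assigned to a short without a cast.

```java
import java.util.StringTokenizer;

class TokenHelpers {
    // Consumes exactly one token such as "294." and parses the digits before the dot.
    static int fetchHeadingNumber(StringTokenizer t) {
        String token = t.nextToken();
        return Integer.parseInt(token.substring(0, token.indexOf('.')));
    }

    // Hypothetical variant for a short field; not part of the original answer.
    static short fetchHeadingShort(StringTokenizer t) {
        String token = t.nextToken();
        return Short.parseShort(token.substring(0, token.indexOf('.')));
    }

    public static void main(String[] args) {
        StringTokenizer fileLine = new StringTokenizer("294. ABC123 66. .00 .00");
        while (fileLine.hasMoreTokens()) {
            int oneNumber = fetchHeadingNumber(fileLine);
            String twoString = fileLine.nextToken();
            short threeNumber = fetchHeadingShort(fileLine);
            fileLine.nextToken(); // skip the two decimals
            fileLine.nextToken();
            System.out.println(oneNumber + " " + twoString + " " + threeNumber);
        }
    }
}
```

Each helper call consumes one token, so the loop body still reads exactly five tokens per line, the same as the working snippet in the question.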

You can go even further and make a class representing a line, which holds all the code needed to parse it (I made up the names; your snippet doesn't make clear what kind of thing a line represents):

@lombok.Value class InventoryItem {
    int warehouse;
    String name;
    int shelf;

    public static InventoryItem read(StringTokenizer tokenizer) {
        int warehouse = num(tokenizer);
        String name = tokenizer.nextToken();
        int shelf = num(tokenizer);
        tokenizer.nextToken();
        tokenizer.nextToken();
        return new InventoryItem(warehouse, name, shelf);
    }
    private static int num(StringTokenizer t) {
        String token = t.nextToken();
        return Integer.parseInt(token.substring(0, token.indexOf('.')));
    }
}

Reading a line and retrieving, say, the location where the item is stored then becomes much nicer. Now things actually have names!

InventoryItem item = InventoryItem.read(fileLine);
System.out.println("This item is in warehouse " + item.getWarehouse());

NB: Uses lombok's @Value to avoid putting a lot of boilerplate in this answer.
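If Lombok is not available, roughly the same class can be written by hand. The sketch below is an approximation, not the exact code Lombok generates: it spells out the constructor and getters but leaves out the equals, hashCode, and toString methods that @Value would also add. The field names are the same made-up ones as above.

```java
import java.util.StringTokenizer;

final class InventoryItem {
    private final int warehouse;
    private final String name;
    private final int shelf;

    InventoryItem(int warehouse, String name, int shelf) {
        this.warehouse = warehouse;
        this.name = name;
        this.shelf = shelf;
    }

    public int getWarehouse() { return warehouse; }
    public String getName() { return name; }
    public int getShelf() { return shelf; }

    // Parses one line of the form "number. String number. decimal decimal".
    public static InventoryItem read(StringTokenizer tokenizer) {
        int warehouse = num(tokenizer);
        String name = tokenizer.nextToken();
        int shelf = num(tokenizer);
        tokenizer.nextToken(); // skip the two decimals
        tokenizer.nextToken();
        return new InventoryItem(warehouse, name, shelf);
    }

    // Consumes one token such as "294." and parses the digits before the dot.
    private static int num(StringTokenizer t) {
        String token = t.nextToken();
        return Integer.parseInt(token.substring(0, token.indexOf('.')));
    }

    public static void main(String[] args) {
        InventoryItem item = InventoryItem.read(new StringTokenizer("294. ABC123 66. .00 .00"));
        System.out.println("This item is in warehouse " + item.getWarehouse());
    }
}
```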
