May 21, 2023, 23:40:24

Why my linebreak checker is not giving the expected result?

Question

I have created a project in Java to simulate a library. I have a method named "Arrumbadas" that returns the publications that have never been lent, separated by linefeeds, and a test that only checks whether each line of this result (from the "Arrumbadas" method) has a linefeed at the end.

I have implemented the "Arrumbadas" method as follows:

    (Omitted code)
    result = new StringBuilder();
    for (omitted code) {
        if (omitted code) {
            result.append(publishing).append("\n");
        }
    }
    return result.toString();

I have implemented the test as follows:

    public static boolean hasLineFeed(String t) {
        Scanner text = new Scanner(t);
        while (text.hasNextLine()) {
            String line = text.nextLine();
            System.out.println(line);
            if (!line.endsWith("\n") && !line.isEmpty()) {
                return false;
            }
        }
        return true;
    }

Given this text:

    "9780547928227","The Hobbit","J.R.R. Tolkien","2","0","High fantasy"
    "9780316769174","The Catcher in the Rye","J.R.R. Tolkien","3","0","Coming-of-age fiction"
    "9780143124672","Moby-Dick","J.R.R. Tolkien","1","0","Adventure fiction"

The output of the test is:

    "9780547928227","The Hobbit","J.R.R. Tolkien","2","0","High fantasy"
    Has linefeed: false

I figured out that reading lines with .hasNextLine()/.nextLine() doesn't include the delimiter, and neither does .split(), which I had used before. Is there a way to read the text line by line without removing the delimiters?

Answer 1

Score: 1

The Scanner class (in Java 17, at least) uses the following regex to match a line separator:

\r\n|[\n\r\u2028\u2029\u0085]

As you can see, this will match a CR, an LF, or a CR followed by an LF<sup>1</sup>. After matching, the line separator at the end of the line is removed, as per the javadoc. That means that nextLine() will NEVER return a line that ends with an LF (i.e. '\n') or a CR (i.e. '\r').
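
To make that concrete, here is a small sketch that feeds a string with mixed separators to a Scanner and checks what comes back. The class and method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerStripsSeparators {
    // Collects every line that Scanner.nextLine() returns for the given text.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : readLines("first\nsecond\r\nthird\n")) {
            // The separator has already been stripped, so endsWith("\n") is false here.
            System.out.println("[" + line + "] ends with LF: " + line.endsWith("\n"));
        }
    }
}
```

This is why the check in the question fails on the first non-empty line: the string it inspects no longer contains the linefeed.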

> Is there a way to read the text by lines without removing the delimiters?

Not using Scanner.nextLine() or BufferedReader.readLine(). If you really need to preserve the line separators, the simplest way is to use a loop to read characters from a BufferedReader, recognize line separators, and assemble the lines yourself. You can then include the line separators in the line strings, or put them into a different variable.
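
A minimal sketch of that approach might look like the following. It only recognizes "\n", "\r" and "\r\n" (not the Unicode separators in the regex above), and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LinesWithSeparators {
    // Splits the input into lines, keeping "\n", "\r" or "\r\n" at the end of each line.
    static List<String> readLinesKeepingSeparators(Reader source) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            int c;
            while ((c = reader.read()) != -1) {
                current.append((char) c);
                if (c == '\r') {
                    // Peek one character ahead so that "\r\n" counts as a single separator.
                    reader.mark(1);
                    int next = reader.read();
                    if (next == '\n') {
                        current.append('\n');
                    } else if (next != -1) {
                        reader.reset();
                    }
                }
                if (c == '\n' || c == '\r') {
                    lines.add(current.toString());
                    current.setLength(0);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) {
            lines.add(current.toString()); // final line without a separator
        }
        return lines;
    }

    // True only if every line still carries a trailing linefeed.
    static boolean everyLineEndsWithLineFeed(List<String> lines) {
        return lines.stream().allMatch(line -> line.endsWith("\n"));
    }

    public static void main(String[] args) {
        List<String> lines = readLinesKeepingSeparators(new StringReader("first\nsecond\n"));
        System.out.println("Has linefeed: " + everyLineEndsWithLineFeed(lines));
    }
}
```

With the separators kept in the strings, a check such as line.endsWith("\n") becomes meaningful again.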

HOWEVER ... it is not obvious to me why you need to preserve the line separators if you are simply reading a CSV file. If I were doing this, I would use BufferedReader.readLine() to read each line, and then use a Scanner, String.split(), or something more sophisticated to deal with the commas and quotes.
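
As a rough sketch of the readLine-then-split idea: the code below assumes every field is wrapped in double quotes and contains no embedded quotes, which holds for the sample data in the question but not for CSV in general. The class and method names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class QuotedCsvLines {
    // Assumption: each line looks like "a","b","c" with no quotes inside the fields.
    static List<String[]> parse(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.length() < 2) {
                    continue; // skip blank or malformed lines
                }
                // Drop the outer quotes, then split on the "," between fields.
                String body = line.substring(1, line.length() - 1);
                rows.add(body.split("\",\"", -1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "\"9780547928227\",\"The Hobbit\",\"J.R.R. Tolkien\",\"2\",\"0\",\"High fantasy\"\n";
        for (String[] fields : parse(new StringReader(text))) {
            System.out.println(fields[1] + " has " + fields.length + " fields");
        }
    }
}
```

Anything beyond this simple shape (escaped quotes, unquoted fields, embedded newlines) is where a real CSV parser earns its keep.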

Or ... find and use a third-party CSV parser library and save yourself the effort of implementing the code.

<sup>1 - ... and some other obscure codes that you've probably never heard of: U+2028 (LINE SEPARATOR), U+2029 (PARAGRAPH SEPARATOR) and U+0085 (NEXT LINE).</sup>
