Why is my line-break checker not giving the expected result?

Question

I have created a Java project that simulates a library. It has a method named "Arrumbadas" that returns the publications that have never been lent, separated by linefeeds, and a test that only checks whether each line of that result (the output of "Arrumbadas") ends with a linefeed.

I have implemented the "Arrumbadas" method as follows:

    (Omitted code)
    result = new StringBuilder();
    for (omitted code) {
        if (omitted code) {
            // one publication per line, each terminated by a linefeed
            result.append(publishing).append("\n");
        }
    }
    return result.toString();

I have implemented the test as follows:

    public static boolean hasLineFeed(String t) {
        Scanner text = new Scanner(t);
        while (text.hasNextLine()) {
            String line = text.nextLine();
            System.out.println(line);
            if (!line.endsWith("\n") && !line.isEmpty()) {
                return false;
            }
        }

        return true;
    }

Given this text:

"9780547928227","The Hobbit","J.R.R. Tolkien","2","0","High fantasy"
"9780316769174","The Catcher in the Rye","J.R.R. Tolkien","3","0","Coming-of-age fiction"
"9780143124672","Moby-Dick","J.R.R. Tolkien","1","0","Adventure fiction"

The output of the test is:

"9780547928227","The Hobbit","J.R.R. Tolkien","2","0","High fantasy"
Has linefeed: false

I figured out that Scanner.nextLine() does not include the delimiter in the line it returns, and neither does .split(), which I had used before. Is there a way to read the text line by line without removing the delimiters?

Answer 1

Score: 1

The Scanner class (Java 17 at least) uses the following regex to match a line separator:

\r\n|[\n\r\u2028\u2029\u0085]

As you can see, this matches a CR, an LF, or a CR followed by an LF<sup>1</sup>. After matching, the line separator at the end of the line is removed, as per the javadoc. That means that nextLine() will NEVER return a line that ends with an LF (i.e. '\n') or a CR (i.e. '\r').
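
To make the stripping visible, here is a small sketch. The class and method names are made up for illustration and are not from the question. It feeds a string with mixed line endings to a Scanner and reports whether any returned line still ends with a linefeed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class, not part of the question's project.
class ScannerStripsSeparators {

    // Collects every line that Scanner.nextLine() returns for the given text.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : readLines("first\nsecond\r\nthird\n")) {
            // The separator has already been removed, so endsWith("\n") is false here.
            System.out.println("[" + line + "] ends with LF: " + line.endsWith("\n"));
        }
    }
}
```

Because every returned line has lost its terminator, the `endsWith("\n")` check in the question fails on the first non-empty line, which is the behaviour reported.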

> Is there a way to read the text by lines without removing the delimiters?

Not using Scanner.nextLine() or BufferedReader.readLine(). If you really need to preserve the line separators, the simplest way is to use a loop to read characters from a BufferedReader, recognize the line separators, and assemble the lines yourself. You can then include the line separators in the line strings, or put them into a different variable.
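
A minimal sketch of that character-by-character approach might look like the following. The names are hypothetical, the input is taken from a String for simplicity, and only CR, LF and CR LF are treated as separators (the rarer Unicode separators from the regex above are deliberately ignored).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the question's project.
class LinesWithSeparators {

    // Splits text into lines, keeping each line's terminator (LF, CR or CR LF).
    static List<String> splitKeepingSeparators(String text) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            int c;
            while ((c = reader.read()) != -1) {
                current.append((char) c);
                if (c == '\r') {
                    // Peek one character ahead so that CR LF counts as one separator.
                    reader.mark(1);
                    int next = reader.read();
                    if (next == '\n') {
                        current.append('\n');
                    } else if (next != -1) {
                        reader.reset(); // not an LF: put the character back
                    }
                }
                if (c == '\n' || c == '\r') {
                    lines.add(current.toString());
                    current.setLength(0);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) {
            lines.add(current.toString()); // final line had no terminator
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : splitKeepingSeparators("one\ntwo\r\nthree")) {
            System.out.println(line.endsWith("\n") || line.endsWith("\r"));
        }
    }
}
```

With lines produced this way, a check such as `line.endsWith("\n")` becomes meaningful, because the terminator is still attached to each line.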

HOWEVER ... it is not obvious to me why you need to preserve the line separators if you are simply reading a CSV file. If I were doing this, I would use BufferedReader.readLine() to read each line, and then use a Scanner, String.split(), or something more sophisticated to deal with the commas and quotes.
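
As a rough sketch of the readLine-plus-split route, assuming every field is wrapped in double quotes and no field contains an embedded quote (which holds for the sample data in the question, but not for CSV in general), something like this could work. The class and method names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; assumes fully quoted fields without embedded quotes.
class QuotedCsvLines {

    // Splits one line such as "a","b","c" into its unquoted fields.
    static List<String> parseLine(String line) {
        if (line.length() < 2 || !line.startsWith("\"") || !line.endsWith("\"")) {
            throw new IllegalArgumentException("Not a fully quoted line: " + line);
        }
        String inner = line.substring(1, line.length() - 1); // drop the outer quotes
        return Arrays.asList(inner.split("\",\"", -1));      // split on the "," between fields
    }

    // Reads the text line by line with readLine() and parses each non-empty line.
    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    records.add(parseLine(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "\"9780547928227\",\"The Hobbit\",\"J.R.R. Tolkien\",\"2\",\"0\",\"High fantasy\"\n";
        System.out.println(parse(text).get(0).get(1));
    }
}
```

Anything beyond this simple shape (escaped quotes, unquoted fields, line breaks inside fields) is better left to a dedicated CSV parser.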

Or ... find and use a third-party CSV parser library and save yourself the effort of implementing the code.


<sup>1 - ... and some other obscure codes that you've probably never heard of.</sup>

Posted by huangapple on May 21, 2023, 23:40:24
Original link: https://go.coder-hub.com/76300685.html