Is there any parsing when reading a 'clean' text file using Scanner?

Question

I know that:

> Parsing is the process of turning some kind of data into another kind
> of data.

But then I also came across this difference between Scanner and BufferedReader:

> BufferedReader is faster than Scanner because BufferedReader does not
> need to parse the data.

So my question is: how is Scanner slower than BufferedReader if I am just reading a text file (plain characters) and not doing any parsing? Is there some parsing I am not aware of?

Or, from the perspective of the following code, how is Scanner slower here than BufferedReader because of parsing?

    // 1: read the first line with BufferedReader
    BufferedReader bufferedReader = new BufferedReader(new FileReader("xanadu.txt"));
    System.out.println(bufferedReader.readLine());

    // 2: read the first line with Scanner, using "\n" as the token delimiter
    Scanner scanner = new Scanner(new FileReader("xanadu.txt"));
    scanner.useDelimiter("\n");
    System.out.println(scanner.next());

I don't quite understand how Scanner is slower because of parsing when I am technically not parsing any data.

Answer 1

Score: 1

Dividing an input stream into lines is a (very limited) form of parsing, but as you say, BufferedReader can also do that. The difference, if there is one, will be that BufferedReader can use a highly-optimised procedure to implement a single use case (divide a stream into lines), while Scanner needs to be considerably more flexible (divide a stream into tokens delimited by an arbitrary string or regular expression). Flexibility almost always comes at a price, although you won't know what that price is without doing some benchmarking. (And it may be very small, since it is conceivable that Scanner has optimised algorithms for particular special cases which it can recognise.)
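To make the flexibility point concrete, here is a minimal sketch. The class name `TokenizingDemo` and its helper methods are made up for illustration, and the input is a string rather than a file so the example is self-contained. `BufferedReader.readLine()` only ever splits on line terminators, whereas the argument to `Scanner.useDelimiter(String)` is treated as a regular expression, so the same `Scanner` code can split on newlines or on any run of whitespace.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class TokenizingDemo {
    // BufferedReader has a single built-in way to split: line terminators.
    static List<String> linesWithBufferedReader(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Scanner interprets the delimiter as a regular expression,
    // so the caller decides what separates one token from the next.
    static List<String> tokensWithScanner(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        String text = "In Xanadu did Kubla Khan\nA stately pleasure-dome decree";
        System.out.println(linesWithBufferedReader(text));   // split on line terminators
        System.out.println(tokensWithScanner(text, "\n"));   // same split, via a regex delimiter
        System.out.println(tokensWithScanner(text, "\\s+")); // split on any whitespace run
    }
}
```

With `"\n"` as the delimiter both approaches yield the same two lines for this input; the difference is that Scanner reaches that result through its general pattern-matching machinery.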

In short, "because parsing" is not a very good explanation for why one interface is slower than another one. But the more flexibly and precisely you parse an input, the more time it is expected to take.
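If you want to see what the price is on your own machine, a rough timing sketch along these lines is one way to start. Everything here is an assumption made for illustration: the class name `ReadTimingSketch`, the helper methods, the temporary file, and the line count. It is also not a rigorous benchmark (no warm-up, a single run, and file-system caching favours whichever read comes second), so treat the numbers as a hint at best; a harness such as JMH would give more trustworthy results.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Scanner;

public class ReadTimingSketch {
    // Count lines using BufferedReader.readLine().
    static long countWithBufferedReader(Path file) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    // Count newline-delimited tokens using Scanner, as in the question.
    static long countWithScanner(Path file) throws IOException {
        long count = 0;
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            scanner.useDelimiter("\n");
            while (scanner.hasNext()) {
                scanner.next();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical test data: many copies of one non-empty line.
        Path file = Files.createTempFile("xanadu", ".txt");
        try {
            Files.write(file,
                    Collections.nCopies(200_000, "In Xanadu did Kubla Khan"),
                    StandardCharsets.UTF_8);

            long t0 = System.nanoTime();
            long readerLines = countWithBufferedReader(file);
            long t1 = System.nanoTime();
            long scannerLines = countWithScanner(file);
            long t2 = System.nanoTime();

            System.out.printf("BufferedReader: %d lines in %d ms%n",
                    readerLines, (t1 - t0) / 1_000_000);
            System.out.printf("Scanner:        %d lines in %d ms%n",
                    scannerLines, (t2 - t1) / 1_000_000);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```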

huangapple
  • Posted on September 6, 2020 at 21:47:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63764851.html