Is there any parsing when reading a 'clean' text file using Scanner?
Question
I know that:
> Parsing is the process of turning some kind of data into another kind
> of data.
But then I also came across this difference between Scanner and BufferedReader:
> BufferedReader is faster than Scanner because BufferedReader does not
> need to parse the data.
So my question is: how is using Scanner slower than using BufferedReader if I am just reading a text file (plain characters) and I am not doing any parsing? Is there any parsing I am not aware of?
Or, from the perspective of the following code, how is Scanner slower here than BufferedReader because of parsing?
// 1: read the first line with BufferedReader
BufferedReader bufferedReader = new BufferedReader(new FileReader("xanadu.txt"));
System.out.println(bufferedReader.readLine());
// 2: read the first newline-delimited token with Scanner
Scanner scanner = new Scanner(new FileReader("xanadu.txt"));
scanner.useDelimiter("\n");
System.out.println(scanner.next());
I don't quite understand how Scanner is slower because of parsing when I am technically not parsing any data.
Answer 1
Score: 1
Dividing an input stream into lines is a (very limited) form of parsing, but as you say, BufferedReader can also do that. The difference, if there is one, will be that BufferedReader can use a highly-optimised procedure to implement a single use case (divide a stream into lines), while Scanner needs to be considerably more flexible (divide a stream into tokens delimited by an arbitrary string or regular expression). Flexibility almost always comes at a price, although you won't know what that cost is without doing some benchmarking. (And it may be very small, since it is conceivable that Scanner has optimised algorithms for particular special cases which it can recognise.)
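To make the "more flexible" point concrete, here is a minimal sketch of the two approaches from the question side by side. The class name, method names, and the in-memory sample text (standing in for xanadu.txt) are assumptions made for illustration. It shows that both calls return the same first line, and that the string passed to useDelimiter is kept by Scanner as a regular expression Pattern rather than as a plain character.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;
import java.util.regex.Pattern;

class FirstLineDemo {
    // Hypothetical two-line sample standing in for the contents of xanadu.txt.
    static final String SAMPLE =
            "In Xanadu did Kubla Khan\nA stately pleasure-dome decree\n";

    // readLine() only looks for line terminators and returns the text before one.
    static String firstLineWithBufferedReader(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // next() returns the text up to the next match of the delimiter pattern.
    static String firstTokenWithScanner(String text) {
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            scanner.useDelimiter("\n");
            return scanner.next();
        }
    }

    // useDelimiter(String) compiles its argument; delimiter() exposes the Pattern.
    static Pattern delimiterAfterUseDelimiter(String delimiter) {
        try (Scanner scanner = new Scanner("")) {
            scanner.useDelimiter(delimiter);
            return scanner.delimiter();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLineWithBufferedReader(SAMPLE));
        System.out.println(firstTokenWithScanner(SAMPLE));
        // The same API accepts an arbitrary regular expression as the delimiter.
        System.out.println(delimiterAfterUseDelimiter("\\s*,\\s*").pattern());
    }
}
```

The result is identical for this input; what differs is the machinery behind it, since Scanner has to be prepared to match any pattern while readLine() only ever searches for line terminators.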
In short, "because parsing" is not a very good explanation for why one interface is slower than another one. But the more flexibly and precisely you parse an input, the more time it is expected to take.
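Since the actual cost can only be found by measuring, here is a rough timing sketch, not a rigorous benchmark. The class name, method names, line count, and generated text are all assumptions made for illustration, and a single System.nanoTime() measurement like this is distorted by JIT warm-up and garbage collection, so a proper harness such as JMH would be needed for numbers you can trust.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;
import java.util.function.IntSupplier;

class RoughLineTiming {
    // Builds lineCount non-empty lines, each terminated by '\n'.
    static String buildText(int lineCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lineCount; i++) {
            sb.append("line ").append(i).append('\n');
        }
        return sb.toString();
    }

    // Counts lines by calling readLine() until it returns null.
    static int countWithBufferedReader(String text) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Counts newline-delimited tokens, as in the question's second snippet.
    static int countWithScanner(String text) {
        int count = 0;
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            scanner.useDelimiter("\n");
            while (scanner.hasNext()) {
                scanner.next();
                count++;
            }
        }
        return count;
    }

    // Runs the task once and returns the elapsed wall-clock time in nanoseconds.
    static long elapsedNanos(IntSupplier task) {
        long start = System.nanoTime();
        task.getAsInt();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String text = buildText(200_000);
        long readerNanos = elapsedNanos(() -> countWithBufferedReader(text));
        long scannerNanos = elapsedNanos(() -> countWithScanner(text));
        System.out.println("BufferedReader: " + readerNanos / 1_000_000 + " ms");
        System.out.println("Scanner:        " + scannerNanos / 1_000_000 + " ms");
    }
}
```

Both methods count the same number of lines for this input, so any difference in the printed times reflects how each class finds the line boundaries rather than the amount of data read.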