How is end of file actually detected in Java?

Question

I have never thought about this before.

But when you read a file, you can use code like this, for example:

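// try-with-resources closes the reader when the loop is done
try (FileReader fileReader = new FileReader("c:\\data\\input-text.txt")) {
  // read() returns the next character as an int, or -1 at end of stream
  int data = fileReader.read();
  while (data != -1) {
    data = fileReader.read();
  }
}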

But how is it actually recognised that the file has ended? Is it because the operating system knows the size of the file, or is there a special character? I think Java calls some C/C++ function from the operating system, and this function returns -1, so Java knows that the end of the file has been reached. But how does the operating system know that the end of the file has been reached? Which special character is used for this?

Answer 1

Score: 2

> How is actually End of File detected in java?

Java doesn't detect it. The operating system does.

The meaning of end-of-file depends on the nature of the "file" that you are reading.

  • If the file is a regular file in a file system, then the operating system knows or can find out what the actual file size is. It is part of the file's metadata.

  • If the file is a Socket stream, then end-of-file means that all available data has been consumed, and the OS knows that there cannot be any more. Typically, the socket has been closed or half closed.

  • If the file is a pipe, then end-of-file means that the other end of the pipe has closed it, and there will be no more data.

  • If the file is a Linux/UNIX device file, then the precise end-of-file meaning will be device dependent. For example, if the device is a "tty" device on Linux/UNIX, it could mean:

    • the modem attached to the serial line has dropped out
    • the tty was in "cooked" mode and received the character that denotes EOF
    • and possibly other things.

It is common for a command shell to provide a way to signal an "end of file". Depending on the implementation, it may implement this itself, or it may be implemented at the device driver level. In either case, Java is not involved in the recognition.
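For the regular-file case, the point that end-of-file follows from the size recorded in the file's metadata can be checked from Java. The following is a minimal sketch, not anything from the original answer: the class name `EofBySize` and the helper `countBytesUntilEof` are made up for illustration. It compares the size reported by `Files.size` with the number of bytes delivered before `read()` returns -1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names are illustrative only.
public class EofBySize {

    // Counts how many bytes read() delivers before it reports end of stream (-1).
    static long countBytesUntilEof(Path path) throws IOException {
        long count = 0;
        try (InputStream in = Files.newInputStream(path)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("eof-demo", ".txt");
        try {
            Files.write(tmp, new byte[] {'a', 'b', 'c', '\n'});

            // The size comes from the file's metadata, not from its contents.
            long sizeFromMetadata = Files.size(tmp);
            long bytesRead = countBytesUntilEof(tmp);

            System.out.println("size=" + sizeFromMetadata + " read=" + bytesRead);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

For a regular file that nobody else is modifying, the two numbers should agree. For sockets, pipes and ttys there is no such size to compare against, which is why end-of-file has to mean something different for them.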

> I think java will call some C/C++ function from operating system and this function will return -1 , so java knows end of file is reached.

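On Linux / UNIX / macOS, the Java runtime's native code ultimately calls the read(fd, buffer, count) system call. That call returns 0 if the fd is at the end-of-file position (it returns -1 only on error). The Java runtime translates the zero-byte result into the -1 that Java's read() methods return.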
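On the Java side, the -1 is an out-of-band value rather than something stored in the file. `InputStream.read()` returns an `int` so that every possible byte value (0 to 255) can be told apart from the end-of-stream signal. The sketch below illustrates this with a file whose only byte is 0xFF; the class name `EofSentinel` and the helper `firstTwoReads` are hypothetical names chosen for this example.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names are illustrative only.
public class EofSentinel {

    // Returns the results of the first two read() calls on the given file.
    static int[] firstTwoReads(Path path) throws IOException {
        try (FileInputStream in = new FileInputStream(path.toFile())) {
            int first = in.read();
            int second = in.read();
            return new int[] {first, second};
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("eof-sentinel", ".bin");
        try {
            // A one-byte file containing 0xFF, which is -1 when viewed as a signed byte.
            Files.write(tmp, new byte[] {(byte) 0xFF});

            int[] results = firstTwoReads(tmp);
            System.out.println("first read  = " + results[0]);  // prints 255, the data byte
            System.out.println("second read = " + results[1]);  // prints -1, end of stream
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The data byte comes back as 255 and only the following call reports -1, so a byte that happens to look like -1 cannot be mistaken for end of file.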

Answer 2

Score: -1

I think it is very unlikely that popular file systems such as ext and NTFS use a delimiter or special character to mark the end of the data. Files often have to store binary information rather than just text, and if the delimiter appeared within the data it could easily confuse the OS. In Linux, the VFS (Virtual Filesystem) layer leaves these details to the file system implementations themselves, and most of them keep a unique inode (a kind of metadata record) for every file that resides in the file system. Among other things, inodes tend to hold information about the blocks where the data is stored and the exact size of the file. Detecting EOF becomes trivial when you have those.
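The binary-data argument can be tried out directly. The sketch below is an illustration under stated assumptions, not part of the original answer: the class name `NoEofMarker` and the helper `readUntilEof` are hypothetical. It writes a payload containing bytes that are sometimes associated with "end of file" in terminals or old text formats (0x04, 0x1A), plus 0x00 and 0xFF, and reads the file back until `read()` returns -1.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
public class NoEofMarker {

    // Reads the whole file one byte at a time until read() reports -1.
    static byte[] readUntilEof(Path path) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = Files.newInputStream(path)) {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Candidate "marker" bytes placed in the middle of the data.
        byte[] payload = {0x00, 0x04, 0x1A, (byte) 0xFF, 'A'};

        Path tmp = Files.createTempFile("no-eof-marker", ".bin");
        try {
            Files.write(tmp, payload);
            byte[] readBack = readUntilEof(tmp);

            System.out.println("bytes written: " + payload.length);
            System.out.println("bytes read:    " + readBack.length);
            System.out.println("identical:     " + Arrays.equals(payload, readBack));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

All five bytes are read back unchanged, which is consistent with end-of-file for a regular file being determined by the recorded size rather than by any value in the data.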

huangapple
  • Published on October 4, 2020, 17:51:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64193182.html