Linux read call behavior when a segment of the file is corrupted

Question

Context:
OS: Red Hat 8.x
File systems: EXT4, XFS
Storage types: SSD, HDD

Corruption: here this means any event that prevents written data from being retrieved as it was written, e.g. disk device-level corruption.

The Linux read call signature is ssize_t read(int fd, void buf[.count], size_t count);
Say the file referred to by fd has corrupted segments as well as uncorrupted ones, and a read request spans one or more corrupted segments. Assume the segments are A(OK)--B(corrupted)--C(OK)--D(corrupted)--E(OK), fd's file position is set to the beginning of A, and "count" is large enough to cover all of segments A through E.

  1. Is it possible for read to return a value greater than zero (with buf containing data)?
    If so,
    1.1. What would buf contain? Would it contain any data from the corrupted segments B and D? What could the return value of read be?

    1.2. What is the probability of this happening? What factors could increase that probability, e.g. a reboot?

  2. Would the file size returned by fstat count any bytes from the corrupted segments?

Purpose: I am trying to decide, for the OS and file systems listed above, whether I need to store an application-level checksum alongside the written (binary) data and, when reading the same file back, validate that checksum whenever read succeeds (i.e. return value > 0) before treating the data as valid.
I am not worried about an intruder modifying the written data. I am only concerned about things that can happen through system activity, e.g. a machine reboot.

Answer 1

Score: 1

If A can be read, the kernel will return the length of A, and that portion of the read will be successful. This is known as a short read. If you then make another call to read and B cannot be read, you will get an EIO error. That could be caused by a problem with a network file system, a bad block, a file system error, or anything else that prevents the data from being read.
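
The short-read-then-error sequence described above can be sketched as a loop around read. This is only a minimal sketch: `read_until_error` is a hypothetical helper name, not part of any library, and EIO is an assumption about which errno a media failure would typically produce.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call read() repeatedly until `count` bytes have
 * arrived, end of file is reached, or read() fails.
 * Returns the number of bytes stored in buf. *err is 0 on success or
 * end of file, otherwise the errno of the failing read() (e.g. EIO). */
size_t read_until_error(int fd, void *buf, size_t count, int *err)
{
    size_t total = 0;

    assert(buf != NULL && err != NULL);
    *err = 0;
    while (total < count) {
        ssize_t n = read(fd, (char *)buf + total, count - total);
        if (n > 0) {
            total += (size_t)n;   /* possibly a short read: keep going */
        } else if (n == 0) {
            break;                /* end of file */
        } else if (errno == EINTR) {
            continue;             /* interrupted before any data: retry */
        } else {
            *err = errno;         /* e.g. EIO if the next block is unreadable */
            break;
        }
    }
    return total;
}
```

With the layout from the question, a caller would expect the return value to equal the length of A and `*err` to be EIO, with buf holding only A's bytes; nothing from B onward is placed in buf by this loop.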

Once the call to read B fails, it will continue to fail because the file offset is not advanced beyond that. If you use pread to read an unaffected portion, or if you lseek to an unaffected portion, you'll be able to continue to read until you hit an affected portion.
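
Because pread takes an explicit offset, it can step past a region that fails. The sketch below walks a file chunk by chunk and counts the chunks that return an error. The helper name `count_unreadable_chunks` and the 4096-byte buffer limit are assumptions made for illustration, and the chunk size says nothing about how the device or file system actually reports bad regions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read [0, file_size) with pread() in steps of at
 * most `chunk` bytes. A failed chunk is reported and skipped, so later
 * readable regions are still visited.
 * Returns the number of chunks that failed, or -1 for a bad chunk size. */
long count_unreadable_chunks(int fd, off_t file_size, size_t chunk)
{
    char buf[4096];
    long bad = 0;
    off_t off = 0;

    assert(file_size >= 0);
    if (chunk == 0 || chunk > sizeof buf)
        return -1;

    while (off < file_size) {
        size_t want = chunk;
        if ((off_t)want > file_size - off)
            want = (size_t)(file_size - off);

        ssize_t n = pread(fd, buf, want, off);
        if (n > 0) {
            off += n;                 /* advance by what was actually read */
        } else if (n == 0) {
            break;                    /* file is shorter than file_size */
        } else if (errno == EINTR) {
            continue;                 /* retry the same offset */
        } else {
            fprintf(stderr, "offset %lld: %s\n",
                    (long long)off, strerror(errno));
            bad++;
            off += (off_t)want;       /* skip the chunk that failed */
        }
    }
    return bad;
}
```

A caller could pass the `st_size` value obtained from fstat as `file_size`; a non-zero result means at least one range could not be read back.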

This is generally the standard Unix behaviour, and would be expected of any POSIX system. The error code on failure might differ in some cases on some systems (for example, the OS might automatically remount the file system read only and return some other error code in that case), but generally one reads all the data that can be validly read, and then if further progress is not possible, one gets an error.
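
The behaviour above covers failures that the kernel detects and reports. As far as I know, ext4 and XFS checksum their metadata but not file contents, so a read that succeeds does not by itself prove that the bytes match what was written. The application-level checksum the question asks about can be sketched as follows. The record layout (payload followed by a 4-byte little-endian CRC-32) and the names `crc32_ieee`, `seal_record` and `record_is_intact` are assumptions made for illustration, not an established format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zlib). */
uint32_t crc32_ieee(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical layout: payload_len bytes of payload followed by a
 * 4-byte little-endian CRC-32 of that payload. The caller must provide
 * room for payload_len + 4 bytes. */
void seal_record(unsigned char *record, size_t payload_len)
{
    assert(record != NULL);
    uint32_t crc = crc32_ieee(record, payload_len);
    for (int i = 0; i < 4; i++)
        record[payload_len + i] = (unsigned char)(crc >> (8 * i));
}

/* Returns 1 if the trailing CRC matches the payload, 0 otherwise. */
int record_is_intact(const unsigned char *record, size_t record_len)
{
    assert(record != NULL);
    if (record_len < 4)
        return 0;                     /* too short to hold a checksum */

    size_t payload_len = record_len - 4;
    uint32_t stored = 0;
    for (int i = 0; i < 4; i++)
        stored |= (uint32_t)record[payload_len + i] << (8 * i);
    return stored == crc32_ieee(record, payload_len);
}
```

The writer would call `seal_record` before writing the buffer out, and the reader would call `record_is_intact` on the bytes returned by a successful read before trusting them. A CRC detects accidental damage only, which matches the question's stated scope of not defending against an intruder.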

huangapple
  • Posted on June 26, 2023 at 18:47:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76555963.html