Do child processes share stdio stream (FILE *) offsets with the parent after a fork()?

Question

I know that, as stated in the fork manual, a forked child process gets copies of the parent's open file descriptors, so the parent and the child share file offsets.

Stdio streams are built on top of the file descriptor functions, but they include buffers that are unique to each process. At the moment of forking, the child inherits those FILE * streams, but do parent and child share the same buffers and the same file offsets?

I've searched through the manual and this forum but have found nothing about FILE *, only about file descriptors. Sorry if this question is a duplicate, but I haven't been able to find it.

Answer 1

Score: 1

As the fork() manual states, and as you noted in your post:

> The child inherits copies of the parent's set of open file
> descriptors. Each file descriptor in the child refers to the
> same open file description (see open(2)) as the corresponding
> file descriptor in the parent. This means that the two file
> descriptors share open file status flags, file offset, and
> signal-driven I/O attributes (see the description of F_SETOWN
> and F_SETSIG in fcntl(2)).
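
The shared file offset described in the manual excerpt can be observed directly at the file descriptor level, before stdio enters the picture. The sketch below is only an illustration: the helper name `parent_offset_after_child_write` and the `/tmp` path template are assumptions made for this example, not part of the original answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes msg with write(2); the parent then
 * reads back its own file offset. Returns that offset, or -1 on error. */
off_t parent_offset_after_child_write(const char *msg)
{
    assert(msg != NULL);

    char path[] = "/tmp/fork_offset_XXXXXX";  /* assumed writable temp dir */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* the file goes away once fd is closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: advances the offset of the shared open file description. */
        ssize_t n = write(fd, msg, strlen(msg));
        _exit(n < 0 ? 1 : 0);
    }

    waitpid(pid, NULL, 0);        /* ensure the child's write has happened */
    off_t off = lseek(fd, 0, SEEK_CUR);   /* the parent itself never wrote */
    close(fd);
    return off;
}
```

Although only the child writes, the parent should see its offset equal to `strlen(msg)`, because both descriptors refer to the same open file description.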

A FILE is a user-space data structure layered on top of an OS file descriptor. It lives in the process's own memory, and fork() copies it from the parent to the child along with the rest of the address space. Hence any data buffered in it is duplicated in the child: the buffers are not shared, each process has its own copy. Either process may later flush its copy to the target file. Because the file offset is shared between parent and child, the second flush lands right after the first one instead of overwriting it, so the same text appears twice in the output. Consider the following program as an example:

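#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
  // No trailing newline, so the text stays in stdout's buffer
  // even when stdout is a line-buffered terminal
  printf("String to be printed on stdout");

  if (fork() == 0) {

    // Child: flushes its own copy of the buffer
    fflush(stdout);

  } else {

    // Parent: flushes the original buffer
    fflush(stdout);

    wait(NULL);
  }

  return 0;
}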

When running it, we can see two copies of the output resulting from a single printf() originally called by the parent:

$ gcc try.c
$ ./a.out 
String to be printed on stdoutString to be printed on stdout$
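
A common way to avoid this duplication is to flush the stream before calling fork(), so that the buffer the child inherits is already empty. The sketch below is not part of the original answer: the helper name `size_after_fork` is hypothetical, and it uses a temporary file instead of stdout so that the result can be measured as a file size.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes msg to a fresh temporary file through stdio,
 * forks, lets both processes flush, and returns the resulting file size
 * in bytes (or -1 on error). msg is assumed to be shorter than the
 * stream's buffer. */
long size_after_fork(const char *msg, int flush_before_fork)
{
    assert(msg != NULL);

    FILE *fp = tmpfile();        /* regular file, so fully buffered */
    if (fp == NULL)
        return -1;

    fputs(msg, fp);              /* stays in the user-space buffer */
    if (flush_before_fork)
        fflush(fp);              /* buffer is empty before the copy is made */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(fp);
        return -1;
    }
    if (pid == 0) {
        fflush(fp);              /* child flushes its copy of the buffer */
        _exit(0);                /* skip atexit handlers and stdio cleanup */
    }

    waitpid(pid, NULL, 0);
    fflush(fp);                  /* parent flushes its own copy */

    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);
    fclose(fp);
    return size;
}
```

Under these assumptions, `size_after_fork("hello", 0)` should return 10, since both copies of the buffer reach the file, while `size_after_fork("hello", 1)` should return 5.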

Any later output from the child or the parent will likewise be interleaved in the output, since both processes advance the same underlying file offset.

NB: The goal of the stdio subsystem in the C library is to minimize the number of system calls by performing I/O on an intermediate buffer first. The actual system calls are triggered only when there is no other choice (the buffer is full and must be flushed before more data can be appended to the output file, a seek operation goes outside the range currently covered by the buffer, an explicit flush is requested, and so on). So if one of the processes calls rewind() or ftell(), this may or may not result in an actual system call, depending on the state of the buffer in the calling process. The other process is affected only if a system call is made. There is no one-to-one correspondence between library calls and the equivalent system calls. Typically, there are more library calls than system calls (e.g. multiple fwrite() calls may trigger only one write() system call).
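
The gap between library calls and system calls can be made visible by comparing the stream position reported by ftell() with the kernel offset of the underlying descriptor. The following is a minimal sketch, assuming a typical implementation such as glibc in which a few small fwrite() calls stay in the buffer; the struct `positions` and the helper `buffered_positions` are names invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

struct positions {
    off_t fd_before;      /* kernel offset before fflush() */
    long  stream_before;  /* ftell() before fflush() */
    off_t fd_after;       /* kernel offset after fflush() */
};

/* Hypothetical helper: performs n_writes fwrite() calls of 4 bytes each
 * and records where the stream and the descriptor think they are. */
struct positions buffered_positions(int n_writes)
{
    struct positions p = { -1, -1, -1 };
    assert(n_writes > 0 && n_writes <= 16);  /* stay well below the buffer size */

    FILE *fp = tmpfile();
    if (fp == NULL)
        return p;

    for (int i = 0; i < n_writes; i++)
        fwrite("abcd", 1, 4, fp);            /* library calls only */

    /* The kernel has not seen the data yet: this is the offset that
     * another process sharing the descriptor would observe. */
    p.fd_before = lseek(fileno(fp), 0, SEEK_CUR);
    p.stream_before = ftell(fp);             /* includes buffered bytes */

    fflush(fp);                              /* buffered data is written out */
    p.fd_after = lseek(fileno(fp), 0, SEEK_CUR);

    fclose(fp);
    return p;
}
```

With `n_writes` set to 3, the stream position should be 12 while the descriptor offset is still 0; only after fflush() does the descriptor offset become 12, and only then would a forked relative sharing the descriptor notice any change.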

huangapple
  • Posted on April 11, 2023, at 00:29:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75978851.html