Why are objects the same in two different processes with multiprocessing?

Question

I have the following code:

test.py:

import multiprocessing
import time

class A:
    def __action(self):
        print("another process:")
        print(id(self))

    def run(self):
        print("this process")
        print(id(self))
        p = multiprocessing.Process(target=self.__action, daemon=True)
        p.start()

a = A()
print("main")
print(id(a))
a.run()
time.sleep(3)

It produces the following output:

$ python3 test.py
main
140643898766000
this process
140643898766000
another process:
140643898766000

But the documentation says:

> As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes.
>
> However, if you really do need to use some shared data then multiprocessing provides a couple of ways of doing so.
>
> Shared memory
>
> Data can be stored in a shared memory map using Value or Array.

So it looks as though two processes can share the same object only when a Value or Array is explicitly defined. Why, then, does the A object have the same id in two different processes?
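For contrast, here is a minimal sketch of what the quoted documentation means by shared memory. The names `increment` and `demo` are made up for illustration, and the sketch assumes a POSIX system where the "fork" start method is available. It puts a `multiprocessing.Value` next to an ordinary dict and lets a worker modify both.

```python
import multiprocessing as mp


def increment(shared, plain):
    # Runs in the worker process.
    with shared.get_lock():
        shared.value += 1  # written to shared memory, visible to the parent
    plain["n"] += 1        # modifies only the worker's private copy


def demo():
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    shared = ctx.Value("i", 0)    # a C int stored in shared memory
    plain = {"n": 0}              # an ordinary Python object
    p = ctx.Process(target=increment, args=(shared, plain))
    p.start()
    p.join()
    # The parent sees the worker's write to the Value, but not to the dict.
    return shared.value, plain["n"]


if __name__ == "__main__":
    print(demo())
```

Only the `Value` carries the worker's update back to the parent; the dict keeps its original contents there, even though it was handed to the worker as well.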

Answer 1

Score: 4

You're apparently running under some Linux-like OS, where multiprocessing uses fork() to create a worker process. Under fork(), the new process gets a copy-on-write clone of the parent process's address space. So, yes, objects inherited from the parent process after a fork() will have the same id() (in CPython, id() is the object's memory address), but they're nevertheless entirely distinct objects. Addresses are virtual. As soon as either process writes to the object, the self in the main process lives in different physical memory than the self in the worker process. That they have the same virtual address in each process is just a consequence of how the OS implements fork(). Changes made to the object in one process will not be seen in the other.
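A small experiment makes this visible. The sketch below is not from the original post: `Counter`, `child`, and `demo` are hypothetical names, and it assumes a POSIX system where the "fork" start method exists. The worker mutates the object it inherited and reports its id() back through a queue.

```python
import multiprocessing as mp


class Counter:
    """Hypothetical stand-in for the A class in the question."""

    def __init__(self):
        self.value = 0


def child(counter, queue):
    # Runs in the forked worker: mutate the inherited object, then report
    # the id and the value as seen from inside the worker.
    counter.value = 99
    queue.put((id(counter), counter.value))


def demo():
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    counter = Counter()
    queue = ctx.Queue()
    p = ctx.Process(target=child, args=(counter, queue))
    p.start()
    child_id, child_value = queue.get()  # read before join to avoid blocking
    p.join()
    # Compare the ids and return the value each process ended up with.
    return id(counter) == child_id, child_value, counter.value


if __name__ == "__main__":
    same_id, child_value, parent_value = demo()
    print("same id:", same_id)
    print("value in child:", child_value)
    print("value in parent:", parent_value)
```

Under fork the ids should match while the parent's `value` stays at 0, which is the "same virtual address, distinct object" situation described above.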

On a system without fork() (chiefly Windows), or if you tell multiprocessing to use "spawn" instead of "fork", then it's very unlikely (although still possible) that self will retain the same virtual address.
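Under "spawn" the child starts a fresh interpreter and inherits nothing, so the Process object and its arguments are pickled in the parent and rebuilt in the child. A pickle round trip inside a single process imitates that rebuild. This is only a rough sketch of the idea, and `rebuild` is a made-up helper name.

```python
import pickle


def rebuild(obj):
    # Serialize and deserialize: roughly what "spawn" does to the Process
    # object and its arguments before the child process can use them.
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    original = {"value": 0}
    copy = rebuild(original)
    print(copy is original)          # False: a newly constructed object
    print(id(copy) == id(original))  # False: both are alive in one process
    print(copy == original)          # True: equal state, different object
```

With a real "spawn" worker the rebuilt object lives in another process, so its address could coincide with the parent's by chance, which is why the same id is unlikely rather than impossible.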

huangapple
  • Posted on June 16, 2023 at 10:47:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76486667.html