Sending an image with an ID across processes in Python, Windows
Question
I have a project with multiple Python processes on the same computer.
Python process A is the central node, and it receives images and data from sensors.
Python process A sends images to process B for image processing/feature detection, and process B sends back some information to process A. The system is much more complex than this, which is why it's separated in the first place.
One idea that I want to implement is to tag the images with an integer, let's say `x`. This way, process B can say "Hi process A, this is the data for image `x`".
The problem is, how can I send both images and this integer elegantly and rapidly across processes?
Ideas:
- Use MQTT or ZeroMQ to communicate. Turn the image into a string, put it in a dict with the ID, and send it. I think these will be slow because we are serializing lots of images and sending lots of data over these chunky protocols.
- Use shared memory to share the image. I have prototyped this and it is very fast for sending images, but encoding the integer is not elegant. I could encode the integer into the RGB values of a few pixels in the corner, but this feels janky.
- Pickling might be better than string serialization, since it produces bytes rather than strings and so should be much faster? The multiprocessing pipes and queues seem like they are made for this, but the sender and receiver seem to need to start in the same Python thread.
- Find a unique hash for each image in every process. I'm leaning towards this option right now; it's very easy to understand but probably not the fastest.
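For the shared-memory idea, one alternative to hiding the integer in corner pixels is to reserve a small fixed-size header at the start of the block. The following is only a minimal sketch under assumptions: the header layout (`HEADER`) and the helper names `write_frame` and `read_frame` are made up for illustration, and synchronization between the two processes (a lock, or a small message announcing that a frame is ready) is left out.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout: 8-byte image ID followed by 8-byte payload length,
# both little-endian unsigned integers. The pixel bytes start right after.
HEADER = struct.Struct("<QQ")


def write_frame(shm, image_id, pixels):
    """Copy the pixel bytes into the block, then write the ID and length."""
    if HEADER.size + len(pixels) > shm.size:
        raise ValueError("image does not fit in the shared block")
    shm.buf[HEADER.size:HEADER.size + len(pixels)] = pixels
    HEADER.pack_into(shm.buf, 0, image_id, len(pixels))


def read_frame(shm):
    """Return (image_id, pixel_bytes) as stored by write_frame."""
    image_id, length = HEADER.unpack_from(shm.buf, 0)
    pixels = bytes(shm.buf[HEADER.size:HEADER.size + length])
    return image_id, pixels


if __name__ == "__main__":
    frame = bytes(640 * 480 * 3)  # stand-in for a raw RGB image
    shm = shared_memory.SharedMemory(create=True, size=HEADER.size + len(frame))
    try:
        write_frame(shm, 42, frame)          # process A side
        image_id, pixels = read_frame(shm)   # process B would attach by name first
        print(image_id, len(pixels))
    finally:
        shm.close()
        shm.unlink()
```

Process B would attach to the same block with `shared_memory.SharedMemory(name=...)` and call `read_frame`; the image itself is never touched, so no pixels are sacrificed to carry the ID.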
I recognize this is an architecture question and those are frowned upon, but I feel like there has to be a more elegant solution and I think this is an interesting design problem to ask.
Answer 1
Score: 1
Personally, I find Redis great for this type of thing. Imagine it as a blazing fast, "in-memory" data structure server that can serve atomic integers, strings, images, lists, queues, sets, sorted sets, JSON, streams, PUB/SUB ... here are the commands to create/read objects.
It has bindings for C, C++, Python, Ruby, PHP and the `bash` command line, so you can inject and check test data easily just from the shell.
It is also reachable over the network, so the processes don't all have to run on the same machine.
And it handles binary data (no need for serialisation) and keys and values up to 512MB.
So you can create a Redis object containing a JPEG/PNG/TIFF or raw binary image (SET), then push (LPUSH) a JSON containing the object name and `id` onto a queue for processing.
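A minimal sketch of that producer side follows. The queue name `images:todo`, the key pattern `image:<id>` and the helper `submit_image` are all assumptions chosen for illustration; `client` is expected to be a `redis.Redis` instance from the redis-py package (or anything with the same `set` and `lpush` methods).

```python
import json

IMAGE_QUEUE = "images:todo"  # hypothetical queue name


def submit_image(client, image_id, image_bytes):
    """Store the image bytes under a key, then queue a small JSON job for it."""
    key = f"image:{image_id}"              # hypothetical key naming scheme
    client.set(key, image_bytes)           # SET: the value is stored as raw bytes
    job = json.dumps({"id": image_id, "key": key})
    client.lpush(IMAGE_QUEUE, job)         # LPUSH: enqueue the job description
    return key


# Usage, assuming redis-py is installed and a Redis server runs on localhost:
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   with open("frame.jpg", "rb") as f:
#       submit_image(client, 42, f.read())
```

Only the small JSON job travels through the queue; the image itself sits under its own key until a worker fetches it.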
Run as many processing nodes as you need, each popping jobs off the queue (BRPOP), and scale by adding more.
Have another `results` queue where workers put another JSON with their answers and the corresponding `id`.
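The worker side might look roughly like the sketch below. The queue names, the job fields (`id`, `key`) and the helper `work_once` are assumptions for illustration, as is the `process` callback that stands in for the real feature detection; `client` is again assumed to be a `redis.Redis` instance from redis-py.

```python
import json

IMAGE_QUEUE = "images:todo"      # hypothetical name of the job queue
RESULT_QUEUE = "images:results"  # hypothetical name of the results queue


def work_once(client, process, timeout=1):
    """Pop one job, process its image, and push the answer with the same id.

    Returns the image id that was handled, or None if the queue stayed empty.
    """
    popped = client.brpop(IMAGE_QUEUE, timeout=timeout)  # blocks up to `timeout` s
    if popped is None:
        return None
    _queue_name, raw_job = popped
    job = json.loads(raw_job)
    image_bytes = client.get(job["key"])   # None if the key no longer exists
    answer = process(image_bytes)          # must return something JSON-serialisable
    client.lpush(RESULT_QUEUE, json.dumps({"id": job["id"], "result": answer}))
    return job["id"]


# Usage, assuming redis-py and a local server:
#   import redis
#   client = redis.Redis()
#   while True:
#       work_once(client, lambda data: {"size": len(data)})
```

Process A can then BRPOP from the results queue and match each answer to its image by `id`.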
You can also specify a "Time-To-Live" so that data structures are automagically deleted after a given time.
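As a small illustration, redis-py's `set` accepts an `ex` argument giving the expiry in seconds. The helper name `store_with_ttl` and the 60-second default are assumptions; `client` is assumed to be a `redis.Redis` instance.

```python
def store_with_ttl(client, key, image_bytes, ttl_seconds=60):
    """Store the image so that Redis deletes it on its own after ttl_seconds."""
    client.set(key, image_bytes, ex=ttl_seconds)  # ex = expiry in seconds


# Usage, assuming redis-py and a local server:
#   import redis
#   store_with_ttl(redis.Redis(), "image:42", b"...raw bytes...", ttl_seconds=30)
```

This keeps images that were never processed from piling up in memory.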
You can also enable persistence across reboots, if you want to.
You can also add redundancy, if you want to cluster it.
I have done many examples; try searching for my username together with `Redis` and `image`. Hopefully, this will work.