mod_wsgi: Concise comparison of application groups / process groups / processes / threads?
Question
Background
I'm a Python dev trying to get comfortable using Apache and mod_wsgi in its 'daemon' mode to deploy Python web apps.
Problem
I've read a lot of the mod_wsgi docs but I'm still not sure I have a clear, concise understanding of how to think about and use the different execution contexts(?) available when deploying an app: application groups, process groups, processes, and threads. The docs go into great detail on each of them but I couldn't find a high-level explanation/summary/TLDR of how to use / think about them, so that someone unfamiliar with webserver administration can quickly grasp how they all fit together. I wish there was a TLDR for how to use each of them and how they relate to each other.
Answer 1
Score: 0
After reading through a lot of the docs, I think I have a concise picture of how I'm supposed to think about / use these different execution contexts, but I'm not sure if I'm mistaken or missing something:
They let you tweak how CPU and RAM are allocated to and shared between your web apps
Controlling the number of each type of execution context that exists on your server is like a different knob you can twist to tweak how those resources should be allocated.
They're set up in a hierarchy of parent-child relationships
The structures have a hierarchy of parent-child relationships as follows:
(From top-to-bottom, parent-to-child):
- Application groups
- Process groups
- Processes
- Threads
So an application group has one or more process groups "underneath" it, a process group has one or more processes, and a process has one or more threads.
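To make the hierarchy concrete, here is a minimal Apache/mod_wsgi configuration sketch. The server name, file paths, and group names are placeholders I made up for illustration; the directives themselves are the knobs for each level: WSGIDaemonProcess defines a daemon process group and how many processes and threads it gets, WSGIProcessGroup and WSGIApplicationGroup say which process group and application group the app is delegated to, and WSGIScriptAlias mounts the WSGI script itself.

```apache
# Placeholder names and paths; a sketch, not a drop-in configuration.
<VirtualHost *:80>
    ServerName www.my_first_app.com

    # Process group "my_first_app": 3 processes, each running 5 threads.
    WSGIDaemonProcess my_first_app processes=3 threads=5

    # Delegate this app's requests to that daemon process group.
    WSGIProcessGroup my_first_app

    # Application group (named sub-interpreter) the app should run in.
    # %{GLOBAL} means the main interpreter rather than a sub-interpreter.
    WSGIApplicationGroup %{GLOBAL}

    # The WSGI script (entry point) for the app.
    WSGIScriptAlias / /var/www/my_first_app/app.wsgi
</VirtualHost>
```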
What each does
- Application groups - These are used to allow multiple process groups (web apps) to share a Python "sub-interpreter" so that they don't all have to load their own copies of Python modules. So it's basically a way to save RAM when running multiple process groups (web apps). [1]
- Process groups - These correspond to your different web apps. Each web app should be in its own process group [2]. If a particular web app becomes more popular and starts getting more users, you can increase the number of processes (instances of the web app) that the process group has allocated to it.
- Processes and threads:
  - After thinking about it for a long time, I think the easiest way to understand processes and threads is with an analogy:
    - Imagine you run a restaurant where you are trying to cook plates of food based on incoming orders as quickly as possible.
    - To handle these orders, imagine you have a certain number of chef-robots that are controlled by remote employees who connect via the Internet.
    - Imagine that you're allowed to have more or fewer remote employees working at one time than you have robot bodies: if you have fewer remote employees than robot bodies on a given day, some of the robot bodies will just sit there doing nothing, whereas if you have more remote employees working on a given day than robot bodies, you'll have remote employees waiting to take turns using a robot body.
    - Imagine that you can set the maximum number of incoming orders each remote employee is allowed to work on at the same time (like cooking a hamburger, making pancakes, and frying eggs at the same time rather than doing them one at a time).
    - Imagine that the human chefs can end up needing to wait for various things before they can finish a plate of food. For example, they may need to wait for some food to finish cooking, or wait for a busboy to bring ingredients out of storage in the basement, or wait for the dishwasher to finish cleaning some piece of equipment they need to proceed.
    - Imagine that the human controllers can swap in and out of the robot bodies extremely quickly/efficiently, so that, for example, if one human chef-controller doesn't have anything to do for a few seconds (because, say, they're waiting for their food to finish cooking), they can instantly let another human controller who was "waiting on the sidelines" swap into the same robot body to work on their own orders.
    - Imagine that you dictate how the human chefs should prepare each plate of food: if you dictate an efficient method they get each plate done as quickly as possible, and if you dictate an inefficient method they end up wasting a lot of time.
  - How the analogy applies to a server, processes, and threads:
    - The restaurant is analogous to your server.
    - Each incoming food order the kitchen receives is analogous to an incoming web (HTTP) request.
    - Each finished plate of food sent out of the kitchen is analogous to your server's response to an incoming request.
    - The remote-controlled chefs are analogous to your server's CPU cores.
    - The human controllers of the robot bodies are analogous to your server's processes.
    - The number of orders each human chef is allowed to handle at the same time is analogous to the number of threads you assign per process.
    - The waiting for food to cook or ingredients to arrive is analogous to your server's processes/threads needing to wait for data to arrive from the database or from some external API before they can prepare their response.
    - Your dictating how each plate of food should be prepared is analogous to your application's code and the other tools you use (such as caches) to speed up the process of preparing responses.
  - Takeaway from the analogy (see the configuration sketch after this list):
    - How many processes (human chefs) you should have depends on how many CPU cores (robot bodies) you have and how much waiting around each process (human chef) is likely to be doing while preparing a response (plate of food).
    - Note that it can make sense to have more processes (humans) than CPU cores (robot bodies) if the preparation of the responses (plates of food) involves some waiting around.
    - How many threads (simultaneous orders) you should set per process (human chef) depends on how much waiting around the chefs are likely to be doing per order.
    - If you try to get your server (restaurant) to handle incoming requests (food orders) more quickly by increasing the number of threads (simultaneous orders) per process (human chef), you may actually end up slowing things down if preparing the response (plate of food) doesn't involve much waiting around, because the process (human chef) will have its "attention" divided.
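As a rough sketch of how that takeaway might translate into mod_wsgi settings (the app names, hostnames, paths, and numbers below are illustrative assumptions, not recommendations): an app that spends a lot of time waiting on I/O can get by with fewer processes and more threads per process, while an app that mostly burns CPU is usually better served by roughly one process per core and only a thread or two per process.

```apache
# Illustrative values only; the right numbers depend on your app and hardware.

# Mostly I/O-bound app (lots of waiting on the database / external APIs):
# fewer processes, more threads per process.
WSGIDaemonProcess my_first_app processes=2 threads=15

# Mostly CPU-bound app (little waiting): roughly one process per CPU core,
# only a couple of threads per process.
WSGIDaemonProcess number_cruncher processes=4 threads=2

<VirtualHost *:80>
    ServerName www.my_first_app.com
    WSGIScriptAlias / /var/www/my_first_app/app.wsgi process-group=my_first_app
</VirtualHost>

<VirtualHost *:80>
    ServerName crunch.example.com
    WSGIScriptAlias / /var/www/number_cruncher/app.wsgi process-group=number_cruncher
</VirtualHost>
```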
Example server situation
- Application Group 1: Python packages A, B, C
  - Process Group 1: www.my_first_app.com
    - Process 1:
      - Thread 1: (PAUSED) Currently handling a request for User 41235; waiting for the database to respond with data.
      - Thread 2: (PAUSED) Currently handling a request for User 51234; waiting for an external API to respond with data.
      - Thread 3: (PAUSED) Currently waiting for an incoming request.
      - Thread 4: (ACTIVE) Currently handling a request for User 21455; executing the code in route "get_products()".
    - Process 2:
      - (...)
  - Process Group 2: my_second_app.com
    - (...)
- Application Group 2: Python packages X, Y, Z
  - Process Group 1: resource-heavy-api.my_first_app.com
    - (...)
  - Process Group 2: my_third_app.net
    - (...)
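A hedged sketch of how a layout along the lines of the example above might be written in the Apache config, following this answer's model of the hierarchy. Every hostname, path, group name, and number here is a made-up placeholder; the process-group and application-group options on WSGIScriptAlias are what assign each app to a process group and an application group.

```apache
# Placeholders throughout; a sketch of the example layout above.

# Process groups whose apps are assigned to application group "app_group_1".
WSGIDaemonProcess my_first_app processes=2 threads=4
WSGIDaemonProcess my_second_app processes=1 threads=4

# Process groups whose apps are assigned to application group "app_group_2".
WSGIDaemonProcess heavy_api processes=2 threads=2
WSGIDaemonProcess my_third_app processes=1 threads=2

<VirtualHost *:80>
    ServerName www.my_first_app.com
    WSGIScriptAlias / /var/www/my_first_app/app.wsgi process-group=my_first_app application-group=app_group_1
</VirtualHost>

<VirtualHost *:80>
    ServerName my_second_app.com
    WSGIScriptAlias / /var/www/my_second_app/app.wsgi process-group=my_second_app application-group=app_group_1
</VirtualHost>

<VirtualHost *:80>
    ServerName resource-heavy-api.my_first_app.com
    WSGIScriptAlias / /var/www/heavy_api/app.wsgi process-group=heavy_api application-group=app_group_2
</VirtualHost>

<VirtualHost *:80>
    ServerName my_third_app.net
    WSGIScriptAlias / /var/www/my_third_app/app.wsgi process-group=my_third_app application-group=app_group_2
</VirtualHost>
```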
Comments