April 4, 2023 18:18:10

Can Docker automatically create personalized/dynamic containers in run-time?

Question

I am working on an application that will allow users to submit their Python scripts. These scripts will contain Language Models (LMs) that the app will use to calculate certain metrics.

I was planning on running these scripts inside Docker containers, for scalability and security reasons. They would function as "black boxes" that accept an input and return an output without the app needing to know what is going on inside the container.

For now, I simply need to get a proof-of-concept working, and I assumed that Docker allowed users to not only create containers manually, but also automatically. After hours of searching, I believe I was proven wrong. I have read about something called Kubernetes, but I am unsure if this is what I need.

So my question is simple: Is it possible to use only Docker for this, or do I need to learn other tools like Kubernetes to do this?

Additional Context

I've thought about using (just for the proof-of-concept) a Python program that calls the submitted code, but I have no idea how it would install the necessary packages from the submitted code's imports. I also don't know how I would keep the code running: LMs need to stay loaded to run, and if script1.py calls script2.py, script1.py won't continue executing until script2.py has finished, meaning I would have to wait for the LM to load every time I need to call its functions.
Also, I already have a docker-compose.yml file that automatically installs all the dependencies and containerizes the submitted Python scripts, but it must be run manually.

If you would like to look at some of the code for the containerization:

This Dockerfile builds an image containing the script that acts as a middleman between the LM and the server, and automatically installs the LM's required dependencies.

FROM python:3.9

WORKDIR /usr/app/src

COPY communicator.py ./
COPY lm_submission.py ./
COPY requirements.txt ./
RUN pip3 install -r requirements.txt

This docker-compose file creates the "server" (the component that gives inputs to the LMs and waits for outputs) and the communicator, but it has to be started manually. In theory, there would be many communicators, each running a different LM, and one server. The chatGPT_roberta_model parameter would be a variable that changes depending on the name of the LM being run.

version: '3.9'

services:
  communicator:
    build: .
    command: sh -c "sleep 2s; python3 ./communicator.py chatGPT_roberta_model"
    environment:
      LISTEN_HOST: server
      LISTEN_PORT: 5555
    ports:
      - '5556:5555'
    depends_on:
      - server

  server:
    build: ./server/
    command: sh -c "python3 ./server.py"
    environment:
      SEND_HOST: server
      SEND_PORT: 5555
    ports:
      - '5555:5555'
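
To illustrate the "variable parameter" idea above, here is a minimal sketch of a launcher that builds a `docker run` command for one communicator per LM name. It is only an assumption of how this could look: the image name `lm-communicator`, the network name `lm-net`, and both function names are hypothetical, and it presumes the image was already built from the Dockerfile above and that the server container is attached to the same user-defined network under the name `server`.

```python
import subprocess


def communicator_run_command(lm_name, image="lm-communicator", network="lm-net"):
    """Build the `docker run` argument list for one communicator container.

    `image` and `network` are hypothetical names, not taken from the post.
    """
    return [
        "docker", "run",
        "--detach",                           # start in the background and return
        "--name", f"communicator-{lm_name}",  # one container per LM
        "--network", network,                 # assumed network shared with the server
        "--env", "LISTEN_HOST=server",
        "--env", "LISTEN_PORT=5555",
        image,
        "python3", "./communicator.py", lm_name,
    ]


def start_communicator(lm_name):
    """Start the container; needs the Docker CLI and a running Docker daemon."""
    subprocess.run(communicator_run_command(lm_name), check=True)
```

Calling `start_communicator("chatGPT_roberta_model")` from the application would then start a container without anyone typing the command by hand; building the argument list in a separate function keeps it easy to inspect without Docker installed.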

Answer 1

Score: 0

If you use the right tools, it takes relatively little time to prepare your POC.

I suggest you start by exploring minikube, a lightweight, portable way to run a local Kubernetes cluster, which is well suited to this type of use case.

Take a look here: https://minikube.sigs.k8s.io/docs/start/

Companies usually have cloud providers or large Kubernetes test clusters available for these needs; minikube is more of a tool for local experimentation.

(I recommend Kubernetes because, as a container orchestrator, it is certainly the best choice and the closest to a possible production use case.)

Then read up on the Kubernetes Deployment and Service resources in order to deploy your "black-box" application.
A Deployment lets you run one or more Pods (instances of your application together with its OS and network environment), and a Service makes the application reachable inside the K8s cluster, outside it, or both.

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

https://kubernetes.io/docs/concepts/services-networking/service/
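
As a rough sketch of what a Deployment and a Service for one submitted LM could look like, the Python below builds both manifests as plain dictionaries. The function names, the image name, and the use of port 5555 (borrowed from the compose file in the question) are assumptions for illustration only. Kubernetes object names must be lowercase and may not contain underscores, so the sketch derives a compatible name from the LM name.

```python
def k8s_name(lm_name):
    """Derive a Kubernetes-friendly name (lowercase, hyphens instead of underscores)."""
    return lm_name.lower().replace("_", "-")


def build_manifests(lm_name, image, port=5555):
    """Return (deployment, service) dictionaries for one LM communicator."""
    name = k8s_name(lm_name)
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,  # one Pod per submitted LM
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "communicator",
                        "image": image,  # hypothetical image name
                        "command": ["python3", "./communicator.py", lm_name],
                        "ports": [{"containerPort": port}],
                    }],
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": labels,  # routes traffic to the Pods labelled above
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    return deployment, service
```

Each dictionary can be serialized with `json.dumps` and saved to a file; `kubectl apply -f` accepts JSON as well as YAML, so no extra YAML library is needed for a proof of concept.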

You can then ensure that these application instances (Pods) stay up and running and always available to the end user by following a few simple steps: https://stackoverflow.com/questions/65475195/multi-container-pod-with-command-sleep-k8

Finally, I suggest not running containers as the root user (if possible) and preventing users from accessing these Pods via SSH; developing a front end would be a better solution.
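
One way to apply the non-root advice is through a container-level `securityContext` in the Pod spec. The helper below is a small sketch, not a complete hardening recipe: the function name and the UID 1000 are arbitrary assumptions, and it presumes the application does not need to write to root-owned paths inside the image.

```python
import copy


def run_as_non_root(container, uid=1000):
    """Return a copy of a container spec that refuses to run as root.

    `uid` is an arbitrary non-zero user ID chosen for illustration.
    """
    hardened = copy.deepcopy(container)  # leave the caller's dictionary untouched
    hardened["securityContext"] = {
        "runAsNonRoot": True,               # kubelet rejects the container if it would run as UID 0
        "runAsUser": uid,
        "allowPrivilegeEscalation": False,  # block setuid-style privilege gains
    }
    return hardened
```

The returned dictionary can replace the original entry in a Pod template's `containers` list before the manifest is applied.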

PS: If you don't know how to write Kubernetes manifests, you can simply convert your Docker Compose file into them: https://kubernetes.io/docs/tasks/configure-pod-container/translate-compose-kubernetes/
