Should the data be validated independently in each microservice?
Question
I have a Python application built with FastAPI and a microservices architecture, and I validate data with Pydantic. As a result, many Pydantic schemas end up replicated across microservices. Suppose for now that I only have two microservices: "backend" and "ds-backend".
"backend" acts as the gateway, and "ds-backend" is the microservice in charge of processing, with pandas, the data that comes from "backend".
So, in "backend" I have a Pydantic schema to validate the data that I serve to the frontend:
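    from typing import Optional

    from pydantic import BaseModel

    class Monitoring(BaseModel):
        type: Optional[EnumType]  # EnumType is an Enum defined elsewhere in the project
        average: int
        max: int
        min: int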
Normally, I would have this same schema in "ds-backend", where I also validate the output data (so the data would be validated twice, once in "ds-backend" and again in "backend"). Should it be done like this? If not, what is the correct approach?
On the one hand, validating data that has already been validated wastes resources (Pydantic also allows you to create models without validation, which could be applied here). On the other hand, "ds-backend" may need to communicate with other microservices in the future, so I would need to define where data is validated and where it is not.
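For reference, here is a minimal sketch of the no-validation option mentioned above. It assumes Pydantic's `model_construct` (v2) or `construct` (v1) and picks whichever is available. The `EnumType` members and the `None` default on `type` are made up for illustration, and `build_unvalidated` is a hypothetical helper name.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, ValidationError


class EnumType(str, Enum):
    # Hypothetical members; the real enum is not shown in the question.
    CPU = "cpu"
    MEMORY = "memory"


class Monitoring(BaseModel):
    type: Optional[EnumType] = None  # default added here for illustration
    average: int
    max: int
    min: int


def build_unvalidated(data: dict) -> Monitoring:
    """Build a Monitoring instance without running any validation."""
    # Pydantic v2 calls this model_construct; v1 calls it construct.
    factory = getattr(Monitoring, "model_construct", None) or Monitoring.construct
    return factory(**data)


payload = {"type": "cpu", "average": 40, "max": 90, "min": 5}
validated = Monitoring(**payload)     # validates and coerces "cpu" to EnumType
trusted = build_unvalidated(payload)  # stores the values exactly as given

bad = {"type": "cpu", "average": "not-a-number", "max": 90, "min": 5}
try:
    Monitoring(**bad)
    rejected = False
except ValidationError:
    rejected = True                   # normal construction refuses bad data

unchecked = build_unvalidated(bad)    # the unvalidated path accepts it silently
```

The last line is the trade-off: skipping validation saves work only when the caller can fully trust the producer, because malformed values pass through unnoticed.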
Answer 1
Score: 1
Usually, you should assume that your microservice is reusable and may be called from many sources. Since different teams with different approaches may develop the callers, it is a sound decision to perform input and output validation in each of your services, so that they work as independent units.
If you have domain entities that are shared across multiple services, you have a few options:
- Replicate the logic in each service. This keeps your services and teams independent (as they are supposed to be), but it duplicates code across multiple code bases, which increases your maintenance costs and adds risk to your stack, since changes to these services may be uncoordinated.
- Create an independent Python library that can be used across multiple services. In this library, you can put the domain validations that are meant to be reused. One downside of this approach is that multiple services will depend on this library, so changes to it may have a big impact on your stack. Versioning becomes key here.
- Review your design. If multiple microservices are handling the same domain, you may be creating layers in your architecture that do not suit you well and add unnecessary complexity to your stack. Consider merging services or rethinking their responsibilities. For instance, if the backend service is acting as a gateway, maybe you should code it that way, so that it handles aspects such as external security and caching, and leave the domain validation to your ds-backend service (maybe changing those names as well).
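To make the second and third options more concrete, here is a rough sketch rather than a definitive layout. It assumes the schema lives in a shared, versioned package (the name `shared_schemas` is hypothetical), that ds-backend validates what it emits, and that the gateway can either re-validate or simply pass the payload through. The `type` field is omitted for brevity, plain Python stands in for the pandas work, and all function names are made up for illustration.

```python
from typing import List

from pydantic import BaseModel


# --- Would live in a shared, versioned package (hypothetical: shared_schemas) ---
class Monitoring(BaseModel):
    average: int
    max: int
    min: int


def to_dict(model: BaseModel) -> dict:
    """Serialize a model to a plain dict on Pydantic v1 or v2."""
    # model_dump is the v2 name; dict is the v1 name.
    dump = getattr(model, "model_dump", None) or model.dict
    return dump()


# --- ds-backend: owns the domain logic and validates its own output ---
def ds_backend_summary(values: List[int]) -> dict:
    result = Monitoring(
        average=sum(values) // len(values),
        max=max(values),
        min=min(values),
    )
    return to_dict(result)


# --- backend (gateway): re-validate with the shared schema, or pass through ---
def gateway_forward(payload: dict, revalidate: bool = True) -> dict:
    if revalidate:
        # Defensive check: catches a ds-backend that drifted from the contract.
        return to_dict(Monitoring(**payload))
    # Pure gateway: trust the upstream service and forward the payload as-is.
    return payload


summary = ds_backend_summary([10, 20, 60])
forwarded = gateway_forward(summary)
passthrough = gateway_forward(summary, revalidate=False)
```

With `revalidate=True` both services depend on the same schema version, which is where the versioning concern comes in. With `revalidate=False` the gateway stays thin and ds-backend is the single owner of domain validation.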
Hope this helps