使用哪种数据存储方式当你的实体具有不定数量的字段/属性时?

huangapple go评论120阅读模式
英文:

Which datastore to use when you have unbounded(dynamic) number of fields/attributes for an entity?

问题

我正在设计一个系统,在该系统中,我有一组固定的属性(一个实体),然后每个客户都有一些动态属性。
例如,customer_name、customer_id等是通用属性。
而order_id、patient_number、date_of_joining等是动态属性。

我了解到EVA是一种反模式。我希望在复杂查询中使用mysql和一个nosql数据存储的组合。我已经在使用elastic search。

我不能让映射因字段数量无限增长而爆炸。所以我制定了以下模型:

mysql:

customer, custom_attribute, custom_attribute_mapping, custom_attribute_value

elasticsearch中的嵌套文档数组:

[{
    "field_id" :123,
    "field_type" : "date",
    "value" : "01/01/2020" // 映射类型为date - 在插入数据时从mysql表中引用
}...]

我不能在es上使用扁平映射,因为我希望在自定义字段上也能执行范围查询。
是否有更好的方法?或者是另一种数据库的明显选择,我太幼稚以至于看不到吗?

如果需要修改问题以添加更多信息,我欢迎反馈。

附注:我将拥有大量数据(数量在数千万条记录以上)

英文:

I am designing a system where I have a fixed set of attributes (an entity) and then some dynamic attributes per client.
e.g. customer_name, customer_id etc are common attributes.
whereas order_id, patient_number, date_of_joining etc are dynamic attributes.

I read about EVA being an anti-pattern. I wish to use a combination of mysql and a nosql datastore for complex queries. I already use elastic search.

I cannot let the mapping explode with unlimited number of fields. So I have devised the following model:

mysql :

customer, custom_attribute, custom_attribute_mapping, custom_attribute_value

array of nested documents in elasticsearch :

[{
    "field_id" :123,
    "field_type" : "date",
    "value" : "01/01/2020" // mapping type date - referred from mysql table at time on inserting data
}...]

I cannot use flattened mappings on es, as I wish to use range queries as well on custom fields.
Is there a better way to do it? Or an obvious choice of another database that I am too naive to see?

If I need to modify the question to add more info, I'd welcome the feedback.

P.S. : I will have large data (order in 10s of millions of records)

答案1

得分: 2

为什么不使用像mongoDB这样的纯NoSQL数据库呢?
或者作为不太流行的解决方案,我建议使用三元存储,如virtuoso或其他类似的存储方式。然后,您可以使用SPARQL作为查询语言,并且有许多针对这些存储的驱动程序,例如Java的Jena

三元存储允许您以“<主题 谓词 对象>”的格式存储数据,其中您的情况下主题是客户ID,谓词是属性,对象将是值。所有标准和动态属性都将在同一张表中。

三元存储可以在任何数据库管理系统中建模为具有3列的表。

英文:

Why not using something like mongoDB as a pure NoSQL database.
Or as non-popular solution, I would recommend triple stores such as virtuoso or any other similar ones. Then you can use SPARQL as a query language over them and there are many drivers for such stores, e.g. Jena for Java.

Triples stores allow you to store data in the format of &lt;Subject predicate object&gt;
wherein your case subject is the customer id, predicates are the attributes and object will be the value. All standard and dynamic attributes will be in the same table.

Triple stores can be modeled as 3 columns table in any database management system.

huangapple
  • 本文由 发表于 2020年7月21日 18:31:46
  • 转载请务必保留本文链接:https://go.coder-hub.com/63012577.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定