Are multiple vertex labels in Gremlin/JanusGraph possible, or is an alternative solution better?

Question

I am working on an import runner for a new graph database.

It needs to work with:

  • Amazon Neptune - a Gremlin implementation with great infrastructure support in production, but it is a pain to work with locally and does not support Cypher. No visualization tool is provided.

  • JanusGraph - a Gremlin implementation that is easy to work with locally, but it requires heavy investment to support in production, hence the use of Amazon Neptune. No visualization tool is provided.

  • Neo4j - has an excellent visualization tool, and the Cypher language feels very familiar. It even works with Gremlin clients, but it requires heavy investment to support in production, and no visualization tool that works with Gremlin implementations appears to be anywhere near as good as the one in Neo4j.

So I am creating a graph where each Entity (Node/Vertex) has multiple Types (Labels), some of which are orthogonal to each other as well as multi-dimensional.

For example, an Entity representing an order made online would be labeled as Order, Online, Spend, Transaction.

             | Spend       Chargeback
----------------------------------------
 Transaction | Purchase    Refund
 Line        | Sale        Return

Zooming into the Spend column.

          | Online      Instore
----------------------------------------
 Purchase | Order       InstorePurchase
 Sale     | OnlineSale  InstoreSale 

In Neo4j and its Cypher query language, this proves to be very powerful for creating Relationships/Edges across multiple types without explicitly knowing which transaction_id values are in the graph:

MATCH (a:Transaction), (b:Line)
WHERE a.transaction_id = b.transaction_id
MERGE (a)<-[edge:TRANSACTED_IN]-(b)
RETURN count(edge);

The problem is that Gremlin/TinkerPop does not natively support multiple Labels for its Vertices.

Server implementations like AWS Neptune support this using a delimiter, e.g. Order::Online::Spend::Transaction, and the Gremlin client does support it for a Neo4j server, but I haven't been able to find an example where this works for JanusGraph.
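
The delimiter convention described above can be illustrated without a graph server. What follows is a minimal Python sketch, not Neptune's or JanusGraph's implementation: it assumes the labels are stored as a single `::`-joined string and that matching one label means checking membership in the split set. The helpers `split_labels` and `has_label` and the sample vertices are hypothetical.

```python
# Sketch only: emulates "::"-delimited multi-labels with plain Python.
# split_labels and has_label are hypothetical helpers, not part of any
# Gremlin client or server API.

DELIMITER = "::"


def split_labels(label: str) -> frozenset:
    """Split a delimited label string into its individual labels."""
    return frozenset(part for part in label.split(DELIMITER) if part)


def has_label(vertex: dict, wanted: str) -> bool:
    """Return True if the vertex carries the wanted label."""
    return wanted in split_labels(vertex["label"])


# Hypothetical sample data following the tables in the question.
vertices = [
    {"id": 1, "label": "Order::Online::Spend::Transaction", "transaction_id": "t-1"},
    {"id": 2, "label": "OnlineSale::Online::Spend::Line", "transaction_id": "t-1"},
    {"id": 3, "label": "Refund::Chargeback::Transaction", "transaction_id": "t-2"},
]

# Select by a single label even though each vertex carries several.
transactions = [v["id"] for v in vertices if has_label(v, "Transaction")]
lines = [v["id"] for v in vertices if has_label(v, "Line")]

print(transactions)
print(lines)
```

The point of the sketch is only that a delimited label behaves like a set of labels; a server without such a convention sees the whole string as one opaque label.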

Ultimately, I need to be able to run a Gremlin query equivalent to the Cypher one above:

g
  .V().hasLabel("Line").as("b")
  .V().hasLabel("Transaction").as("a")
  .where("b", eq("a")).by("transaction_id")
  .addE("TRANSACTED_IN").from("b").to("a");

So there are multiple questions here:

  1. Is there a way to make JanusGraph accept multiple vertex labels?
  2. If not possible, or this is not the best approach, should there be an additional vertex property containing a list of labels?
  3. In the case of option 2, should the label name be the high-level label (Transaction) or the low-level label (Order)?

Answer 1

Score: 3

> Is there a way to make JanusGraph accept multiple vertex labels?

No, there is not a way to have multiple vertex labels in JanusGraph.

> If not possible, or this is not the best approach, should there be
> an additional vertex property containing a list of labels?
>
> In the case of option 2, should the label name be the high-level label
> (Transaction) or the low-level label (Order)?

I'll answer these two together. Based on what you have described above, I would create a single label, probably named Transaction, with properties such as Location (Online or InStore) and Type (Purchase, Refund, Return, Chargeback, etc.). From how you describe the problem, you are really talking about a single entity, a Transaction, and all the other items you are using as labels (Online/InStore, Spend/Refund) are really just additional metadata about how that Transaction occurred. As such, this approach allows simple filtering on one or more of these properties to achieve anything that could be done with the multiple labels you are using in Neo4j.
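
To make the single-label approach concrete, here is a minimal Python sketch that models vertices as plain dictionaries. It is an illustration under assumptions, not JanusGraph code: the property names `location` and `type` follow the suggestion above, while the sample data and the `filter_vertices` helper are hypothetical.

```python
# Sketch only: one vertex label plus properties, modelled with plain dicts.
# The filter_vertices helper and the sample data are hypothetical.


def filter_vertices(vertices, label, **props):
    """Keep vertices with the given label whose properties match all of props."""
    return [
        v
        for v in vertices
        if v["label"] == label
        and all(v["properties"].get(key) == value for key, value in props.items())
    ]


vertices = [
    {"id": 1, "label": "Transaction", "properties": {"location": "Online", "type": "Purchase"}},
    {"id": 2, "label": "Transaction", "properties": {"location": "InStore", "type": "Purchase"}},
    {"id": 3, "label": "Transaction", "properties": {"location": "Online", "type": "Refund"}},
]

# Roughly what hasLabel("Transaction").has("location", "Online") selects in Gremlin.
online = filter_vertices(vertices, "Transaction", location="Online")

# Filtering on two properties takes the place of matching on two extra labels.
online_purchases = filter_vertices(
    vertices, "Transaction", location="Online", type="Purchase"
)

print([v["id"] for v in online])
print([v["id"] for v in online_purchases])
```

Each property filter plays the role that an extra label plays in Neo4j. How such filters perform on a real graph depends on indexing, which this sketch ignores.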

huangapple
  • Posted on January 7, 2020, 02:26:05
  • Please keep the link to this article when reposting: https://go.coder-hub.com/59617097.html