Getting children and grandchildren in a single query in Gremlin

Question

I am currently rewriting some queries from Cypher to Gremlin.
I want to create a single query that, for a specific starting node, returns:

  1. The children, selected by the edge property 'prob'. We want up to 10 children with the highest probability (edges sorted by 'prob' in descending order).
  2. For every child, up to 10 grandchildren with the highest probability (selected the same way as in point 1).

I attached an image showing the result we would like to get, assuming up to 2 nodes per level instead of 10 for simplicity. The result should also include all the properties of the children and grandchildren.

Thank you!

Edit:
I came up with the following solution. Maybe someone can point out a better approach, but it seems the query returns the correct result.

g.V('123')
  .inE().order().by('prob', Order.desc).limit(10).outV().as('c')
  .project('c', 'gc')
    .by(valueMap(true))
    .by(inE().order().by('prob', Order.desc).limit(10).outV().valueMap(true).fold())

Answer 1

Score: 1

You should be able to use the local step in a case like this, where you want a limited number of grandchildren for each child. As I do not have your data set, here are some examples from the air routes data set that I think map well to your use case. Before looking at the edges, here is a basic example that shows how local can help (I used a limit of 2 just to keep things simple).

g.V('44').
  out().limit(2).
  local(out().limit(2)).
  path().
    by('code')

This yields:

1	path[SAF, DFW, ASE]
2	path[SAF, DFW, GEG]
3	path[SAF, LAX, YLW]
4	path[SAF, LAX, ASE]
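
To make the role of local more concrete, here is a small sketch (not part of the original answer) that contrasts a global limit with a per-child limit. It assumes the same air routes graph and starting vertex '44' used above. Which vertices come back depends on the data and on the traversal order, so no output is shown.

```groovy
// Global limit: the second limit(2) caps the whole traverser stream,
// not each child, so at most 2 paths come back in total.
g.V('44').
  out().limit(2).
  out().limit(2).
  path().
    by('code')

// Per-child limit: local() evaluates out().limit(2) separately for each
// incoming traverser, so each of the 2 children can contribute up to 2 paths
// (at most 4 paths in total).
g.V('44').
  out().limit(2).
  local(out().limit(2)).
  path().
    by('code')
```

The first level does not need local here because there is only one starting vertex, so a plain limit already applies to that vertex alone.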

In the air routes data set, each edge has a "dist" property (the route distance). We can use that to simulate your use case.

g.V('44').
  outE('route').order().by('dist',desc).inV().limit(2).
  local(outE('route').order().by('dist',desc).inV().limit(2)).
  path().
    by('code').
    by('dist')

We can see that this picks the longest routes:

1	path[SAF, 708, LAX, 8756, SIN]
2	path[SAF, 708, LAX, 8372, AUH]
3	path[SAF, 549, DFW, 8574, SYD]
4	path[SAF, 549, DFW, 8105, HKG]
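
Mapped back to the question, the same pattern might look like the sketch below. It rests on several assumptions taken from the question rather than from any tested data: the starting vertex id '123', the edge property 'prob', and children being reached through incoming edges (inE().outV()). The use of elementMap() is also an assumption about the environment, since that step requires TinkerPop 3.4.4 or later; on older versions valueMap(true), as in the question, plays a similar role for vertices.

```groovy
// Sketch only: ids, property names and edge direction are assumed from the question.
g.V('123').
  // Top 10 children: order the incoming edges by 'prob', highest first.
  inE().order().by('prob', desc).limit(10).outV().
  // Top 10 grandchildren per child: local() applies the ordering and
  // the limit separately for each child.
  local(inE().order().by('prob', desc).limit(10).outV()).
  path().
    // One modulator is applied to every path element (vertices and edges),
    // returning ids, labels and properties.
    by(elementMap())
```

Each result is a path of the form start vertex, edge, child, edge, grandchild. A child with no incoming edges produces no path in this form, whereas the project()/fold() query in the question still returns that child with an empty grandchildren list, so the two shapes are not interchangeable when childless nodes matter.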

Posted by huangapple on 2023-02-19 18:55:30
Source: https://go.coder-hub.com/75499617.html