Neo4j Cypher, exclude relationship from counting in variable-length path

Question

We can limit a variable-length FIRST|SECOND path using this pattern:

(a)-[r:FIRST|SECOND*..3]->(b)

But what is the best way to return FIRST|SECOND paths that contain up to 3 SECOND relationships?

Answer 1

Score: 2

This turned out to be a very hard problem. Your use case does not allow there to be an upper bound on the variable-length path pattern (which is normally best practice), because the first (or second, or third) SECOND relationship could appear after any number of relationships.

Although the following query does not impose any upper bounds, it also does not attempt to extend a subpath any further after finding each SECOND relationship, so it avoids needlessly searching to the end of every possible path. The FIRST*0.. syntax uses a zero lower bound and means "match paths having 0 or more consecutive FIRST relationships".

This query is able to find all paths with up to 3 SECOND relationships by chaining together 3 separate subpath queries, and then using the APOC method apoc.path.combine to stitch together the subpaths into the final full paths. The NULL checks are used to ignore subpaths that found no match.

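MATCH (a:Test {testid: 1})
// Each subpath: zero or more FIRST relationships, then exactly one SECOND
OPTIONAL MATCH p1=(a)-[:FIRST*0..]->()-[:SECOND]->(b)
OPTIONAL MATCH p2=(b)-[:FIRST*0..]->()-[:SECOND]->(c)
OPTIONAL MATCH p3=(c)-[:FIRST*0..]->()-[:SECOND]->(d)
// Stitch p1 and p2 together; p12 is NULL when p2 found no match
WITH p1, p3,
  CASE WHEN p2 IS NOT NULL THEN apoc.path.combine(p1, p2) END AS p12
// Stitch p3 on as well; p123 is NULL when p3 found no match
WITH p1, p12,
  CASE WHEN p3 IS NOT NULL THEN apoc.path.combine(p12, p3) END AS p123
// COLLECT skips NULLs, so only the paths that actually exist are returned
WITH (COLLECT(p1) + COLLECT(p12) + COLLECT(p123)) AS paths
UNWIND paths AS p
RETURN p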

Caveat: this query is not efficient unless most paths contain at least 3 SECOND relationships. If that is not what your graph looks like, then create a new question providing details of your data model and how densely it is populated with SECOND relationships. Your data model may need to be modified.
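
To try the query above on something small, a throwaway graph like the following may help. This is only a sketch: the Stop label and the name property are made up for illustration, and only the Test label and testid property come from the query itself. It builds a single chain with four SECOND relationships, so it also shows where the search stops. The query also needs the APOC plugin to be installed for apoc.path.combine.

```cypher
// Hypothetical sample chain (Stop and name are illustrative assumptions):
// a -FIRST-> n1 -SECOND-> n2 -FIRST-> n3 -SECOND-> n4 -SECOND-> n5 -SECOND-> n6
CREATE (:Test {testid: 1})-[:FIRST]->
       (:Stop {name: 'n1'})-[:SECOND]->
       (:Stop {name: 'n2'})-[:FIRST]->
       (:Stop {name: 'n3'})-[:SECOND]->
       (:Stop {name: 'n4'})-[:SECOND]->
       (:Stop {name: 'n5'})-[:SECOND]->
       (:Stop {name: 'n6'})
```

On this data the query should return three paths, ending at n2, n4 and n5 (one, two and three SECOND relationships respectively). The node n6 should not appear, because reaching it would take a fourth SECOND relationship.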

Answer 2

Score: 1

There are two cases I can think of.

a. SECOND is the last relationship linked to b. Something like this:

(a)-[:FIRST]->()-[:SECOND]->()-[:SECOND]->()-[:SECOND]->(b)

For this, you can try this query:

MATCH path=(a)-[r:FIRST|SECOND*1..]->()-[:SECOND]->(b)
WHERE SIZE([r1 in relationships(path) WHERE type(r1) = 'SECOND' | r1]) = 3
RETURN path

In this query, I fix the type of the last relationship linked to b as SECOND, and check whether the path contains exactly 3 SECOND relationships.

b. Another case is that a relationship of either type can be the last one linked to b, provided the path contains 3 SECOND relationships. Something like this:

(a)-[:SECOND]->()-[:SECOND]->()-[:SECOND]->()-[:FIRST]->(b)

For this case, try the following query:

MATCH path=(a)-[r:FIRST|SECOND*1..]->(b)
WHERE SIZE([r1 in relationships(path) WHERE type(r1) = 'SECOND' | r1]) = 3
RETURN path

As noted by @cybersam in the comments, you should put an upper bound that suits your purpose on the variable-length pattern to avoid out-of-memory errors. Also, if up to 3 SECOND relationships are required, then this query will work:

MATCH path=(a)-[r:FIRST|SECOND*1..10]->(b)
WHERE SIZE([r1 in relationships(path) WHERE type(r1) = 'SECOND' | r1]) <= 3
RETURN path

Note that I used an upper bound of 10 here; you can adjust it as needed.
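
As a possible variation on the bounded query, the start node can be anchored and the SECOND count computed once in a WITH clause, so it can be both filtered on and returned. This is a sketch under assumptions: the Test label and testid property are borrowed from the other answer, and secondCount is just an illustrative alias. The hop limit of 10 is kept from the query above.

```cypher
// Assumption: the start node is identified as (:Test {testid: 1})
MATCH path = (a:Test {testid: 1})-[:FIRST|SECOND*1..10]->(b)
// Count only the SECOND relationships; FIRST ones do not count toward the limit
WITH path,
     size([r IN relationships(path) WHERE type(r) = 'SECOND']) AS secondCount
WHERE secondCount <= 3
RETURN path, secondCount
ORDER BY secondCount, length(path)
```

Anchoring a matters because an unanchored (a) makes the pattern start from every node in the graph, which multiplies the number of candidate paths that have to be expanded before the WHERE filter is applied.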

huangapple
  • Posted on March 15, 2023 at 18:03:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75743172.html