Neo4j Cypher, exclude relationship from counting in variable-length path
Question
We can limit a variable-length FIRST|SECOND path using this pattern:
(a)-[r:FIRST|SECOND*..3]->(b)
But what is the best way to return FIRST|SECOND paths that contain up to 3 SECOND relationships?
Answer 1
Score: 2
This turned out to be a very hard problem. Your use case does not allow an upper bound on the variable-length path pattern (which is normally best practice), because the first (or second, or third) SECOND relationship could appear after any number of relationships.
Although the following query does not impose any upper bounds, it also does not attempt to extend a subpath any further after finding each SECOND relationship, so it avoids needlessly searching to the end of every possible path. The FIRST*0.. syntax uses a zero lower bound and means "match paths having 0 or more consecutive FIRST relationships".
This query is able to find all paths with up to 3 SECOND relationships by chaining together 3 separate subpath queries, and then using the APOC function apoc.path.combine to stitch the subpaths together into the final full paths. The NULL checks are used to ignore subpaths that found no match.
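// Anchor on the start node.
MATCH (a:Test {testid: 1})
// Each OPTIONAL MATCH finds a subpath of 0 or more FIRST relationships ending in one SECOND.
OPTIONAL MATCH p1=(a)-[:FIRST*0..]->()-[:SECOND]->(b)
OPTIONAL MATCH p2=(b)-[:FIRST*0..]->()-[:SECOND]->(c)
OPTIONAL MATCH p3=(c)-[:FIRST*0..]->()-[:SECOND]->(d)
// Stitch the subpaths together, skipping any that found no match.
WITH p1, p3,
  CASE WHEN p2 IS NOT NULL THEN apoc.path.combine(p1, p2) END AS p12
WITH p1, p12,
  CASE WHEN p3 IS NOT NULL THEN apoc.path.combine(p12, p3) END AS p123
// COLLECT ignores NULLs, so only the paths that exist end up in the list.
WITH (COLLECT(p1) + COLLECT(p12) + COLLECT(p123)) AS paths
UNWIND paths AS p
RETURN p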
Caveat: this query is not efficient unless most paths contain at least 3 SECOND relationships. If that is not what your graph looks like, then create a new question providing details of your data model and how densely it is populated with SECOND relationships. Your data model may need to be modified.
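To try the query above on a scratch database, a small test graph helps. The following is only a sketch: the Test label and testid property come from the query, but the extra nodes and their testid values (2 through 5) are made up for illustration.

```cypher
// Hypothetical sample data: one chain starting at the node the query anchors on.
// Only (:Test {testid: 1}) appears in the answer; the other nodes are assumptions.
CREATE (:Test {testid: 1})-[:FIRST]->(:Test {testid: 2})-[:SECOND]->(:Test {testid: 3})
       -[:FIRST]->(:Test {testid: 4})-[:SECOND]->(:Test {testid: 5})
```

This chain contains only two SECOND relationships, so the third OPTIONAL MATCH finds nothing and p3 is NULL, which exercises the NULL check in the second CASE expression.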
Answer 2
Score: 1
There are two cases I can think of.
a. SECOND is the last relationship linked to b. Something like this:
(a)-[:FIRST]->()-[:SECOND]->()-[:SECOND]->()-[:SECOND]->(b)
For this, you can try this query:
MATCH path=(a)-[r:FIRST|SECOND*1..]->()-[:SECOND]->(b)
WHERE SIZE([r1 in relationships(path) WHERE type(r1) = 'SECOND' | r1]) = 3
RETURN path
In this query, I fix the type of the last relationship linked to b as SECOND, and check whether the count of SECOND relationships on the path is 3.
b. In the other case, any relationship can be linked to b at the end of the path, provided there are 3 SECOND relationships on the path. Something like this:
(a)-[:SECOND]->()-[:SECOND]->()-[:SECOND]->()-[:FIRST]->(b)
For this case, try the following query:
MATCH path=(a)-[r:FIRST|SECOND*1..]->(b)
WHERE SIZE([r1 in relationships(path) WHERE type(r1) = 'SECOND' | r1]) = 3
RETURN path
As noted by @cybersam in the comments, to avoid out-of-memory errors you should put an upper bound that suits your purpose in the pattern match. Also, if up to 3 SECOND relationships are required, then this query will work:
MATCH path=(a)-[r:FIRST|SECOND*1..10]->(b)
WHERE SIZE([r1 in relationships(path) WHERE type(r1) = 'SECOND' | r1]) <= 3
RETURN path
Note that I have used an upper bound of 10 here; you can modify it as needed.
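If it helps to see how many SECOND relationships each result contains, the count can be computed once and reused. This is a minimal sketch rather than a tuned query: anchoring on (a:Test {testid: 1}) is borrowed from the other answer as an assumption about the data, secondCount is a name chosen here, and the bound of 10 is still arbitrary.

```cypher
// Assumed start node; replace with whatever identifies your own starting point.
MATCH path = (a:Test {testid: 1})-[:FIRST|SECOND*1..10]->(b)
// Count the SECOND relationships on each candidate path once.
WITH path, size([r IN relationships(path) WHERE type(r) = 'SECOND']) AS secondCount
// Keep only paths with at most 3 SECOND relationships.
WHERE secondCount <= 3
RETURN path, secondCount
ORDER BY length(path)
```

Anchoring the start node keeps the variable-length expansion from starting at every node in the graph, which matters for the same memory reasons as the upper bound.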