Get elements along with their IDs from Neo4j using the Python driver

Question

I am using the Python Neo4j driver (5.5.0) to query data from a Neo4j Aura database for use in my Python application. However, this data is insufficient for my purpose (building a graph visualization), as it only returns the node properties and not the IDs or labels. The same goes for edges: no ID, and no IDs for the connected nodes (though it does give the connected nodes' properties). I would really like to have the IDs to ensure the graphs are consistent.

I have the following Neo4j Cypher query:

f" MATCH (d:Document)",
f" WHERE d.doc_id IN {doc_ids}",
f" UNWIND d as doc",
f"  CALL {{",
f"      WITH doc",
f"      MATCH (doc:Document)-[de:CONTAINS_ENTITY]->(e:Entity_node)-[ec:ENTITY_CONCEPT_ASSOCIATION]->(c: Concept)",
f"      WITH doc, ",
f"      {{ sub: doc, rel_id: id(de), rel_type: type(de), obj: e }} as contains,",
f"      {{ sub: e, rel_id: id(ec), rel_type: type(ec), obj: c }} as represents LIMIT {num_of_ents}",
f"      MATCH (a:Author)-[ad:AUTHORED]->(doc)",
f"      RETURN", 
f"          {{ contains: contains, represents: represents }} as entities,",
f"          {{ sub: a, rel_id: id(ad), rel_type: type(ad), obj: doc }} as authors",
f"  }}",
f" RETURN doc, collect(entities) as entities, collect(authors) as authors"

which I am passing to the Neo4j driver's session.run() function.

It is a bit complex and largely irrelevant, so let's say this is the query:

f" MATCH (d:Document)",
f" WHERE d.doc_id IN {doc_ids}",
f" RETURN d"

This would return a response like:

[
    {
        "d": {
            "title": "Coronavirus and paramyxovirus in bats from Northwest Italy",
            "doc_id": "a03517f26664be79239bcdf3dbb0966913206a86"
        }
    },
    ...
]

However, the same query returns a differently formatted response in Neo4j Browser:

[
  {
    "identity": 23016,
    "labels": [
      "Document"
    ],
    "properties": {
      "title": "Coronavirus and paramyxovirus in bats from Northwest Italy",
      "doc_id": "a03517f26664be79239bcdf3dbb0966913206a86"
    },
    "elementId": "23016"
  },
  ...
]

These responses contain the ID of the node as well as the labels, both of which are necessary for my application. Moreover, the relationships contain start and end IDs too.

How do I get these values using the Neo4j Python driver? I have tried all the functions available on the Result and Record objects (data(), values(), items()), but none give the IDs or labels. The graph() function gives nodes with IDs but no relationships at all (empty list).

I am aware that I could use the id() and labels() functions in the Cypher query itself, but given the size of my query, that seems to increase the response time considerably.

The fact that the graph() function has IDs tells me that the initial Result object holds the IDs somewhere. How can I access them?

Answer 1

Score: 0

The "id" and "labels" values are managed by the Neo4j database.
To access them in a query, you can use the built-in Cypher functions described in the docs: https://neo4j.com/docs/cypher-manual/current/functions/.

Based on your simplified example, getting the labels and the ID would look like this:

f" MATCH (d:Document)",
f" WHERE d.doc_id IN {doc_ids}",
f" WITH *, labels(d) as doc_labels",
f" WITH *, id(d) as doc_id",
f" RETURN doc_id, doc_labels"

Or, assuming that "your" doc_id is actually the node ID you are searching for and you just want to compare it in the WHERE clause with an external parameter "doc_ids":

f" MATCH (d:Document)",
f" WHERE id(d) IN {doc_ids}",
f" WITH *, labels(d) as doc_labels",
f" WITH *, id(d) as doc_id",
f" RETURN doc_id, doc_labels"
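As a side note, the queries above splice doc_ids into the query text with an f-string. A minimal sketch of the same lookup using a Cypher query parameter instead is shown below. The helper name build_query is hypothetical, and the sketch assumes a Neo4j 5 server, where elementId() is available and id() is deprecated. The driver call itself is left as a comment because it needs a live database.

```python
from typing import Any, Dict, List, Tuple

# $doc_ids is a Cypher parameter; the driver sends the list separately
# from the query text, so no string formatting is needed.
QUERY = (
    "MATCH (d:Document) "
    "WHERE d.doc_id IN $doc_ids "
    "RETURN elementId(d) AS doc_element_id, labels(d) AS doc_labels"
)


def build_query(doc_ids: List[str]) -> Tuple[str, Dict[str, Any]]:
    """Return the query text and its parameter map (hypothetical helper)."""
    return QUERY, {"doc_ids": list(doc_ids)}


# Usage sketch (assumes an already configured `driver` object):
# query, params = build_query(["a03517f26664be79239bcdf3dbb0966913206a86"])
# with driver.session() as session:
#     rows = [record.data() for record in session.run(query, params)]
```

Keeping the query text constant also lets the server reuse its cached plan across calls with different ID lists.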

The labels come back as a list of strings (a Cypher data type) containing all labels attached to the node. The ID is an integer value.

The result of the query should look like this:

doc_id        doc_labels        
0             ["Document", "second_label"]
1             ["Document"]

Greetings,
ottonormal
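A possible alternative that avoids extra Cypher functions: Record.data() flattens nodes to plain property dictionaries, but indexing the record directly (for example record["d"]) should give a neo4j.graph.Node object, which carries element_id and labels alongside the properties. Relationship objects similarly expose type, start_node and end_node. The sketch below is duck-typed so it does not import the driver. The helper names node_to_dict and relationship_to_dict are hypothetical, and the output keys merely imitate the Neo4j Browser format. As far as I can tell, Result.graph() only lists relationships when the query returns relationship values themselves (for example RETURN d, de), not just id(de) and type(de).

```python
from typing import Any, Dict


def node_to_dict(node: Any) -> Dict[str, Any]:
    """Flatten a driver Node into a Browser-like dictionary."""
    return {
        "elementId": node.element_id,
        "labels": sorted(node.labels),  # labels is a frozenset on Node
        "properties": dict(node.items()),
    }


def relationship_to_dict(rel: Any) -> Dict[str, Any]:
    """Flatten a driver Relationship, including both endpoint IDs."""
    return {
        "elementId": rel.element_id,
        "type": rel.type,
        "startNodeElementId": rel.start_node.element_id,
        "endNodeElementId": rel.end_node.element_id,
        "properties": dict(rel.items()),
    }


# Usage sketch (assumes an already configured `driver` object and a query
# that returns the node and the relationship themselves):
# with driver.session() as session:
#     result = session.run(
#         "MATCH (d:Document)-[de:CONTAINS_ENTITY]->(e) RETURN d, de LIMIT 5"
#     )
#     for record in result:
#         print(node_to_dict(record["d"]), relationship_to_dict(record["de"]))
```

Iterate over the Result and index each Record rather than calling data(), since data() is the step that drops the IDs and labels.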

huangapple
  • Posted on February 14, 2023, 22:41:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75449419.html