Can Neo4j create the graph automatically from a JSON file if the relationships are defined in the file?

Question

I have a JSON file that defines the nodes and their relationships. It looks something like this:

> {"p":{"type":"node","id":"0","labels":["Paintings"],"properties":{"date":"1659-01-01T00:00:00","img":"removed-for-brevity(RFB)","name":"King Caspar","sitelink":"1","description":"RFB","exhibit":"RAB","uri":"RFB"}},"r":{"id":"144","type":"relationship","label":"on_MATERIAL","start":{"id":"0","labels":["Paintings"]},"end":{"id":"2504","labels":["Material"]}},"n":{"type":"node","id":"2504","labels":["Material"],"properties":{"name":"oak","sitelink":5,"description":"RFB","uri":"RFB"}}}
>
"p" is the first node, "r" is the relationship, "n" is the second node.

Is it possible for Neo4j to create a graph automatically from this JSON file, without having to define the nodes and relationships manually in Cypher?

I am fairly new to Neo4j. I tried following the examples on the Load JSON page, but they define the nodes and their relationships manually, which I want to avoid.

Answer 1

Score: 1

No, there is no automated way, and even if there were, the generated result could be suboptimal or even wrong for your use cases.

You need to design the graph data model (node labels, relationship types, etc.) yourself. There are many considerations (like your use cases, and the necessary indexes and constraints) that are not revealed by a simple JSON data dump. Also, you need to understand the schema of the JSON and determine how to map that to your data model.
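
To make the mapping step concrete, here is a minimal sketch of how one record from the question could be turned into parameterized Cypher statements. It is only an illustration under stated assumptions: the helper name `record_to_statements` is hypothetical, and keying nodes on the `id` property with `MERGE` is one possible modeling choice, not something the JSON dictates.

```python
import json


def record_to_statements(record):
    """Map one {"p", "r", "n"} record to a list of (cypher, params) pairs.

    Hypothetical helper: using MERGE keyed on `id` is an assumed design choice.
    """
    statements = []
    for key in ("p", "n"):
        node = record[key]
        label = node["labels"][0]
        # Labels cannot be query parameters, so the label is interpolated in backticks.
        cypher = f"MERGE (x:`{label}` {{id: $id}}) SET x += $props"
        params = {"id": node["id"], "props": node.get("properties", {})}
        statements.append((cypher, params))

    rel = record["r"]
    cypher = (
        f"MATCH (a:`{rel['start']['labels'][0]}` {{id: $start_id}}), "
        f"(b:`{rel['end']['labels'][0]}` {{id: $end_id}}) "
        f"MERGE (a)-[r:`{rel['label']}` {{id: $rel_id}}]->(b)"
    )
    params = {
        "start_id": rel["start"]["id"],
        "end_id": rel["end"]["id"],
        "rel_id": rel["id"],
    }
    statements.append((cypher, params))
    return statements


# A shortened version of the record shown in the question.
line = (
    '{"p":{"type":"node","id":"0","labels":["Paintings"],"properties":{"name":"King Caspar"}},'
    '"r":{"id":"144","type":"relationship","label":"on_MATERIAL",'
    '"start":{"id":"0","labels":["Paintings"]},"end":{"id":"2504","labels":["Material"]}},'
    '"n":{"type":"node","id":"2504","labels":["Material"],"properties":{"name":"oak"}}}'
)

for cypher, params in record_to_statements(json.loads(line)):
    print(cypher, params)
```

Each pair can then be passed to the driver, for example as `session.run(cypher, params)`. Whether the first label is the right one to use, and whether `id` is a good key, are exactly the modeling decisions described above.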

Answer 2

Score: 1

It looks like Neo4j can't automatically create a graph data model from a JSON file (as @cybersam pointed out earlier).

I ended up writing a Python script to do this for me. Posting this here just in case it helps someone. It does the job for me!

import json

from neo4j import GraphDatabase

# Connect to Neo4j
uri = "bolt://localhost:7687"
username = "_username_"
password = "_password_"
driver = GraphDatabase.driver(uri, auth=(username, password))

processed_painting_ids = set()  # tracks ids of "p" nodes that were already created
processed_node_ids = set()  # tracks ids of "n" nodes that were already created

# Load JSON data from file: one {"p": ..., "r": ..., "n": ...} record per line
with driver.session() as session, open("data_json.json", "r") as file:
    for line in file:
        json_data = json.loads(line)
        p_data = json_data["p"]
        r_data = json_data["r"]
        n_data = json_data["n"]

        # Handle missing values in the data
        p_props = p_data["properties"]
        p_params = {
            "id": int(p_data["id"]),
            "date": str(p_props.get("date", "Unknown date")),
            "img": p_props.get("img", "Unknown img"),
            "name": p_props.get("name", "Unknown name"),
            "sitelink": str(p_props.get("sitelink", "Unknown sitelink")),
            "description": p_props.get("description", "Unknown description"),
            "exhibit": p_props.get("exhibit", "Unknown exhibit"),
            "uri": str(p_props.get("uri", "Unknown uri")),
        }
        n_props = n_data["properties"]
        n_params = {
            "id": int(n_data["id"]),
            "name": n_props.get("name", "Unknown name"),
            "sitelink": str(n_props.get("sitelink", "Unknown sitelink")),
            "description": n_props.get("description", "Unknown description"),
            # The node's own uri, not the Bolt connection uri defined above
            "uri": n_props.get("uri", "Unknown uri"),
        }

        # Labels and relationship types cannot be query parameters, so they are
        # interpolated in backticks; every property value is passed as a parameter,
        # which keeps quotes inside the data from breaking the query.
        p_label = p_data["labels"][0]
        n_label = n_data["labels"][0]

        # Create the "n" material node
        if n_params["id"] not in processed_node_ids:
            session.run(f"CREATE (n:`{n_label}`) SET n = $props", props=n_params)
            processed_node_ids.add(n_params["id"])

        # Create the "p" node unless it has already been created
        if p_params["id"] not in processed_painting_ids:
            session.run(f"CREATE (p:`{p_label}`) SET p = $props", props=p_params)
            processed_painting_ids.add(p_params["id"])

        # Create the "r" relationship between the two nodes
        session.run(
            "MATCH (a {id: $start_id}), (b {id: $end_id}) "
            f"CREATE (a)-[r:`{r_data['label']}` {{id: $rel_id}}]->(b)",
            start_id=int(r_data["start"]["id"]),
            end_id=int(r_data["end"]["id"]),
            rel_id=int(r_data["id"]),
        )

driver.close()
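
Since the script looks nodes up by their `id` property when creating each relationship, a uniqueness constraint on `id` (which also creates an index) may be worth adding before the import. The following is a rough sketch, not part of the original answer: the helper name `constraint_statements` and the constraint naming scheme are made up for illustration, and the `FOR ... REQUIRE` syntax is the form used by recent Neo4j versions (older releases used a different syntax). Note that a label-based index only helps if the `MATCH` also names the label.

```python
import json


def constraint_statements(lines):
    """Return one uniqueness-constraint statement per node label found in the records.

    Hypothetical helper; assumes each line is a {"p": ..., "r": ..., "n": ...} record.
    """
    labels = set()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for key in ("p", "n"):
            labels.update(record[key]["labels"])
    return [
        f"CREATE CONSTRAINT `{label.lower()}_id_unique` IF NOT EXISTS "
        f"FOR (x:`{label}`) REQUIRE x.id IS UNIQUE"
        for label in sorted(labels)
    ]


# Two shortened records that share the same "Material" node.
sample_lines = [
    json.dumps({"p": {"id": "0", "labels": ["Paintings"]}, "n": {"id": "2504", "labels": ["Material"]}}),
    json.dumps({"p": {"id": "1", "labels": ["Paintings"]}, "n": {"id": "2504", "labels": ["Material"]}}),
]

for statement in constraint_statements(sample_lines):
    print(statement)
```

Each statement could be run once with `session.run(statement)` before loading the data.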

huangapple
  • Posted on June 22, 2023 at 01:49:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76525941.html