Can neo4j create the map automatically from the json file if the relationships are defined in the json file?
Question
I have a JSON file that defines the nodes and their relationships. It looks something like this:
> {"p":{"type":"node","id":"0","labels":["Paintings"],"properties":{"date":"1659-01-01T00:00:00","img":"removed-for-brevity(RFB)","name":"King Caspar","sitelink":"1","description":"RFB","exhibit":"RAB","uri":"RFB"}},"r":{"id":"144","type":"relationship","label":"on_MATERIAL","start":{"id":"0","labels":["Paintings"]},"end":{"id":"2504","labels":["Material"]}},"n":{"type":"node","id":"2504","labels":["Material"],"properties":{"name":"oak","sitelink":5,"description":"RFB","uri":"RFB"}}}
>
"p" is the first node, "r" is the relationship, "n" is the second node.
Is it possible for Neo4j to create a graph/map automatically from this JSON file, without having to define the nodes and relationships manually through Cypher?
I am fairly new to Neo4j. I tried following the examples given on the Load JSON page, but they define the nodes and their relationships manually, which I want to avoid.
Answer 1
Score: 1
No, there is no automated way, and even if there were, the generated result could be suboptimal or even wrong for your use cases.
You need to design the graph data model (node labels, relationship types, etc.) yourself. There are many considerations (like your use cases, and the necessary indexes and constraints) that are not revealed by a simple JSON data dump. Also, you need to understand the schema of the JSON and determine how to map that to your data model.
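As a starting point for understanding the JSON's schema, a short script can summarise which node labels, property keys, and relationship types actually occur in the file. The following is a minimal sketch, assuming the one-record-per-line layout shown in the question (objects with `p`, `r`, and `n` keys); the function name `summarize_schema` is hypothetical and not part of Neo4j or any library.

```python
import json
from collections import defaultdict


def summarize_schema(lines):
    """Collect labels, property keys, and relationship types from JSON lines.

    Assumes each non-empty line is an object with "p", "r", and "n" keys,
    as in the sample record from the question.
    """
    node_props = defaultdict(set)  # label -> set of property keys seen
    rel_types = set()              # (start label, relationship type, end label)

    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for key in ("p", "n"):
            node = record[key]
            for label in node["labels"]:
                node_props[label].update(node.get("properties", {}).keys())
        rel = record["r"]
        rel_types.add(
            (rel["start"]["labels"][0], rel["label"], rel["end"]["labels"][0])
        )

    return dict(node_props), rel_types
```

Calling it with an open file object (for example `summarize_schema(open("data_json.json"))`) returns the property keys per label and the set of relationship patterns, which can then inform decisions about the data model, indexes, and constraints.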
Answer 2
Score: 1
It looks like neo4j can't automatically create a graph data model using a json file (as @cybersam pointed out earlier).
I ended up writing a Python script to do this for me. Posting this here just in case it helps someone. It does the job for me!
from neo4j import GraphDatabase
import json

# Connect to Neo4j
uri = "bolt://localhost:7687"
username = "_username_"
password = "_password_"
driver = GraphDatabase.driver(uri, auth=(username, password))

processed_painting_ids = set()  # IDs of "p" (painting) nodes already created
processed_node_ids = set()      # IDs of "n" nodes already created

# Load JSON data from file, one record per line
with open("data_json.json", "r") as file:
    for line in file:
        json_data = json.loads(line)
        p_data = json_data["p"]
        r_data = json_data["r"]
        n_data = json_data["n"]
        p_unique_id = p_data.get("id")  # keeps track of the id of the "p" node

        # Handle missing values in the data
        p_id = str(p_data["id"])
        p_date = str(p_data["properties"].get("date", "Unknown date"))
        p_img = p_data["properties"].get("img", "Unknown img")
        p_name = p_data["properties"].get("name", "Unknown name")
        p_sitelink = str(p_data["properties"].get("sitelink", "Unknown sitelink"))
        p_description = p_data["properties"].get("description", "Unknown description")
        p_exhibit = p_data["properties"].get("exhibit", "Unknown exhibit")
        p_uri = str(p_data["properties"].get("uri", "Unknown uri"))
        r_id = str(r_data["id"])
        r_label = r_data["label"]
        start_id = str(r_data["start"]["id"])
        end_id = str(r_data["end"]["id"])
        n_id = str(n_data["id"])
        n_name = n_data["properties"].get("name", "Unknown name")
        n_sitelink = str(n_data["properties"].get("sitelink", "Unknown sitelink"))
        n_description = n_data["properties"].get("description", "Unknown description")
        n_uri = n_data["properties"].get("uri", "Unknown uri")

        with driver.session() as session:
            # Create the "n" material node (uses n_uri, not the connection uri)
            if n_id not in processed_node_ids:
                session.run("CREATE (n:" + n_data["labels"][0] + " {id: " + n_id + ", name: \"" + n_name + "\", sitelink: \"" + n_sitelink + "\", description: \"" + n_description + "\", uri: \"" + n_uri + "\"})")
                processed_node_ids.add(n_id)

            # Check whether the "p" node has already been created
            if p_unique_id not in processed_painting_ids:
                # Create the "p" node; sitelink is quoted so that the
                # "Unknown sitelink" fallback is still valid Cypher
                session.run("CREATE (p:" + p_data["labels"][0] + " {id: " + p_id + ", date: \"" + p_date + "\", img: \"" + p_img + "\", name: \"" + p_name + "\", sitelink: \"" + p_sitelink + "\", description: \"" + p_description + "\", exhibit: \"" + p_exhibit + "\", uri: \"" + p_uri + "\"})")
                # Add the id of the node to the set
                processed_painting_ids.add(p_unique_id)

            # Create the "r" relationship
            session.run("MATCH (start), (end) WHERE start.id = " + start_id + " AND end.id = " + end_id + " CREATE (start)-[r:" + r_label + " {id: " + r_id + "}]->(end)")

# Release the connection once the whole file has been processed
driver.close()