Loading nodes CSV to AGE with provided IDs returns "label_id must be 1 ... 65535"
Question
I have a CSV file that is not formatted the way AGE expects for loading. My task was to transform it into a new file that AGE can read and create nodes from, as specified in the documentation. To do that, I wrote a Python script that creates the new file, connects to PostgreSQL, and runs the queries. I thought this could be useful: if someone has CSV files that are not in the specified format and wants to create nodes and edges from them in AGE, a script like this could solve the problem quickly.
Here is the old CSV file (ProductsData.csv). It contains data on products that have been purchased by other users (identified by their user_id), the store where each product was purchased (identified by its store_id), and the product_id, which is the id of the node:
product_name,price,description,store_id,user_id,product_id
iPhone 12,999,"Apple iPhone 12 - 64GB, Space Gray",1234,1001,123
Samsung Galaxy S21,899,"Samsung Galaxy S21 - 128GB, Phantom Black",5678,1002,124
AirPods Pro,249,"Apple AirPods Pro with Active Noise Cancellation",1234,1003,125
Sony PlayStation 5,499,"Sony PlayStation 5 Gaming Console, 1TB",9012,1004,126
Here is the Python file:
import psycopg2
import age
import csv

def read_csv(csv_file):
    with open(csv_file, 'r') as file:
        reader = csv.reader(file)
        rows = list(reader)
    return rows

def create_csv(csv_file):
    new_header = ['id', 'product_name', 'description', 'price', 'store_id', 'user_id']
    property_order = [5, 0, 2, 1, 3, 4] # Reorder the properties accordingly.
    rows = read_csv(csv_file)
    new_csv_file = 'products.csv'
    with open(new_csv_file, 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(new_header)
        # Write each row with reordered properties.
        for row in rows[1:]:
            new_row = [row[i] for i in property_order]
            writer.writerow(new_row)
    print(f"New CSV file '{new_csv_file}' has been created with the desired format.")

def load_csv_nodes(csv_file, graph_name, conn):
    with conn.cursor() as cursor:
        try:
            cursor.execute("""LOAD 'age';""")
            cursor.execute("""SET search_path = ag_catalog, "$user", public;""")
            cursor.execute("""SELECT load_labels_from_file(%s, 'Node', %s)""", (graph_name, csv_file,))
            conn.commit()
        except Exception as ex:
            print(type(ex), ex)
            conn.rollback()

def main():
    csv_file = 'ProductsData.csv'
    create_csv(csv_file)
    new_csv_file = 'products.csv'
    GRAPH_NAME = 'csv_test_graph'
    conn = psycopg2.connect(host="localhost", port="5432", dbname="database", user="user", password="password")
    age.setUpAge(conn, GRAPH_NAME)
    path_to_csv = '/path/to/folder/' + new_csv_file
    load_csv_nodes(path_to_csv, GRAPH_NAME, conn)

main()
The generated file:
id,product_name,description,price,store_id,user_id
123,iPhone 12,"Apple iPhone 12 - 64GB, Space Gray",999,1234,1001
124,Samsung Galaxy S21,"Samsung Galaxy S21 - 128GB, Phantom Black",899,5678,1002
125,AirPods Pro,Apple AirPods Pro with Active Noise Cancellation,249,1234,1003
126,Sony PlayStation 5,"Sony PlayStation 5 Gaming Console, 1TB",499,9012,1004
But then, when running the script, it shows the following message:
<class 'psycopg2.errors.InvalidParameterValue'> label_id must be 1 .. 65535
The ids are set between 1 and 65535, and I don't understand why this error message is showing.
Answer 1
Score: 1
For how to use load_labels_from_file, please refer to the regression test file. It shows how to use all the commands.
You first need to create the Node vlabel before calling load_labels_from_file, using the following command:
SELECT create_vlabel('csv_test_graph','Node');
Then run the script as it is.
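If you would rather keep everything in the Python script, the label can be created from the same cursor before the load. The error text seems to refer to the label's internal id, not to the id column in the CSV, which would explain why it shows up when the Node label does not exist yet. Below is a minimal sketch; the helper name load_nodes_with_label is made up for illustration, and it assumes the label has not been created already (create_vlabel raises an error if it has).

```python
def load_nodes_with_label(cursor, graph_name, label_name, csv_path):
    """Create the vertex label, then load the CSV file into it.

    `cursor` is expected to be a psycopg2 cursor (anything with a
    compatible execute(sql, params) method works). This helper is a
    hypothetical sketch, not part of the AGE Python driver.
    """
    cursor.execute("LOAD 'age';")
    cursor.execute('SET search_path = ag_catalog, "$user", public;')
    # The label has to exist before load_labels_from_file is called.
    # Assumption: the label was not created earlier in this graph.
    cursor.execute("SELECT create_vlabel(%s, %s);", (graph_name, label_name))
    cursor.execute(
        "SELECT load_labels_from_file(%s, %s, %s);",
        (graph_name, label_name, csv_path),
    )
```

In the script from the question, this would replace the three cursor.execute calls inside the try block of load_csv_nodes, with conn.commit() still called afterwards.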
Answer 2
Score: 0
That line is not written correctly; you need to replace the placeholder with the actual path to the file:
path_to_csv = '/path/to/folder/' + new_csv_file
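One way to avoid a hand-typed path is to build an absolute path and check it before calling the loader. This is only a sketch: resolve_csv_path is a hypothetical helper, and the check runs on the client, so it assumes the script and the PostgreSQL server are on the same machine (the question connects to localhost). As far as I know, load_labels_from_file opens the file on the server side, so the server process also needs read access to it.

```python
from pathlib import Path

def resolve_csv_path(csv_name, folder="."):
    """Return an absolute path to the CSV, failing early if it is missing.

    Hypothetical helper for illustration; `folder` defaults to the
    current working directory, where create_csv writes products.csv.
    """
    path = (Path(folder) / csv_name).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"CSV file not found: {path}")
    return str(path)
```

In main(), path_to_csv = resolve_csv_path(new_csv_file) would then take the place of the string concatenation.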