2023-06-25 23:45:18

Loading nodes CSV to AGE with provided IDs returns "label_id must be 1 .. 65535"

Question

I have a CSV file that is not formatted the way AGE expects for loading. My task was to transform it into a new file that AGE can read to create nodes, as specified in the documentation. To do that, I wrote a Python script that creates the new file, connects to PostgreSQL, and runs the queries. I thought this could be useful to anyone who has CSV files that are not in the specified format and wants to create nodes and edges from them in AGE.

Here is the old CSV file (ProductsData.csv). It contains data on products purchased by users (identified by user_id), the store where each product was purchased (identified by store_id), and the product_id, which is the id of the node:

product_name,price,description,store_id,user_id,product_id
iPhone 12,999,"Apple iPhone 12 - 64GB, Space Gray",1234,1001,123
Samsung Galaxy S21,899,"Samsung Galaxy S21 - 128GB, Phantom Black",5678,1002,124
AirPods Pro,249,"Apple AirPods Pro with Active Noise Cancellation",1234,1003,125
Sony PlayStation 5,499,"Sony PlayStation 5 Gaming Console, 1TB",9012,1004,126

Here is the Python file:

import psycopg2
import age
import csv


def read_csv(csv_file):
    with open(csv_file, 'r') as file:
        reader = csv.reader(file)
        rows = list(reader)
    return rows


def create_csv(csv_file):
    new_header = ['id', 'product_name', 'description', 'price', 'store_id', 'user_id']
    property_order = [5, 0, 2, 1, 3, 4]  # Reorder the properties accordingly.

    rows = read_csv(csv_file)

    new_csv_file = 'products.csv'
    with open(new_csv_file, 'w', newline='') as file:
        writer = csv.writer(file)

        writer.writerow(new_header)

        # Write each row with reordered properties.
        for row in rows[1:]:
            new_row = [row[i] for i in property_order]
            writer.writerow(new_row)
    print(f"New CSV file '{new_csv_file}' has been created with the desired format.")


def load_csv_nodes(csv_file, graph_name, conn):
    with conn.cursor() as cursor:
        try:
            cursor.execute("""LOAD 'age';""")
            cursor.execute("""SET search_path = ag_catalog, "$user", public;""")
            cursor.execute("""SELECT load_labels_from_file(%s, 'Node', %s)""", (graph_name, csv_file,))
            conn.commit()

        except Exception as ex:
            print(type(ex), ex)
            conn.rollback()


def main():
    csv_file = 'ProductsData.csv'
    create_csv(csv_file)
    new_csv_file = 'products.csv'
    GRAPH_NAME = 'csv_test_graph'
    conn = psycopg2.connect(host="localhost", port="5432", dbname="database", user="user", password="password")
    age.setUpAge(conn, GRAPH_NAME)
    path_to_csv = '/path/to/folder/' + new_csv_file
    load_csv_nodes(path_to_csv, GRAPH_NAME, conn)


main()

The generated file:

id,product_name,description,price,store_id,user_id
123,iPhone 12,"Apple iPhone 12 - 64GB, Space Gray",999,1234,1001
124,Samsung Galaxy S21,"Samsung Galaxy S21 - 128GB, Phantom Black",899,5678,1002
125,AirPods Pro,Apple AirPods Pro with Active Noise Cancellation,249,1234,1003
126,Sony PlayStation 5,"Sony PlayStation 5 Gaming Console, 1TB",499,9012,1004

But then, when running the script, it shows the following message:

<class 'psycopg2.errors.InvalidParameterValue'> label_id must be 1 .. 65535

The IDs are all between 1 and 65535, so I don't understand why this error message appears.

Answer 1

Score: 1

For how to use load_labels_from_file, refer to the regress test file. It shows how each of the loading commands is used.

Before calling load_labels_from_file, you first need to create the Node vlabel with the following command:

SELECT create_vlabel('csv_test_graph','Node');

Then run the script as it is.
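
For reference, the same order of calls can be sketched from Python. This is only a sketch under a few assumptions: the graph already exists (for example after age.setUpAge in the question's script), the function name load_nodes_with_label is hypothetical, and conn is any DB-API connection such as one returned by psycopg2.connect. The SQL functions create_vlabel and load_labels_from_file are the ones named above.

```python
def load_nodes_with_label(conn, graph_name, label, csv_path):
    """Create the vertex label, then bulk-load the CSV file into it.

    Hypothetical helper: `conn` is any DB-API connection
    (for example one returned by psycopg2.connect).
    """
    with conn.cursor() as cursor:
        try:
            cursor.execute("LOAD 'age';")
            cursor.execute('SET search_path = ag_catalog, "$user", public;')
            # The label has to exist before load_labels_from_file is called.
            cursor.execute("SELECT create_vlabel(%s, %s);", (graph_name, label))
            cursor.execute(
                "SELECT load_labels_from_file(%s, %s, %s);",
                (graph_name, label, csv_path),
            )
            conn.commit()
        except Exception:
            # Undo the partial work and let the caller see the real error.
            conn.rollback()
            raise
```

In the question's script this would replace the call to load_csv_nodes, e.g. load_nodes_with_label(conn, GRAPH_NAME, 'Node', path_to_csv). As far as I know, create_vlabel raises an error if the label already exists, so the sketch is meant for a first load into a fresh label.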

Answer 2

Score: 0

That line is not written correctly; you need to replace the placeholder with the correct path:

    path_to_csv = '/path/to/folder/' + new_csv_file
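
One way to avoid a wrong path is to build it as an absolute path and check that the file exists before loading. The sketch below uses only the standard library; the helper name resolve_csv_path is hypothetical. Note the assumption that the script runs on the same machine as the database: as far as I understand, load_labels_from_file is executed by the PostgreSQL server, so the file must also be readable by the server process, which this client-side check cannot confirm.

```python
from pathlib import Path


def resolve_csv_path(folder, file_name):
    """Return the absolute path of the CSV file as a string.

    Hypothetical helper: raises FileNotFoundError when the file is missing,
    so a bad path is caught before any query is sent.
    """
    path = (Path(folder) / file_name).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"CSV file not found: {path}")
    return str(path)
```

In the question's script the path line could then become path_to_csv = resolve_csv_path('.', new_csv_file), since create_csv writes products.csv to the current working directory.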
