May 11, 2023 00:28:47

Type "vector" does not exist on postgresql - langchain

Question

I am trying to store document embeddings in PostgreSQL using the pgvector extension and LangChain, but I keep getting the following error:

(psycopg2.errors.UndefinedObject) type "vector" does not exist
LINE 4:  embedding VECTOR(1536), 
                   ^
[SQL: 
CREATE TABLE langchain_pg_embedding (
	collection_id UUID, 
	embedding VECTOR(1536), 
	document VARCHAR, 
	cmetadata JSON, 
	custom_id VARCHAR, 
	uuid UUID NOT NULL, 
	PRIMARY KEY (uuid), 
	FOREIGN KEY(collection_id) REFERENCES langchain_pg_collection (uuid) ON DELETE CASCADE
)
]

My environment info:

pgvector docker image ankane/pgvector:v0.4.1
python 3.10.6, psycopg2 2.9.6, pgvector 0.1.6

List of installed extensions on postgres

  Name   | Version |   Schema   |                Description                 
---------+---------+------------+--------------------------------------------
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
 vector  | 0.4.1   | public     | vector data type and ivfflat access method

I've tried the following to resolve it:

Installing a fresh Postgres Docker image with the pgvector extension enabled.
Manually installing the extension by following the official instructions.
Manually creating the extension in Postgres as follows:

CREATE EXTENSION IF NOT EXISTS vector
    SCHEMA public
    VERSION "0.4.1";

But no luck.

Answer 1

Score: 2

Update 17th July 2023

As I mentioned previously, my issue was elsewhere in my configuration. These are the usual causes of this error:

The pgvector extension isn't enabled in the database you are using. Make sure you run CREATE EXTENSION vector; in each database you use for storing vectors.
The schema that contains the vector type is not in the search_path. Run SHOW search_path; to see the schemas in the search path, and \dx to list the installed extensions with their schemas.

In my case, neither applied. The extension installation and the search_path were correct for the database I was supposed to use. However, the environment variable that selects the database was wrong, so the application connected to the default database postgres, which didn't have the extension enabled, instead of my own database.
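
The checks from this answer can be gathered into one small diagnostic. The following is a minimal sketch, assuming a DB-API connection such as the one returned by psycopg2.connect; the function name diagnose_vector_type and the keys of the returned dictionary are made up for illustration.

```python
def diagnose_vector_type(conn):
    """Report which database the connection uses and where pgvector lives.

    `conn` is any DB-API connection to PostgreSQL (for example one created
    with psycopg2.connect). The function name and result keys are illustrative.
    """
    cur = conn.cursor()
    try:
        # The database actually in use, which may differ from the intended one.
        cur.execute("SELECT current_database()")
        database = cur.fetchone()[0]

        # Schemas searched when a type name such as VECTOR is unqualified.
        cur.execute("SHOW search_path")
        search_path = cur.fetchone()[0]

        # Schema holding the extension in this database, if it is enabled here.
        cur.execute(
            "SELECT n.nspname FROM pg_extension e "
            "JOIN pg_namespace n ON n.oid = e.extnamespace "
            "WHERE e.extname = 'vector'"
        )
        row = cur.fetchone()
    finally:
        cur.close()

    return {
        "database": database,
        "search_path": search_path,
        "extension_schema": row[0] if row else None,
    }
```

If database is not the one you expected, the connection settings are the problem. If extension_schema is None, the extension is not enabled in that database. If it is set but does not appear in search_path, the type name cannot be resolved without a schema prefix.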

Answer 2

Score: 0

I ran into this issue as well. When I connect to the database directly with psycopg2 and execute the following SQL statement, it runs without any problem:

cur.execute('''
CREATE TABLE langchain_pg_embedding (
    uuid UUID NOT NULL,
    collection_id UUID,
    embedding VECTOR,
    document VARCHAR,
    cmetadata JSON,
    custom_id VARCHAR,
    PRIMARY KEY (uuid))
''')

However, when I use LangChain, I get an error stating that the data type does not exist.

Just now, I resolved this issue by setting a permanent search path for the database.

ALTER DATABASE postgres SET SEARCH_PATH TO postgres_schema;

Here, postgres is the name of the current database and postgres_schema is the schema to use as the search path.
The command changes the schema search path persistently at the database level. It takes effect for new sessions, so existing connections must reconnect.
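
An alternative to changing the database default is to set the search path per connection. psycopg2.connect passes its options keyword through to libpq, so "-c search_path=..." applies to that session only. The following is a minimal sketch; the helper name connection_kwargs and the schema names in the usage note are hypothetical.

```python
def connection_kwargs(dbname, schemas):
    """Build keyword arguments for psycopg2.connect with a session search_path.

    `schemas` is an ordered list of schema names. Names containing spaces or
    commas would need extra quoting, which this sketch does not handle.
    """
    search_path = ",".join(schemas)
    return {
        "dbname": dbname,
        # libpq forwards this to the server as a run-time setting.
        "options": f"-c search_path={search_path}",
    }
```

It could then be used as psycopg2.connect(**connection_kwargs("postgres", ["postgres_schema", "public"])). Listing public as well keeps the vector type visible when the extension was created in public, as in the question.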

Answer 3

Score: 0

I resolved the issue by building and installing pgvector from source:

cd /tmp
git clone --branch v0.4.4 https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install

Then, in psql, connected to the target database:

CREATE EXTENSION vector;
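
Installing the extension files on the server and enabling the extension in a database are separate steps, and the pg_available_extensions view shows both. The following is a minimal sketch that reports the state for the current database, assuming a DB-API connection such as one from psycopg2.connect; the function name vector_extension_status and the returned messages are made up for illustration.

```python
def vector_extension_status(conn):
    """Describe whether pgvector is installed on the server and enabled here."""
    cur = conn.cursor()
    try:
        cur.execute(
            "SELECT default_version, installed_version "
            "FROM pg_available_extensions WHERE name = 'vector'"
        )
        row = cur.fetchone()
    finally:
        cur.close()

    if row is None:
        # The server has no pgvector files, so CREATE EXTENSION would fail.
        return "not available on this server"
    default_version, installed_version = row
    if installed_version is None:
        # Files are present, but CREATE EXTENSION has not been run in this database.
        return f"available ({default_version}) but not enabled in this database"
    return f"enabled, version {installed_version}"
```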

Answer 4

Score: 0

LangChain uses two tables, and only one of them uses VECTOR. If, while you are configuring the application, one table gets created in one schema and the other in a different schema, that will cause this error as well.

Delete (or move) the LangChain tables from both your schema and public, then restart the application once the settings are stable. The tables should then be created correctly.

langchain_pg_collection - plain table
langchain_pg_embedding - has a vector column, is created second, and has a foreign key (langchain_pg_embedding_collection_id_fkey) to langchain_pg_collection
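
To see whether the two tables ended up in different schemas, their locations can be read from information_schema.tables. The following is a minimal sketch, assuming a psycopg2 connection (the %s placeholders follow psycopg2's parameter style); the function name find_langchain_tables is made up for illustration.

```python
LANGCHAIN_TABLES = ("langchain_pg_collection", "langchain_pg_embedding")


def find_langchain_tables(conn):
    """Map each LangChain table name to the list of schemas containing it."""
    cur = conn.cursor()
    try:
        cur.execute(
            "SELECT table_schema, table_name FROM information_schema.tables "
            "WHERE table_name IN (%s, %s) "
            "ORDER BY table_schema, table_name",
            LANGCHAIN_TABLES,
        )
        rows = cur.fetchall()
    finally:
        cur.close()

    locations = {}
    for schema, table in rows:
        locations.setdefault(table, []).append(schema)
    return locations
```

A result in which the two tables list different schemas, or one table appears in more than one schema, matches the situation described above.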

Answer 5

Score: 0

I solved a similar problem by having the Docker container run an init.sql that contains CREATE EXTENSION when the container is initialized.

The relevant docker-compose snippet:

volumes:
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

init.sql:

CREATE EXTENSION vector;
