Your previews will never be the same: Create database branches with anonymized PII in seconds.

The pgvector extension

Use the pgvector for vector similarity search in Postgres

The pgvector extension enables you to store vector embeddings and perform vector similarity search in Postgres. It is particularly useful for applications involving natural language processing, such as those built on top of OpenAI's GPT models.

pgvector supports:

  • Exact and approximate nearest neighbor search
  • Single-precision, half-precision, binary, and sparse vectors
  • L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance
  • Any language with a Postgres client
  • ACID compliance, point-in-time recovery, JOINs, and all other features of Postgres

This topic describes how to enable the pgvector extension in Neon and how to create, store, and query vectors.

Try it on Neon!

Neon is Serverless Postgres built for the cloud. Explore Postgres features and functions in our user-friendly SQL editor. Sign up for a free account to get started.

Sign Up

Enable the pgvector extension

You can enable the pgvector extension by running the following CREATE EXTENSION statement in the Neon SQL Editor or from a client such as psql that is connected to Neon.

CREATE EXTENSION vector;

For information about using the Neon SQL Editor, see Query with Neon's SQL Editor. For information about using the psql client with Neon, see Connect with psql.

Create a table to store vectors

To create a table for storing vectors, use the following SQL command, adjusting the dimensions as needed.

CREATE TABLE items (
  id BIGSERIAL PRIMARY KEY,
  embedding VECTOR(3)
);

The command generates a table named items with an embedding column capable of storing vectors with 3 dimensions. OpenAI's text-embedding-ada-002 model supports 1536 dimensions for each piece of text, which creates more accurate embeddings for natural language processing tasks. For more information about embeddings, see Embeddings, in the OpenAI documentation.

Storing vectors and embeddings

After you have generated an embedding using a service like the OpenAI API, you can store the resulting vector in your database. Using a Postgres client library in your preferred programming language, you can execute an INSERT statement similar to the following to store embeddings.

This command inserts two new rows into the items table with the provided embeddings.

INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

This command loads vectors in bulk using the COPY command:

COPY items (embedding) FROM STDIN WITH (FORMAT BINARY);

For a Python script example, see bulk_loading.py.

This command how how to upserts vectors:

INSERT INTO items (id, embedding) VALUES (1, '[1,2,3]'), (2, '[4,5,6]')
    ON CONFLICT (id) DO UPDATE SET embedding = EXCLUDED.embedding;

This command shows how to update vectors:

UPDATE items SET embedding = '[1,2,3]' WHERE id = 1;

This command shows how to delete vectors:

DELETE FROM items WHERE id = 1;

Querying vectors

To retrieve vectors and calculate similarity, use SELECT statements and the built-in vector operators. For instance, you can find the top 5 most similar items to a given embedding using the following query:

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

This query computes the Euclidean distance (L2 distance) between the given vector and the vectors stored in the items table, sorts the results by the calculated distance, and returns the top 5 most similar items.

Supported distance functions include:

  • <-> - L2 distance
  • <#> - (negative) inner product
  • <=> - cosine distance
  • <+> - L1 distance (added in 0.7.0)

note

The inner product operator (<#>) returns the negative inner product since Postgres only supports ASC order index scans on operators.

Get the nearest neighbors to a row:

SELECT * FROM items WHERE id != 1 ORDER BY embedding <-> (SELECT embedding FROM items WHERE id = 1) LIMIT 5;

Get rows within a certain distance:

SELECT * FROM items WHERE embedding <-> '[3,1,2]' < 5;```

note

Combine with ORDER BY and LIMIT to use an index

Get the distance:

SELECT embedding <-> '[3,1,2]' AS distance FROM items;

For inner product, multiply by -1 (since <#> returns the negative inner product):

SELECT (embedding <#> '[3,1,2]') * -1 AS inner_product FROM items;

For cosine similarity, use 1 - cosine distance:

SELECT 1 - (embedding <=> '[3,1,2]') AS cosine_similarity FROM items;

To average vectors:

SELECT AVG(embedding) FROM items;

To average groups of vectors:

SELECT category_id, AVG(embedding) FROM items GROUP BY category_id;

Indexing vectors

By default, pgvector performs exact nearest neighbor search, providing perfect recall. Adding an index on the vector column can improve query performance with a minor cost in recall.

Supported index types include:

HNSW

An HNSW index creates a multilayer graph. It has better query performance than IVFFlat (in terms of speed-recall tradeoff), but has slower build times and uses more memory. Also, an index can be created without any data in the table since there isn’t a training step like IVFFlat.

The following examples show how to add an HNSW index for the supported distance functions. The supported types include:

  • vector - up to 2,000 dimensions
  • halfvec - up to 4,000 dimensions (added in 0.7.0)
  • bit - up to 64,000 dimensions (added in 0.7.0)
  • sparsevec - up to 1,000 non-zero elements (added in 0.7.0)

L2 distance

CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);

note

Use halfvec_l2_ops for halfvec and sparsevec_l2_ops for sparsevec (and similar for the other distance functions).

Inner product

CREATE INDEX ON items USING hnsw (embedding vector_ip_ops);

Cosine distance

CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);

L1 distance - added in 0.7.0

CREATE INDEX ON items USING hnsw (embedding vector_l1_ops);

Hamming distance - added in 0.7.0

CREATE INDEX ON items USING hnsw (embedding bit_hamming_ops);

Jaccard distance - added in 0.7.0

CREATE INDEX ON items USING hnsw (embedding bit_jaccard_ops);

HNSW index options

  • m - the max number of connections per layer (16 by default)
  • ef_construction - the size of the dynamic candidate list for constructing the graph (64 by default)

This example demonstrates how to set the parameters:

CREATE INDEX ON items USING hnsw (embedding vector_l2_ops) WITH (m = 16, ef_construction = 64);

A higher value of ef_construction provides better recall at the cost of index build time and insert speed.

HNSW query options

You can specify the size of the dynamic candidate list for search. The size is 40 by default.

SET hnsw.ef_search = 100;

A higher value provides better recall at the cost of speed.

This query shows how to use SET LOCAL inside a transaction to set ef_search for a single query:

BEGIN;
SET LOCAL hnsw.ef_search = 100;
SELECT ...
COMMIT;

IVFFlat

An IVFFlat index divides vectors into lists and searches a subset of those lists that are closest to the query vector. It has faster build times and uses less memory than HNSW, but has lower query performance with respect to the speed-recall tradeoff.

Keys to achieving good recall include:

  • Creating the index after the table has some data
  • Choosing an appropriate number of lists. A good starting point is rows/1000 for up to 1M rows and sqrt(rows) for over 1M rows.
  • Specify an appropriate number of probes when querying. A higher number is better for recall, and a lower is better for speed. A good starting point is sqrt(lists).

Supported types include:

  • vector - up to 2,000 dimensions
  • halfvec - up to 4,000 dimensions (added in 0.7.0)
  • bit - up to 64,000 dimensions (added in 0.7.0)

The following examples show how to add an index for each distance function:

L2 distance

CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);

note

Use halfvec_l2_ops for halfvec (and similar with the other distance functions).

Inner product

CREATE INDEX ON items USING ivfflat (embedding vector_ip_ops) WITH (lists = 100);

Cosine distance

CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

Hamming distance - added in 0.7.0

CREATE INDEX ON items USING ivfflat (embedding bit_hamming_ops) WITH (lists = 100);

IVFFlat query options

You can specify the number of probes, which is 1 by default.

SET ivfflat.probes = 10;

A higher value provides better recall at the cost of speed. You can set the value to the number of lists for exact nearest neighbor search, at which point the planner won’t use the index.

You can also use SET LOCAL inside a transaction to set the number of probes for a single query:

BEGIN;
SET LOCAL ivfflat.probes = 10;
SELECT ...
COMMIT;

Optimizing index build time

To optimize index build time, consider configuring the following session variables prior to building an index:

maintenance_work_mem

In Postgres, the maintenance_work_mem setting determines the maximum memory allocation for tasks such as CREATE INDEX. The default maintenance_work_mem value in Neon is set according to your Neon compute size:

Compute Units (CU)vCPURAMmaintenance_work_mem
0.250.251 GB64 MB
0.500.502 GB64 MB
114 GB67 MB
228 GB134 MB
3312 GB201 MB
4416 GB268 MB
5520 GB335 MB
6624 GB402 MB
7728 GB470 MB
8832 GB537 MB

To optimize pgvector index build time, you can increase the maintenance_work_mem setting for the current session with a command similar to the following:

SET maintenance_work_mem='10 GB';

The recommended setting is your working set size (the size of your tuples for vector index creation). However, your maintenance_work_mem setting should not exceed 50 to 60 percent of your compute's available RAM (see the table above). For example, the maintenance_work_mem='10 GB' setting shown above has been successfully tested on a 7 CU compute, which has 28 GB of RAM, as 10 GiB is less than 50% of the RAM available for that compute size.

max_parallel_maintenance_workers

The max_parallel_maintenance_workers sets the maximum number of parallel workers that can be started by a single utility command such as CREATE INDEX. By default, the max_parallel_maintenance_workers setting is 2. For efficient parallel index creation, you can increase this setting. Parallel workers are taken from the pool of processes established by max_worker_processes (10), limited by max_parallel_workers (8).

You can increase the maintenance_work_mem setting for the current session with a command similar to the following:

SET max_parallel_maintenance_workers = 7

For example, if you have a 7 CU compute size, you could set max_parallel_maintenance_workers to 7, before index creation, to make use of all of the vCPUs available.

Differences in behaviour between pgvector 0.5.1 and 0.7.0

Differences in behavior in the following corner cases were found during our testing of pgvector 0.7.0:

Distance between a valid and NULL vector

The distance between a valid and NULL vector (NULL::vector) with pgvector 0.7.0 differs from pgvector 0.5.1 when using an HNSW or IVFFLAT index, as shown in the following examples:

HNSW

For the following script, comparing the NULL::vector to non-null vectors the resulting output changes:

SET enable_seqscan = off;

CREATE TABLE t (val vector(3));
INSERT INTO t (val) VALUES ('[0,0,0]'), ('[1,2,3]'), ('[1,1,1]'), (NULL);
CREATE INDEX ON t USING hnsw (val vector_l2_ops);

INSERT INTO t (val) VALUES ('[1,2,4]');

SELECT * FROM t ORDER BY val <-> (SELECT NULL::vector);

pgvector 0.7.0 output:

val   
---------
 [1,1,1]
 [1,2,4]
 [1,2,3]
 [0,0,0]

pgvector 0.5.1 output:

val   
---------
 [0,0,0]
 [1,1,1]
 [1,2,3]
 [1,2,4]

IVFFLAT

For the following script, comparing the NULL::vector to non-null vectors the resulting output changes:

SET enable_seqscan = off;

CREATE TABLE t (val vector(3));
INSERT INTO t (val) VALUES ('[0,0,0]'), ('[1,2,3]'), ('[1,1,1]'), (NULL);
CREATE INDEX ON t USING ivfflat (val vector_l2_ops) WITH (lists = 1);

INSERT INTO t (val) VALUES ('[1,2,4]');

SELECT * FROM t ORDER BY val <-> (SELECT NULL::vector);

pgvector 0.7.0 output:

val   
---------
 [0,0,0]
 [1,2,3]
 [1,1,1]
 [1,2,4]

pgvector 0.5.1 output:

val   
---------
[0,0,0]
[1,1,1]
[1,2,3]
[1,2,4]

Error messages improvement for invalid literals

If you use an invalid literal value for the vector data type, you will now see the following error message:

SELECT '[4e38,1]'::vector;
ERROR:  "4e38" is out of range for type vector
LINE 1: SELECT '[4e38,1]'::vector;

Resources

pgvector source code: https://github.com/pgvector/pgvector

Need help?

Join our Discord Server to ask questions or see what others are doing with Neon. Users on paid plans can open a support ticket from the console. For more detail, see Getting Support.

Last updated on

Edit this page
Was this page helpful?