Skip to content
pg_textsearch

pg_textsearch

pg_textsearch : Full-text search with BM25 ranking

Overview

IDExtensionPackageVersionCategoryLicenseLanguage
2180
pg_textsearch
pg_textsearch
1.2.0
FTS
PostgreSQL
C
AttributeHas BinaryHas LibraryNeed LoadHas DDLRelocatableTrusted
--sLd--
No
Yes
Yes
Yes
no
no
Relationships
See Also
pg_search
pgroonga
pg_bigm
zhparser
pg_trgm
rum
biscuit
fuzzystrmatch

bm25 am conflicts with pg_search and vchord_bm25

Packages

TypeRepoVersionPG Major CompatibilityPackage PatternDependencies
EXT
PIGSTY
1.2.0
18
17
16
15
14
pg_textsearch-
RPM
PIGSTY
1.2.0
18
17
16
15
14
pg_textsearch_$v-
DEB
PIGSTY
1.2.0
18
17
16
15
14
postgresql-$v-textsearch-
Linux / PGPG18PG17PG16PG15PG14
el8.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
el8.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
el9.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
el9.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
el10.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
el10.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
d12.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
d12.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
d13.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
d13.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
u22.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
u22.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
u24.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
u24.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
u26.x86_64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
u26.aarch64
PIGSTY 1.2.0
PIGSTY 1.2.0
MISS
MISS
MISS
PackageVersionOSORGSIZEFile URL
pg_textsearch_181.2.0el8.x86_64pigsty127.5 KiBpg_textsearch_18-1.2.0-1PIGSTY.el8.x86_64.rpm
pg_textsearch_181.2.0el8.aarch64pigsty121.9 KiBpg_textsearch_18-1.2.0-1PIGSTY.el8.aarch64.rpm
pg_textsearch_181.2.0el9.x86_64pigsty118.6 KiBpg_textsearch_18-1.2.0-1PIGSTY.el9.x86_64.rpm
pg_textsearch_181.2.0el9.aarch64pigsty115.7 KiBpg_textsearch_18-1.2.0-1PIGSTY.el9.aarch64.rpm
pg_textsearch_181.2.0el10.x86_64pigsty121.6 KiBpg_textsearch_18-1.2.0-1PIGSTY.el10.x86_64.rpm
pg_textsearch_181.2.0el10.aarch64pigsty117.4 KiBpg_textsearch_18-1.2.0-1PIGSTY.el10.aarch64.rpm
postgresql-18-textsearch1.2.0d12.x86_64pigsty1.0 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~bookworm_amd64.deb
postgresql-18-textsearch1.2.0d12.aarch64pigsty1.0 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~bookworm_arm64.deb
postgresql-18-textsearch1.2.0d13.x86_64pigsty1.0 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~trixie_amd64.deb
postgresql-18-textsearch1.2.0d13.aarch64pigsty1.0 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~trixie_arm64.deb
postgresql-18-textsearch1.2.0u22.x86_64pigsty1.1 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~jammy_amd64.deb
postgresql-18-textsearch1.2.0u22.aarch64pigsty1.1 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~jammy_arm64.deb
postgresql-18-textsearch1.2.0u24.x86_64pigsty1.1 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~noble_amd64.deb
postgresql-18-textsearch1.2.0u24.aarch64pigsty1.1 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~noble_arm64.deb
postgresql-18-textsearch1.2.0u26.x86_64pigsty1.1 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~resolute_amd64.deb
postgresql-18-textsearch1.2.0u26.aarch64pigsty1.1 MiBpostgresql-18-textsearch_1.2.0-1PIGSTY~resolute_arm64.deb

Source

pig build pkg pg_textsearch;		# build rpm/deb

Install

Make sure PGDG and PIGSTY repo available:

pig repo add pgsql -u   # add both repo and update cache

Install this extension with pig:

pig install pg_textsearch;		# install via package name, for the active PG version

pig install pg_textsearch -v 18;   # install for PG 18
pig install pg_textsearch -v 17;   # install for PG 17

Config this extension to shared_preload_libraries:

shared_preload_libraries = 'pg_textsearch';

Create this extension with:

CREATE EXTENSION pg_textsearch;

Usage

Sources: README v1.2.0, v1.2.0 release notes

pg_textsearch provides BM25-ranked full-text search for PostgreSQL with a bm25 access method and the <@> scoring operator. Upstream marks v1.2.0 as production ready.

v1.2.0 supports PostgreSQL 17 and 18. Prebuilt release assets are published for both PostgreSQL versions on Linux and macOS. The extension must be loaded through shared_preload_libraries before CREATE EXTENSION.

Enable the Extension

shared_preload_libraries = 'pg_textsearch'  # add to any existing list
CREATE EXTENSION pg_textsearch;

Install the new binary and restart PostgreSQL before running an extension upgrade:

ALTER EXTENSION pg_textsearch UPDATE;

The v1.2.0 release adds physical replication support and correctness fixes for update-heavy workloads. Install the matching binary and run the SQL extension upgrade before relying on the new version.

Build and Query BM25 Indexes

CREATE TABLE documents (id bigserial PRIMARY KEY, content text);

CREATE INDEX docs_idx
ON documents USING bm25(content)
WITH (text_config = 'english');

SELECT *
FROM documents
ORDER BY content <@> 'database system'
LIMIT 5;

<@> returns the negative BM25 score because PostgreSQL operator index scans are ascending; lower values are better matches. Use ORDER BY ... LIMIT for fast top-k searches.

For an explicit index reference, use to_bm25query():

SELECT *
FROM documents
ORDER BY content <@> to_bm25query('database system', 'docs_idx')
LIMIT 5;

The main documented SQL surface is:

  • text <@> 'query' to score text with planner-detected index context.
  • text <@> bm25query to score with an explicit bm25query.
  • to_bm25query(text) for ORDER BY use with planner-selected index context.
  • to_bm25query(text, text) for query text plus index name.
  • bm25query = bm25query for equality checks.

Index Options and Data Shapes

CREATE INDEX ON documents USING bm25(content)
WITH (text_config = 'english', k1 = 1.5, b = 0.8);

Index options are text_config (required), k1 (default 1.2), and b (default 0.75). Text search configurations such as english, simple, french, and german use PostgreSQL text search configuration names.

The extension supports native array input for text[], varchar[], and bpchar[] columns; array elements are concatenated before tokenization.

CREATE TABLE posts (id serial PRIMARY KEY, tags text[]);
CREATE INDEX posts_tags_idx ON posts USING bm25(tags)
WITH (text_config = 'english');

SELECT *
FROM posts
ORDER BY tags <@> 'database'
LIMIT 10;

Expression indexes support immutable text expressions, including JSONB extraction, text transformations, and multi-column concatenation:

CREATE INDEX events_msg_idx ON events USING bm25 ((data->>'message'))
WITH (text_config = 'english');

SELECT *
FROM events
ORDER BY (data->>'message') <@> to_bm25query('network error', 'events_msg_idx')
LIMIT 10;

Partial indexes scope search to a subset of rows. Query them with an explicit index name:

CREATE INDEX docs_en_idx ON docs USING bm25(content)
WITH (text_config = 'english')
WHERE lang = 'en';

SELECT *
FROM docs
WHERE lang = 'en'
ORDER BY content <@> to_bm25query('databases', 'docs_en_idx')
LIMIT 10;

Operations and GUCs

SELECT bm25_force_merge('docs_idx');
SELECT * FROM bm25_memory_usage();

bm25_force_merge(index_name) consolidates all segments into one and is best used after bulk loads, not during steady write traffic. bm25_memory_usage() reports shared memory usage for memtables.

Documented pg_textsearch GUCs in v1.2.0 include:

  • pg_textsearch.default_limit
  • pg_textsearch.compress_segments
  • pg_textsearch.segments_per_level
  • pg_textsearch.memory_limit
  • pg_textsearch.bulk_load_threshold
  • pg_textsearch.memtable_spill_threshold (deprecated; use memory_limit for new deployments)

pg_textsearch.memory_limit defaults to 2GB and caps dynamic shared memory used by memtables. The README also documents bm25_spill_index(index_name), bm25_dump_index(index_name), and bm25_summarize_index(index_name) as development or diagnostic helpers.

Caveats

  • pg_textsearch requires shared_preload_libraries = 'pg_textsearch' and a PostgreSQL restart before CREATE EXTENSION.
  • The bm25 access method name conflicts with pg_search and vchord_bm25; avoid installing those BM25 access-method extensions into the same database.
  • Inside PL/pgSQL and stored procedures, the implicit text <@> 'query' form does not use planner hooks; upstream says to use to_bm25query() with an explicit index name there.
  • Phrase queries are not native because the index stores term frequencies, not term positions; use BM25 ranking plus a post-filter for phrase-like matching.
  • Partial indexes require to_bm25query() with the index name because the implicit query form skips them.
  • BM25 indexes on partitioned tables use partition-local statistics, so cross-partition scores may not be directly comparable.
  • Words longer than PostgreSQL’s tsvector word length limit are ignored during tokenization.
  • pg_textsearch uses fixed LWLock tranche IDs 1001-1008; another extension using the same fixed IDs can make wait-event names inaccurate.
Last updated on