Choosing the right database index type

Indexes give the database server a much faster way to lookup and retrieve specific rows. Postgres offers several different types of indexes, each of which performs best for certain types of WHERE clauses.

Standard comparisons (B-Tree Index)

Postgres will default to using a B-Tree for an index that doesn’t specify the index type.

--This index is created using a B-Tree Index
CREATE INDEX email_idx ON users (email);

Most of our day to day querying relies on comparison operators and the Postgres default works great for these. Comparison operators include:

  • =, <>, !=
  • <, >, <=, >=
  • DISTINCT, NOT DISTINCT
  • IS NULL, IS NOT NULL

If you’re creating an index for a column expected to make heavy use of these operators, you’re all set. Thanks, Postgres!

Inclusion (B-Tree Index)

As with standard comparisons, the default B-Tree index type works great for WHERE clauses checking for inclusion with INCLUDES or BETWEEN ... AND ....

Pattern matching with LIKE/ILIKE (GIN Index)

A lot of web apps have a search feature and a lot of search features get implemented using LIKE or its case-insensitive sibling ILIKE. In most of these queries you’re likely using an “unanchored” expression where both the right and left edges of the search string are variable.

--This LIKE expression is "unanchored"
SELECT * FROM films WHERE title LIKE '%query%';

--Compared to this one, which is "left anchored"
SELECT * FROM films WHERE title LIKE 'query%';

--Or to this one, which is "right anchored"
SELECT * FROM films WHERE title LIKE '%query';

If you’re using right anchored or unanchored LIKE/ILIKE expressions you’ll benefit greatly from a GIN index that uses a trigram operator class. This requires a bit more work than creating a run of the mill index.

First, you’ll need to include the trigram module (pg_trgm) in your database if it isn’t already. Then you can create a GIN index for your column that specifies the trigram operator classes from that module.

--Install the pg_trgm module if it isn't already
CREATE EXTENSION IF NOT EXISTS pg_trgm;

--Create a GIN index specifying the trigram operator class (`gin_trgm_ops`)
CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);

If you’re certain you’re only using left anchored expressions, e.g. LIKE 'query%', then you should be all right sticking with the B-Tree defaults so long as you’re running your database with the C locale.

Pattern matching using SIMILAR TO, NOT SIMILAR TO, ~, ~* and regex friends (GIN Index)

Using regular expressions to find matches? Everything that applies to the LIKE/ILIKE queries applies to you, too. If your regular expressions are left anchored you can use the defaults, but if you’re using unanchored expressions, opt for a GIN index.

--These are left anchored expression
SELECT * FROM films WHERE title SIMILAR TO 'query%';
SELECT * FROM films WHERE title ~ '^query';

--While these are unanchored expressions
SELECT * FROM films WHERE title SIMILAR TO '%query%';
SELECT * FROM films WHERE title ~ 'query';

--Create a GIN index specifying the trigram operator class (`gin_trgm_ops`)
CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);

Similarity using similarity() and its operators including %, <%, %> and others (GIN)

Where SIMILAR TO and other regex clauses return either true or false, the similarity() function provided by the pg_trgm module allows for a more qualified assessment of matches. This function will tell you how similar two strings are on a scale that goes from 0.0 (totally dissimilar) to 1.0 (completely similar). The similarity() function uses trigrams to determine these ratings and benefits dramatically from adding a GIN index. As with LIKE and our regex friends, your GIN index should specify the gin_trgm_ops operator class.

--Create a GIN index specifying the trigram operator class (`gin_trgm_ops`)
CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);

What’s next

The suggestions above will handle the majority of scenarios you will see in the wild. For niche situations, Postgres also offers several other index types such as BRIN and HASH. There are also some interesting edge-cases where a GiST can outperform GIN. The Postgres index docs are a great place to continue your optimization journey.