Version: 1.0.16

ltree

mdx: format: md

This module implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure. Extensive facilities for searching through label trees are also provided.

This module is considered "trusted", meaning it can be installed by non-superusers who have CREATE privilege on the current database.

1. Definitions

A label is a sequence of alphanumeric characters and underscores (for example, in the C locale, the characters A-Za-z0-9_ are allowed).

Label length must be less than 256 characters.

Examples: 42, Personal_Services

A label path is a sequence of zero or more labels separated by dots, for example L1.L2.L3, representing a path from the root of a hierarchical tree to a particular node. The length of a label path must not exceed 65535 labels.

Examples: Top.Countries.Europe.Russia

The ltree module provides several data types:

• ltree stores a label path.

• lquery represents a regular-expression-like pattern for matching ltree values. A simple word matches that label in a path. A star (*) matches zero or more labels. These can be joined with dots to form a pattern that must match the entire label path. For example:

foo matches exactly the label path foo

.foo. matches any label path containing the label foo

*.foo matches any label path whose last label is foo

Both stars and simple words can be quantified to restrict how many labels they can match:

*{n} matches exactly n labels

*{n,} matches at least n labels

*{n,m} matches at least n but not more than m labels

*{,m} matches at most m labels — same as *{0,m}

foo{n,m} matches at least n but not more than m consecutive foo

foo{,} matches any number of consecutive foo, including zero

In the absence of any explicit quantifier, the default for a star is to match any number of labels (i.e., {,}), while the default for a non-star item is to match exactly once (i.e., {1}).

There are several modifiers that can be placed at the end of a non-star lquery item to make it match more than just an exact match:

@ performs case-insensitive matching, e.g., a@ matches A

matches any label with this prefix, e.g., foo* matches foobar

% matches words separated by underscores at the beginning

The behavior of % is somewhat complex. It tries to match words rather than the entire label. For example, foo_bar% matches foo_bar_baz but not foo_barbaz. If combined with , prefix matching can be applied to each word individually, e.g., foo_bar% matches foo1_bar2_baz but not foo1_br2_baz.

In addition, you can write multiple possibly modified non-star items separated by | (OR) to match any of those items, and you can put ! (NOT) at the beginning of a non-star group to match any label not matching those alternatives. Quantifiers, if any, go at the end of the group; they mean some number of matches for the group as a whole (that is, some number of labels matching or not matching any alternative).

Here is an example of an lquery:

Top.{0,2}.sport@.!football|tennis{1,}.Russ*|Spain

a. b. c. d. e.

This query will match any label path that:

a. begins with the label Top

b. then has 0 to 2 labels

c. followed by a label beginning with the case-insensitive prefix sport

d. then has one or more labels, none matching football or tennis

e. and ends with a label beginning with Russ, or a label exactly matching Spain

• ltxtquery represents a full-text-search-like pattern for matching ltree values. An ltxtquery value contains words, possibly with the modifiers @, *, % at the end, which have the same meanings as in lquery. Words can be combined using & (AND), | (OR), ! (NOT), and parentheses. The key difference between lquery and ltxtquery is that the former matches words without regard to their position in the label path.

Here is an example of an ltxtquery:

Europe & Russia*@ & !Transportation

This will match paths containing the label Europe and any label beginning with Russia (case-insensitive), but not paths containing the label Transportation. The position of these words in the path is not important. Also, when using %, the word can match any underscore-separated word within a label, regardless of its position.

Note: ltxtquery allows whitespace between symbols, but ltree and lquery do not.

2. Operators and Functions

Type ltree has the usual comparison operators =, <>, <, >, <=, >=. Comparisons are sorted in tree traversal order, with the children of a node sorted by label text. In addition, there are the special operators shown in Table C.13.

Table C.13. ltree Operators

Operator/Description
ltree @> ltree → boolean Left argument is an ancestor of the right argument (or equal)
ltree <@ ltree → boolean Left argument is a descendant of the right argument (or equal)
ltree ~ lquery → boolean lquery ~ ltree → boolean ltree matches lquery
ltree ? lquery[] → boolean lquery[] ? ltree → boolean ltree matches any lquery in the array
ltree @ ltxtquery → boolean ltxtquery @ ltree → boolean ltree matches ltxtquery
ltree [] @> ltree → boolean Array contains an ancestor of ltree
ltree [] <@ ltree → boolean Array contains a descendant of ltree
ltree[] @> ltree → boolean ltree <@ ltree[] → boolean Array contains an ancestor of ltree
ltree[] <@ ltree → boolean ltree @> ltree[] → boolean Array contains a descendant of ltree
ltree[] ~ lquery → boolean lquery ~ ltree[] → boolean Array contains any path matching lquery
ltree[] ? lquery[] → boolean lquery[] ? ltree[] → boolean Array contains any path matching any lquery
ltree[] @ ltxtquery → boolean ltxtquery @ ltree[] → boolean Array contains any path matching ltxtquery
ltree[] ?@> ltree → ltree Returns the first array entry that is an ancestor of ltree, or NULL if none
ltree[] ?<@ ltree → ltree Returns the first array entry that is a descendant of ltree, or NULL if none
ltree[] ?~ lquery → ltree Returns the first array entry matching lquery, or NULL if none
ltree[] ?@ ltxtquery → ltree Returns the first array entry matching ltxtquery, or NULL if none

The operators <@, @>, @, and ~ have analogous operators ^<@, ^@>, ^@, ^~ that do not use indexes. They are useful only for testing purposes.

The available functions are listed in Table C.14.

Table C.14. ltree Functions

Function/Description/Example
subltree ( ltree, start integer, end integer ) → ltree Returns the subpath of ltree from position start to position end-1 (counting from 0). subltree('Top.Child1.Child2', 1, 2) → Child1
subpath ( ltree, offset integer, len integer ) → ltree Returns the subpath of ltree starting at position offset with length len. If offset is negative, the subpath starts from that many labels from the end of the path. If len is negative, that many labels are left off the end of the path. subpath('Top.Child1.Child2', 0, 2) → Top.Child1
subpath ( ltree, offset integer ) → ltree Returns the subpath of ltree starting at position offset, extending to the end of the path. If offset is negative, the subpath starts from that many labels from the end of the path. subpath('Top.Child1.Child2', 1) → Child1.Child2
nlevel ( ltree ) → integer Returns the number of labels in the path. nlevel('Top.Child1.Child2') → 3
index ( a ltree, b ltree ) → integer Returns the position of the first occurrence of b in a, or -1 if not found. index('0.1.2.3.5.4.5.6.8.5.6.8', '5.6') → 6
index ( a ltree, b ltree, offset integer ) → integer Returns the position of the first occurrence of b in a, or -1 if not found. The search starts at position offset; a negative offset means starting -offset labels from the end of the path. index('0.1.2.3.5.4.5.6.8.5.6.8', '5.6', -4) → 9
text2ltree ( text ) → ltree Converts text to ltree ltree2text ( ltree ) → text Converts ltree to text
lca ( ltree [, ltree [, ... ]] ) → ltree Computes the longest common ancestor of the paths (supports up to 8 arguments). lca('1.2.3', '1.2.3.4.5.6') → 1.2
lca ( ltree[] ) → ltree Computes the longest common ancestor of the paths in the array. lca(array['1.2.3'::ltree,'1.2.3.4']) → 1.2

3. Indexes

ltree supports several index types that can speed up the above operators:

• B-tree indexes on ltree: <, <=, =, >=, >

• GiST indexes on ltree (gist_ltree_ops opclass): <, <=, =, >=, >, @>, <@, @, ~, ?

The gist_ltree_ops GiST opclass approximates a set of path labels as a bitmap signature. Its optional integer parameter siglen determines the signature length in bytes. The default signature length is 8 bytes. Valid values for signature length are between 1 and 2024 bytes. Longer signatures result in more precise searches (scanning a smaller portion of the index and fewer heap pages), but at the cost of a larger index.

Example of creating an index with the default signature length of 8 bytes:

CREATE INDEX path_gist_idx ON test USING GIST (path);

Example of creating an index with a signature length of 100 bytes:

CREATE INDEX path_gist_idx ON test USING GIST (path

gist_ltree_ops(siglen=100));

• GiST indexes on ltree[] (gist__ltree_ops opclass): ltree[] <@ ltree, ltree @> ltree[], @, ~, ?

The gist__ltree_ops GiST opclass works similarly to gist_ltree_ops and also uses signature length as a parameter. The default value of siglen in gist__ltree_ops is 28 bytes.

Example of creating an index with the default signature length of 28 bytes:

CREATE INDEX path_gist_idx ON test USING GIST (array_path);

Example of creating an index with a signature length of 100 bytes:

CREATE INDEX path_gist_idx ON test USING GIST (array_path

gist__ltree_ops(siglen=100));

Note: This index type is lossy.

4. Examples

This example uses the following data:

CREATE TABLE test (path ltree);

INSERT INTO test VALUES ('Top');

INSERT INTO test VALUES ('Top.Science');

INSERT INTO test VALUES ('Top.Science.Astronomy');

INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');

INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');

INSERT INTO test VALUES ('Top.Hobbies');

INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');

INSERT INTO test VALUES ('Top.Collections');

INSERT INTO test VALUES ('Top.Collections.Pictures');

INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');

INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');

INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');

INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');

CREATE INDEX path_gist_idx ON test USING GIST (path);

CREATE INDEX path_idx ON test USING BTREE (path);

Now we have a table test populated with data describing the following hierarchy:

Top

/ |

Science Hobbies Collections

/ |

Astronomy Amateurs_Astronomy Pictures

/ |

Astrophysics Cosmology Astronomy

/ |

Galaxies Stars Astronauts

We can query the descendants:

test=## SELECT path FROM test WHERE path <@ 'Top.Science';

path

------------------------------------

Top.Science

Top.Science.Astronomy

Top.Science.Astronomy.Astrophysics

Top.Science.Astronomy.Cosmology

(4 rows)

Here are some path matching examples:

test=## SELECT path FROM test WHERE path ~ '*.Astronomy.*';

path

-----------------------------------------------

Top.Science.Astronomy

Top.Science.Astronomy.Astrophysics

Top.Science.Astronomy.Cosmology

Top.Collections.Pictures.Astronomy

Top.Collections.Pictures.Astronomy.Stars

Top.Collections.Pictures.Astronomy.Galaxies

Top.Collections.Pictures.Astronomy.Astronauts

(7 rows)

test=## SELECT path FROM test WHERE path @'Astro*&!pictures@ ';

path

------------------------------------

Top.Science.Astronomy

Top.Science.Astronomy.Astrophysics

Top.Science.Astronomy.Cosmology

(3 rows)

Here are some full-text search examples:

test=## SELECT path FROM test WHERE path @ 'Astro*% & !pictures@';

path

------------------------------------

Top.Science.Astronomy

Top.Science.Astronomy.Astrophysics

Top.Science.Astronomy.Cosmology

Top.Hobbies.Amateurs_Astronomy

(4 rows)

test=## SELECT path FROM test WHERE path @ 'Astro* & !pictures@';

path

------------------------------------

Top.Science.Astronomy

Top.Science.Astronomy.Astrophysics

Top.Science.Astronomy.Cosmology

(3 rows)

Path construction using functions:

test=## SELECT subpath(path,0,2)||'Space'||subpath(path,2) FROM test test-## WHERE

test-## path <@ 'Top.Science.Astronomy';

?column?

------------------------------------------

Top.Science.Space.Astronomy

Top.Science.Space.Astronomy.Astrophysics

Top.Science.Space.Astronomy.Cosmology

(3 rows)

We can simplify this by creating an SQL function that inserts a label at a specified position in the path:

test=## CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree

test-## AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'

test-## LANGUAGE SQL IMMUTABLE;

CREATE FUNCTION

test=## SELECT ins_label(path,2,'Space') FROM test WHERE path <@ 'Top.Science.Astronomy';

ins_label

------------------------------------------

Top.Science.Space.Astronomy

Top.Science.Space.Astronomy.Astrophysics

Top.Science.Space.Astronomy.Cosmology

(3 rows)

5. Transforms

There are additional extensions that implement transforms for the ltree type for PL/Python.

These extensions are ltree_plpythonu, ltree_plpython2u, and ltree_plpython3u. If these transforms are installed and specified when creating functions, ltree values are mapped to Python lists (however, the reverse transform is not currently supported).

Caution: It is strongly recommended that transform extensions be installed in the same schema as ltree. Otherwise, there are security concerns during installation if the transform extension's schema contains objects defined by malicious users.

mdx: format: md​

1. Definitions​

2. Operators and Functions​

3. Indexes​

4. Examples​

5. Transforms​