Version: 1.0.16

dict_int

dict_int is an example of an add-on dictionary template for full-text search. Its purpose is to control how integers (signed and unsigned) are indexed: the numbers remain indexable, but the number of unique tokens is kept from growing excessively, which would otherwise severely degrade search performance.

This module is considered "trusted", meaning it can be installed by non-superusers who have CREATE privilege on the current database.
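As a minimal sketch, installation uses the standard CREATE EXTENSION command, run in the target database by a role that has CREATE privilege on it. The IF NOT EXISTS clause is optional and is included here only so that the statement can be re-run safely:

```sql
-- Install the extension into the current database.
-- IF NOT EXISTS makes the statement a no-op when dict_int is already installed.
CREATE EXTENSION IF NOT EXISTS dict_int;
```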

1. Configuration

The dictionary accepts three options:

  • The maxlen parameter specifies the maximum number of digits allowed in an integer word. The default value is 6.

  • The rejectlong parameter specifies whether an overlength integer should be truncated or ignored. If rejectlong is false (the default), the dictionary returns the first maxlen digits of the integer. If rejectlong is true, the dictionary treats an overlength integer as a stop word, so it will not be indexed. Note that this also means such an integer cannot be searched for.

  • The absval parameter specifies whether a leading "+" or "-" sign should be stripped from the integer. The default is false. When true, the sign is removed before maxlen is applied.
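The following sketch contrasts the two ways of handling an overlength integer. It assumes the extension is already installed. The dictionary names intdict_trunc and intdict_reject are hypothetical, chosen for illustration only, and the value 4 for maxlen is arbitrary.

```sql
-- Hypothetical dictionary: keep at most 4 digits, truncate longer integers.
CREATE TEXT SEARCH DICTIONARY intdict_trunc (
    TEMPLATE = intdict_template,
    MAXLEN = 4,
    REJECTLONG = false
);

-- Hypothetical dictionary: same limit, but reject longer integers.
CREATE TEXT SEARCH DICTIONARY intdict_reject (
    TEMPLATE = intdict_template,
    MAXLEN = 4,
    REJECTLONG = true
);

-- Per the option descriptions above, the first call should return the
-- first four digits, and the second an empty array (a stop word).
SELECT ts_lexize('intdict_trunc', '12345678');
SELECT ts_lexize('intdict_reject', '12345678');
```

Truncation keeps long numbers searchable by their prefix at the cost of collisions between numbers that share it, while rejection drops them from the index entirely.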

2. Usage

Installing the dict_int extension creates a text search template intdict_template and a dictionary intdict based on it, using default parameters. You can modify the parameters, for example:

test=# ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 4, REJECTLONG = true);
ALTER TEXT SEARCH DICTIONARY

Alternatively, you can create new dictionaries based on the template.
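For instance, a separate dictionary can be defined without touching the shipped intdict. This is a sketch only: the name intdict_abs is hypothetical, and the option values are arbitrary.

```sql
-- Hypothetical dictionary that strips a leading sign before applying maxlen.
CREATE TEXT SEARCH DICTIONARY intdict_abs (
    TEMPLATE = intdict_template,
    MAXLEN = 6,
    ABSVAL = true
);

-- The sign is removed first, then the remaining digits are cut to six.
SELECT ts_lexize('intdict_abs', '-12345678');

-- Remove the dictionary when it is no longer needed.
DROP TEXT SEARCH DICTIONARY intdict_abs;
```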

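To test the dictionary, you can try the following. With the settings above (MAXLEN = 4, REJECTLONG = true), the eight-digit input is treated as a stop word, so the result is an empty array: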

test=# select ts_lexize('intdict', '12345678');
 ts_lexize
-----------
 {}
(1 row)
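ts_lexize only exercises the dictionary in isolation. For the dictionary to affect indexing, it has to be mapped to the integer token types of a text search configuration. The sketch below assumes the default parser, whose token types for signed and unsigned integers are int and uint. The configuration name intdemo is hypothetical.

```sql
-- Hypothetical configuration, copied from the built-in English one.
CREATE TEXT SEARCH CONFIGURATION intdemo (COPY = pg_catalog.english);

-- Route signed and unsigned integer tokens through intdict.
ALTER TEXT SEARCH CONFIGURATION intdemo
    ALTER MAPPING FOR int, uint WITH intdict;

-- How the integer appears in the result depends on the current
-- maxlen, rejectlong, and absval settings of intdict.
SELECT to_tsvector('intdemo', 'order 12345678 shipped');
```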