Version: 1.0.16

tablefunc


The tablefunc module includes multiple functions that return tables (i.e., multiple rows). These functions are useful in their own right and can also serve as examples of how to write C functions that return multiple rows.

This module is considered "trusted", that is, it can be installed by non-superusers who have CREATE privilege on the current database.
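As a minimal sketch, assuming the standard extension mechanism is available, a user with CREATE privilege on the current database could enable the module as follows. The IF NOT EXISTS clause is optional and only avoids an error when the extension is already installed.

```sql
-- Install the tablefunc extension into the current database.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```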

1. Functions Provided

Table C.30 summarizes the functions provided by the tablefunc module.

Table C.30. tablefunc Functions

| Function | Description |
| --- | --- |
| normal_rand ( numvals integer, mean float8, stddev float8 ) → setof float8 | Produces a set of normally distributed random values. |
| crosstab ( sql text ) → setof record | Produces a "pivot table" containing row names plus N value columns, where N is determined by the row type specified in the calling query. |
| crosstabN ( sql text ) → setof tablefunc_crosstab_N | Produces a "pivot table" containing row names plus N value columns. crosstab2, crosstab3, and crosstab4 are predefined, but you can create additional crosstabN functions as described below. |
| crosstab ( source_sql text, category_sql text ) → setof record | Produces a "pivot table" whose value columns are determined by a second query. |
| crosstab ( sql text, N integer ) → setof record | Deprecated version of crosstab(text). The parameter N is now ignored, since the number of value columns is always determined by the calling query. |
| connectby ( relname text, keyid_fld text, parent_keyid_fld text [, orderby_fld text ], start_with text, max_depth integer [, branch_delim text ] ) → setof record | Produces a display of a hierarchical tree structure. |

1.1. normal_rand

normal_rand(int numvals, float8 mean, float8 stddev) returns setof float8

normal_rand produces a set of normally distributed random values (Gaussian distribution).

numvals is the number of values returned by the function. mean is the mean of the normal distribution of values, and stddev is the standard deviation of the normal distribution of values.

For example, this call requests 1000 values with a mean of 5 and a standard deviation of 3:

test=# SELECT * FROM normal_rand(1000, 5, 3);

normal_rand

----------------------

6.558678074264742

3.788525831704606

5.553842704505796

9.753656749832347

6.123704750357675

5.499242993736144

8.766552109565838

.

.

.

2.702899991153945

1.3775170519858717

9.663611813628377

(1000 rows)
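One way to sanity-check the mean and stddev parameters is to aggregate a larger sample. The sketch below is illustrative only: the sample size of 10000 and the column aliases are arbitrary choices, and the reported figures differ on every run because the values are random.

```sql
-- Summarize a sample drawn with mean 5 and standard deviation 3.
-- sample_mean should land near 5 and sample_stddev near 3.
SELECT count(*)                     AS n,
       round(avg(x)::numeric, 2)    AS sample_mean,
       round(stddev(x)::numeric, 2) AS sample_stddev
FROM normal_rand(10000, 5, 3) AS x;
```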

1.2. crosstab(text)

crosstab(text sql)

crosstab(text sql, int N)

The crosstab function is used to produce a "pivot" display, where data is laid out across the page rather than listed vertically. For example, we might have data like this:

row1 val11

row1 val12

row1 val13

...

row2 val21

row2 val22

row2 val23

...

And we want to display it like this:

row1 val11 val12 val13 ...

row2 val21 val22 val23 ...

...

The crosstab function takes a text parameter that is an SQL query producing raw data formatted in the first manner, and produces a table formatted in the second manner.

The sql parameter is an SQL statement that produces the source set of data. This statement must return a row_name column, a category column, and a value column. N is a deprecated parameter that is ignored even if provided (previously this had to match the number of output value columns, but this is now determined by the calling query).

For example, the provided query might produce a set like this:

row_name cat value

----------+-------+-------

row1 cat1 val1

row1 cat2 val2

row1 cat3 val3

row1 cat4 val4

row2 cat1 val5

row2 cat2 val6

row2 cat3 val7

row2 cat4 val8

The crosstab function is declared to return setof record, so the actual names and types of the output columns must be defined in the FROM clause of the calling SELECT statement. For example:

SELECT * FROM crosstab('...') AS ct(row_name text, category_1 text, category_2 text);

This example produces a set like this:

<== value columns ==>

row_name category_1 category_2

----------+------------+------------

row1 val1 val2

row2 val5 val6

The FROM clause must define the output as a row_name column (with the same data type as the first result column of the SQL query), followed by N value columns (all with the same data type as the third result column of the SQL query). You can set as many output value columns as you wish. The names of the output columns are up to you.

The crosstab function produces one output row for each consecutive group of input rows with the same row_name value. It fills the output value columns from left to right using the value fields from these rows. If a group has fewer rows than output value columns, the extra output columns are filled with null values. If there are more rows, the extra input rows are skipped.

In practice, the SQL query should always specify ORDER BY 1,2 to ensure that the input rows are properly sorted, i.e., rows with the same row_name are grouped together and correctly ordered within the group. Note that crosstab itself does not pay attention to the second column of the query result; it is there solely for sorting purposes, to control the order in which the third column values appear on the page.

Here is a complete example:

CREATE TABLE ct(id SERIAL, rowid TEXT, attribute TEXT, value TEXT);

INSERT INTO ct(rowid, attribute, value) VALUES('test1','att1','val1');

INSERT INTO ct(rowid, attribute, value) VALUES('test1','att2','val2');

INSERT INTO ct(rowid, attribute, value) VALUES('test1','att3','val3');

INSERT INTO ct(rowid, attribute, value) VALUES('test1','att4','val4');

INSERT INTO ct(rowid, attribute, value) VALUES('test2','att1','val5');

INSERT INTO ct(rowid, attribute, value) VALUES('test2','att2','val6');

INSERT INTO ct(rowid, attribute, value) VALUES('test2','att3','val7');

INSERT INTO ct(rowid, attribute, value) VALUES('test2','att4','val8');

SELECT * FROM crosstab(

'select rowid, attribute, value

from ct

where attribute = ''att2'' or attribute = ''att3''

order by 1,2')

AS ct(row_name text, category_1 text, category_2 text, category_3 text);

row_name | category_1 | category_2 | category_3

----------+------------+------------+------------

test1 | val2 | val3 |

test2 | val6 | val7 |

(2 rows)

You can avoid always having to write out a FROM clause to define the output columns by setting up a custom crosstab function that hardcodes the desired output row type in its definition. This is described in the next section. Another possibility is to embed the required FROM clause in a view definition.
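The view approach can be sketched as follows, assuming the ct table from the complete example above already exists. The view name ct_pivot is hypothetical.

```sql
-- Wrap the crosstab call and its column definition list in a view.
CREATE VIEW ct_pivot AS
SELECT *
FROM crosstab(
    'select rowid, attribute, value
     from ct
     where attribute = ''att2'' or attribute = ''att3''
     order by 1,2')
AS ct(row_name text, category_1 text, category_2 text, category_3 text);

-- Callers no longer need to spell out the output columns.
SELECT * FROM ct_pivot;
```

The source query is still executed each time the view is read, so the view only saves typing; it does not cache the pivoted result.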

1.3. crosstabN(text)

crosstabN(text sql)

The crosstabN family of functions are examples of how to set up custom wrappers for the general crosstab function so that you don't need to write out column names and types in the calling SELECT query. The tablefunc module includes crosstab2, crosstab3, and crosstab4, whose output row types are defined as:

test=# CREATE TYPE tablefunc_crosstab_N AS (
test(#     row_name TEXT,
test(#     category_1 TEXT,
test(#     category_2 TEXT,
test(#     category_N TEXT
test(# );
CREATE TYPE

So, when the input query produces columns row_name and value of type text, and you want 2, 3, or 4 output value columns, these functions can be used directly. In all other respects, they behave exactly like the general crosstab function described above.

For example, the example given in the previous section could also be done this way:

test=# SELECT * FROM crosstab3(
test(#   'select rowid, attribute, value
test'#    from ct
test'#    where attribute = ''att2'' or attribute = ''att3''
test'#    order by 1,2');

row_name | category_1 | category_2 | category_3

----------+------------+------------+------------

test1 | val2 | val3 |

test2 | val6 | val7 |

These functions are primarily provided for example purposes. You can create your own return types and functions based on the underlying crosstab() function. There are two ways to do this:

• Similar to contrib/tablefunc/tablefunc--1.0.sql, create a composite type to describe the desired output columns. Then define a unique function name that accepts a text parameter and returns setof your_type_name, but links to the same underlying crosstab C function. For example, if your source data produces row names of type text and values of type float8, and you want 5 value columns:

test=# CREATE TYPE my_crosstab_float8_5_cols AS (
test(#     my_row_name text,
test(#     my_category_1 float8,
test(#     my_category_2 float8,
test(#     my_category_3 float8,
test(#     my_category_4 float8,
test(#     my_category_5 float8
test(# );
CREATE TYPE
test=# CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(text)
test-#     RETURNS setof my_crosstab_float8_5_cols
test-#     AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
CREATE FUNCTION

• Use OUT parameters to implicitly define the return type. The same example could also be done as:

CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(

IN text,

OUT my_row_name text,

OUT my_category_1 float8,

OUT my_category_2 float8,

OUT my_category_3 float8,

OUT my_category_4 float8,

OUT my_category_5 float8)

RETURNS setof record

AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
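Once either definition is in place, the wrapper can be called without a column definition list. The sketch below feeds it made-up inline data through a VALUES list; the row names and numbers are purely illustrative. Creating a C-language function normally requires superuser privileges.

```sql
-- The wrapper already knows its output columns, so no AS (...) list is needed.
-- Groups with fewer than five values leave the remaining columns null.
SELECT *
FROM crosstab_float8_5_cols(
    $$ VALUES ('sensor_a', 1, 1.5::float8),
              ('sensor_a', 2, 2.5::float8),
              ('sensor_b', 1, 9.0::float8)
       ORDER BY 1, 2 $$);
```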

1.4. crosstab(text, text)

crosstab(text source_sql, text category_sql)

The main limitation of the single-argument form of crosstab is that it treats all values in a group as alike and inserts each value into the first available column. This does not work if you want value columns to correspond to specific data categories and some groups may not have data for certain categories. The two-argument form of crosstab handles this case by providing an explicit category list that corresponds to the output columns.
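The difference can be seen by running both forms over the same input. The following sketch uses made-up inline data in which row1 has no cat2 value; the output column names are arbitrary.

```sql
-- One-argument form: values are packed left to right, so row1's cat3
-- value 'c' lands in the second value column.
SELECT *
FROM crosstab(
    $$ VALUES ('row1', 'cat1', 'a'),
              ('row1', 'cat3', 'c'),
              ('row2', 'cat1', 'd'),
              ('row2', 'cat2', 'e'),
              ('row2', 'cat3', 'f')
       ORDER BY 1, 2 $$)
AS ct(row_name text, cat1 text, cat2 text, cat3 text);

-- Two-argument form: the second query lists the categories, so row1's
-- cat2 column is null and 'c' stays in the cat3 column.
SELECT *
FROM crosstab(
    $$ VALUES ('row1', 'cat1', 'a'),
              ('row1', 'cat3', 'c'),
              ('row2', 'cat1', 'd'),
              ('row2', 'cat2', 'e'),
              ('row2', 'cat3', 'f')
       ORDER BY 1 $$,
    $$ VALUES ('cat1'), ('cat2'), ('cat3') $$)
AS ct(row_name text, cat1 text, cat2 text, cat3 text);
```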

source_sql is an SQL statement that produces the source data set. This statement must return a row_name column, a category column, and a value column. There may also be one or more "extra" columns. The row_name column must be the first. The category and value columns must be the last two columns in that order. Any columns between row_name and category are treated as "extra". All rows with the same row_name value should have the same "extra" column values.

For example, source_sql might produce a set like this:

SELECT row_name, extra_col, cat, value FROM foo ORDER BY 1;

row_name extra_col cat value

----------+------------+-----+---------

row1 extra1 cat1 val1

row1 extra1 cat2 val2

row1 extra1 cat4 val4

row2 extra2 cat1 val5

row2 extra2 cat2 val6

row2 extra2 cat3 val7

row2 extra2 cat4 val8

category_sql is an SQL statement that produces the set of categories. This statement must return only one column. It must produce at least one row, otherwise an error is generated. Also, it must not produce duplicate values, otherwise an error is generated. category_sql might look like this:

SELECT DISTINCT cat FROM foo ORDER BY 1;

cat

-------

cat1

cat2

cat3

cat4

The crosstab function is declared to return setof record, so the actual names and types of the output columns must be defined in the FROM clause of the calling SELECT statement. For example:

SELECT * FROM crosstab('...', '...')
    AS ct(row_name text, extra text, cat1 text, cat2 text, cat3 text, cat4 text);

This would produce a result like:

<== value columns ==>

row_name extra cat1 cat2 cat3 cat4

---------+-------+------+------+------+------

row1 extra1 val1 val2 val4

row2 extra2 val5 val6 val7 val8

The FROM clause must define the correct number of output columns and the correct data types. If there are N columns in the source_sql query result, the first N-2 of them must match the first N-2 output columns. The remaining output columns must have the same type as the last column of the source_sql query result, and there must be exactly as many of them as there are rows in the category_sql query result.

The crosstab function produces one output row for each consecutive group of input rows with the same row_name value. The output row_name column and any "extra" columns are copied from the first row of the group. The output value columns are filled using the value fields from rows with matching category values. If a row's category does not match any output of the category_sql query, its value is ignored. Output columns whose matching category is not present in any input row of the group are filled with nulls.

In practice, the source_sql query should always specify ORDER BY 1 to ensure that values with the same row_name are grouped together. However, the order of categories within a group does not matter. Also, it is essential that the order of the category_sql query output matches the specified output column order.

Here are two complete examples:

test=# create table sales(year int, month int, qty int);
CREATE TABLE
test=# insert into sales values(2007, 1, 1000);
INSERT 0 1
test=# insert into sales values(2007, 2, 1500);
INSERT 0 1
test=# insert into sales values(2007, 7, 500);
INSERT 0 1
test=# insert into sales values(2007, 11, 1500);
INSERT 0 1
test=# insert into sales values(2007, 12, 2000);
INSERT 0 1
test=# insert into sales values(2008, 1, 1000);
INSERT 0 1
test=# select * from crosstab(
test(#   'select year, month, qty from sales order by 1',
test(#   'select m from generate_series(1,12) m'
test(# ) as (
test(#   year int,
test(#   "Jan" int,
test(#   "Feb" int,
test(#   "Mar" int,
test(#   "Apr" int,
test(#   "May" int,
test(#   "Jun" int,
test(#   "Jul" int,
test(#   "Aug" int,
test(#   "Sep" int,
test(#   "Oct" int,
test(#   "Nov" int,
test(#   "Dec" int
test(# );

 year | Jan  | Feb  | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov  | Dec
------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+------+------
 2007 | 1000 | 1500 |     |     |     |     | 500 |     |     |     | 1500 | 2000
 2008 | 1000 |      |     |     |     |     |     |     |     |     |      |

(2 rows)

test=# CREATE TABLE cth(rowid text, rowdt timestamp, attribute text, val text);
CREATE TABLE
test=# INSERT INTO cth VALUES('test1','01 March 2003','temperature','42');
INSERT 0 1
test=# INSERT INTO cth VALUES('test1','01 March 2003','test_result','PASS');
INSERT 0 1
test=# INSERT INTO cth VALUES('test1','01 March 2003','volts','2.6987');
INSERT 0 1
test=# INSERT INTO cth VALUES('test2','02 March 2003','temperature','53');
INSERT 0 1
test=# INSERT INTO cth VALUES('test2','02 March 2003','test_result','FAIL');
INSERT 0 1
test=# INSERT INTO cth VALUES('test2','02 March 2003','test_startdate','01 March 2003');
INSERT 0 1
test=# INSERT INTO cth VALUES('test2','02 March 2003','volts','3.1234');
INSERT 0 1
test=# SELECT * FROM crosstab
test-# (
test(#   'SELECT rowid, rowdt, attribute, val FROM cth ORDER BY 1',
test(#   'SELECT DISTINCT attribute FROM cth ORDER BY 1'
test(# )
test-# AS
test-# (
test(#   rowid text,
test(#   rowdt timestamp,
test(#   temperature int4,
test(#   test_result text,
test(#   test_startdate timestamp,
test(#   volts float8
test(# );

rowid | rowdt | temperature | test_result | test_startdate | volts

-------+---------------------+-------------+-------------+---------------------+--------

test1 | 2003-03-01 00:00:00 | 42 | PASS | | 2.6987

test2 | 2003-03-02 00:00:00 | 53 | FAIL | 2003-03-01 00:00:00 | 3.1234

(2 rows)

You can create predefined functions to avoid having to write out the result column names and types in every query.
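A hedged sketch of such a predefined function for the sales example above, using OUT parameters. In upstream PostgreSQL the C function behind the two-argument form is named crosstab_hash; that name is assumed to apply here as well. The wrapper name crosstab_sales_by_month is hypothetical, and creating a C-language function normally requires superuser privileges.

```sql
-- Wrapper with a fixed 12-month output row type (hypothetical name).
CREATE OR REPLACE FUNCTION crosstab_sales_by_month(
    IN text,
    IN text,
    OUT year int,
    OUT "Jan" int, OUT "Feb" int, OUT "Mar" int, OUT "Apr" int,
    OUT "May" int, OUT "Jun" int, OUT "Jul" int, OUT "Aug" int,
    OUT "Sep" int, OUT "Oct" int, OUT "Nov" int, OUT "Dec" int)
RETURNS setof record
AS '$libdir/tablefunc', 'crosstab_hash' LANGUAGE C STABLE STRICT;

-- The calling query no longer needs a column definition list.
SELECT *
FROM crosstab_sales_by_month(
    'select year, month, qty from sales order by 1',
    'select m from generate_series(1,12) m');
```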

1.5. connectby

connectby(text relname, text keyid_fld, text parent_keyid_fld

[, text orderby_fld ], text start_with, int max_depth

[, text branch_delim ])

The connectby function produces a display of hierarchical data stored in a table. The table must have a key column that uniquely identifies each row, and a parent key column that references its parent (if any). connectby can display a subtree starting from any row.

Table C.31 explains the parameters.

Table C.31. connectby Parameters

| Parameter | Description |
| --- | --- |
| relname | Name of the source relation |
| keyid_fld | Name of the key column |
| parent_keyid_fld | Name of the parent key column |
| orderby_fld | Name of the column for ordering siblings (optional) |
| start_with | Key value of the starting row |
| max_depth | Maximum depth to traverse, zero means unlimited depth |
| branch_delim | String used to separate key values in branch output (optional) |

The key column and parent key column can be of any data type, but they must be the same type. Note: The start_with value must be entered as a text string, regardless of the key column's type.

The connectby function is declared to return setof record, so the actual names and types of the output columns must be defined in the FROM clause of the calling SELECT statement. For example:

SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0, '~')

AS t(keyid text, parent_keyid text, level int, branch text, pos int);

The first two output columns are used for the current row's key and its parent row's key; they must match the type of the table's key column. The third output column is the depth in the tree and must be of type integer. If a branch_delim parameter is given, the next output column is the branch display and must be of type text. Finally, if an orderby_fld parameter is given, the last output column is a sequential number and must be of type integer.
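The interaction between key types and the start_with parameter can be illustrated with integer keys. This is a sketch only: the org_chart table and its columns are hypothetical and not part of the module.

```sql
-- Hypothetical table whose key and parent key are both integers.
CREATE TABLE org_chart (emp_id int PRIMARY KEY, manager_id int);
INSERT INTO org_chart VALUES (1, NULL), (2, 1), (3, 1), (4, 2);

-- start_with is passed as the string '1' even though emp_id is an integer,
-- while the first two output columns are declared with the key type (int).
SELECT *
FROM connectby('org_chart', 'emp_id', 'manager_id', '1', 0)
AS t(emp_id int, manager_id int, level int);
```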

The "branch" output column shows the path of keys used to reach the current row. The keys are separated by the specified branch_delim string. If branch display is not needed, the branch_delim parameter and the branch column can be omitted from the output column list.

If the order of children under the same parent is important, the orderby_fld parameter can be included to specify which column to use for ordering siblings. This column can be of any sortable data type. When and only when orderby_fld is specified, the output column list must include a final integer sequential number column.

Parameters representing table and column names are copied verbatim into the SQL queries generated internally by connectby. Therefore, if names are mixed-case or contain special characters, they should be enclosed in double quotes. You may also need to schema-qualify table names.
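For instance, with a mixed-case table the double quotes have to be part of the string arguments themselves. The table "OrgTree" and its columns below are hypothetical names used only to illustrate the quoting.

```sql
-- Hypothetical mixed-case table and column names.
CREATE TABLE "OrgTree" ("NodeId" text, "ParentId" text);
INSERT INTO "OrgTree" VALUES ('root', NULL), ('child', 'root');

-- The double quotes are embedded inside the single-quoted arguments,
-- because the names are pasted as-is into the generated SQL.
SELECT *
FROM connectby('"OrgTree"', '"NodeId"', '"ParentId"', 'root', 0)
AS t(node_id text, parent_id text, level int);
```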

On large tables, performance will be poor unless there is an index on the parent key column.
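For the connectby_tree table used in the example below, such an index could be created like this once the table exists; the index name is an arbitrary choice.

```sql
-- Index the parent key column so that looking up a node's children is cheap.
CREATE INDEX connectby_tree_parent_keyid_idx
    ON connectby_tree (parent_keyid);
```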

It is important that the branch_delim string does not appear in any key value; otherwise, connectby may incorrectly report an infinite recursion error. Note that if no branch_delim is provided, a default value of ~ is used for recursion detection.

Here is an example:

CREATE TABLE connectby_tree(keyid text, parent_keyid text, pos int);

INSERT INTO connectby_tree VALUES('row1',NULL, 0);

INSERT INTO connectby_tree VALUES('row2','row1', 0);

INSERT INTO connectby_tree VALUES('row3','row1', 0);

INSERT INTO connectby_tree VALUES('row4','row2', 1);

INSERT INTO connectby_tree VALUES('row5','row2', 0);

INSERT INTO connectby_tree VALUES('row6','row4', 0);

INSERT INTO connectby_tree VALUES('row7','row3', 0);

INSERT INTO connectby_tree VALUES('row8','row6', 0);

INSERT INTO connectby_tree VALUES('row9','row5', 0);

-- with branch, but no orderby_fld (result order not guaranteed)

SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0,'~')

AS t(keyid text, parent_keyid text, level int, branch text);

keyid | parent_keyid | level | branch

-------+--------------+-------+---------------------

row2 | | 0 | row2

row4 | row2 | 1 | row2~row4

row6 | row4 | 2 | row2~row4~row6

row8 | row6 | 3 | row2~row4~row6~row8

row5 | row2 | 1 | row2~row5

row9 | row5 | 2 | row2~row5~row9

(6 rows)

-- without branch, without orderby_fld (result order not guaranteed)

SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0)

AS t(keyid text, parent_keyid text, level int);

keyid | parent_keyid | level

-------+--------------+-------

row2 | | 0

row4 | row2 | 1

row6 | row4 | 2

row8 | row6 | 3

row5 | row2 | 1

row9 | row5 | 2

(6 rows)

-- with branch, with orderby_fld (note row5 comes before row4)

SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0, '~')

AS t(keyid text, parent_keyid text, level int, branch text, pos int);

keyid | parent_keyid | level | branch | pos

-------+--------------+-------+---------------------+-----

row2 | | 0 | row2 | 1

row5 | row2 | 1 | row2~row5 | 2

row9 | row5 | 2 | row2~row5~row9 | 3

row4 | row2 | 1 | row2~row4 | 4

row6 | row4 | 2 | row2~row4~row6 | 5

row8 | row6 | 3 | row2~row4~row6~row8 | 6

(6 rows)

-- without branch, with orderby_fld (note row5 comes before row4)

SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0)

AS t(keyid text, parent_keyid text, level int, pos int);

keyid | parent_keyid | level | pos

-------+--------------+-------+-----

row2 | | 0 | 1

row5 | row2 | 1 | 2

row9 | row5 | 2 | 3

row4 | row2 | 1 | 4

row6 | row4 | 2 | 5

row8 | row6 | 3 | 6

(6 rows)