Skip to main content
Version: 1.0.16

CREATE TYPE

#CREATE TYPE

CREATE TYPE — Define a new data type

Synopsis

CREATE TYPE name AS

( [ attribute_name data_type [ COLLATE collation ] [, ... ] ] )

CREATE TYPE name AS ENUM

( [ 'label' [, ... ] ] )

CREATE TYPE name AS RANGE (

SUBTYPE = subtype

[ , SUBTYPE_OPCLASS = subtype_operator_class ]

[ , COLLATION = collation ]

[ , CANONICAL = canonical_function ]

[ , SUBTYPE_DIFF = subtype_diff_function ]

)

CREATE TYPE name (

INPUT = input_function,

OUTPUT = output_function

[ , RECEIVE = receive_function ]

[ , SEND = send_function ]

[ , TYPMOD_IN = type_modifier_input_function ]

[ , TYPMOD_OUT = type_modifier_output_function ]

[ , ANALYZE = analyze_function ]

[ , INTERNALLENGTH = { internallength | VARIABLE } ]

[ , PASSEDBYVALUE ]

[ , ALIGNMENT = alignment ]

[ , STORAGE = storage ]

[ , LIKE = like_type ]

[ , CATEGORY = category ]

[ , PREFERRED = preferred ]

[ , DEFAULT = default ]

[ , ELEMENT = element ]

[ , DELIMITER = delimiter ]

[ , COLLATABLE = collatable ]

)

CREATE TYPE name

Description

CREATE TYPE registers a new data type in the current database. The user who defines the data type becomes its owner.

If a schema name is given, the type is created in the specified schema. Otherwise it is created in the current schema. The type name must be distinct from any existing type or domain in the same schema (because tables have associated data types, the type name must also be different from any existing table name in the same schema).

As shown in the syntax above, there are five forms of CREATE TYPE. They create composite types, enum types, range types, base types, and shell types respectively. The first four forms will be discussed in sequence below. A shell type is simply a placeholder for a type to be defined later; it is created by issuing a CREATE TYPE command with no parameters other than the type name. Shell types are needed as forward references when creating range types and base types.

Composite Types

The first form of CREATE TYPE creates a composite type. The composite type is specified by a list of attribute names and data types.

If the data type of an attribute is collatable, a collation can optionally be specified for the attribute. A composite type is essentially the same as a row type of a table, but using CREATE TYPE avoids creating an actual table if you only want to define a type. Standalone composite types are also useful, for example, as function arguments or return types.

To be able to create a composite type, you must have the USAGE privilege on all attribute types.

Enum Types

This form of CREATE TYPE creates an enum type. Enum types require a list of quoted labels, each of which must be no more than 64 bytes in length. An enum type with zero labels can be created, but it cannot be used to store values until at least one label is added using ALTER TYPE.

Range Types

The subtype of a range type can be any type with an associated B-tree operator class (to determine the ordering of the range type values). Typically, the default B-tree operator class of the subtype is used to determine the ordering. To use a non-default operator class, specify its name with subtype_opclass. If the subtype is collatable and you want to use a non-default collation for the range ordering, specify it with the collation option. The optional canonical function must accept one parameter of the range type being defined and return a value of the same type. It is used to convert range values to a canonical form when applicable.

Creating a canonical function is somewhat tricky because it must be defined before the range type is declared. To do this, you must first create a shell type, which is a placeholder type with only a name and an owner but no properties. This can be done by issuing the command CREATE TYPE name with no additional parameters. Then you can declare the function using the shell type as its argument and result, and finally declare the range type with the same name. This automatically replaces the shell type entry with a valid range type.

The optional subtype_diff function must accept two values of the subtype as arguments and return a double precision value representing the difference between the two given values. While this is optional, providing this function will make GiST indexes on the range type column more efficient.

Base Types

The fourth form of CREATE TYPE creates a new base type (scalar type). To create a new base type, you must be a superuser (this restriction is in place because an erroneous type definition could confuse or even crash the server).

Parameters can appear in any order (not just in the order shown above), and most are optional. Before defining the type, two or more functions must be registered (using CREATE FUNCTION). The support functions input_function and output_function are required, while receive_function, send_function, type_modifier_input_function, type_modifier_output_function, and analyze_function are optional. Typically, these functions must be written in C or another low-level language.

The input_function converts the type's external text representation to the internal representation used by the operators and functions defined for the type. The output_function performs the reverse conversion. The input function can be declared to take one argument of type cstring, or three arguments of types cstring, oid, and integer respectively. The first argument is the input text as a C string, the second argument is the OID of the type itself (or the element type OID for array types), and the third argument is the typmod of the target column (or -1 if not known). The input function must return a value of the data type itself. Typically, an input function should be declared STRICT. If not, when a NULL input value is read, it will be called with the first argument being NULL. In this case, the function must still return NULL, unless it raises an error (this situation is mainly intended to support domain input functions, which may need to reject NULL input). The output function must be declared to take one argument of the new data type. The output function must return type cstring.

The output function is not called for NULL values.

The optional receive_function converts the type's external binary representation to the internal representation. If this function is not provided, the type cannot participate in binary input. Binary representation can be converted to internal form at lower cost, but is less portable (for example, standard integer data types use network byte order as the external binary representation, while the internal representation uses the machine's native byte order). The receive function should perform sufficient checks to ensure the value is valid. The receive function can be declared to take one argument of type internal, or three arguments of types internal, oid, and integer respectively. The first argument is a pointer to a StringInfo buffer holding the received byte string.

The remaining optional arguments are the same as for the text input function. The receive function must return a value of the data type itself. Typically, a receive function should be declared STRICT. If not, when a NULL input value is read, it will be called with the first argument being NULL. In this case, the function must still return NULL, unless it raises an error (this situation is mainly intended to support domain receive functions, which may need to reject NULL input). Similarly, the optional send_function converts the internal representation to the external binary representation. If this function is not provided, the type cannot participate in binary output. The send function must be declared to take one argument of the new data type. The send function must return type bytea. The send function is not called for NULL values.

At this point you may be wondering how the input and output functions can be declared as having results or arguments of the new type, since they must be created before the new type itself. The answer is that the new type should first be defined as a shell type, which is a placeholder type with no attributes other than a name and owner. This can be done with the command CREATE TYPE name with no additional parameters. Then the I/O functions written in C can be defined to reference this shell type. Finally, CREATE TYPE with a full definition replaces the shell type with a complete, valid type definition, after which the new type can be used normally.

If the type supports modifiers (optional constraints attached to the type declaration, such as char(5) or numeric(30,2)), the optional type_modifier_input_function and type_modifier_output_function are required. The system allows user-defined types to have one or more simple constants or identifiers as modifiers. However, for storage in the system catalog, this information must be packable into a non-negative integer value. The declared modifiers are passed to the type_modifier_input_function as a cstring array. It must check the validity of the value (raising an error if the value is wrong), and if valid, return a non-negative integer value that will be stored in the "typmod" column. If the type has no type_modifier_input_function, type modifiers will be rejected.

The type_modifier_output_function converts the internal integer typmod value back to the proper form for user display. It must return a cstring value, which is the string to append to the type name. For example, the function for numeric might return (30,2). If the default display format is simply to place the stored typmod integer value in parentheses, the type_modifier_output_function can be omitted.

The optional analyze_function performs type-specific statistics collection for columns of the data type. By default, if the type has a default B-tree operator class, ANALYZE will attempt to collect statistics using the type's "equals" and "less-than" operators. This behavior is not appropriate for non-scalar types, so it can be overridden by specifying a custom analysis function. The analysis function must be declared to take one argument of type internal and return a boolean result. For the detailed API of analysis functions, see src/include/commands/vacuum.h.

Although only the I/O functions and other functions created for the type know the details of the new type's internal representation, certain properties of the internal representation must be registered with the system. The most important of these is internallength. Base data types can be fixed-length (in which case internallength is a positive integer) or variable-length (setting internallength to VARIABLE, internally represented by setting typlen to -1). The internal representation of all variable-length types must begin with a 4-byte integer giving the total length of the value; the length field is often encoded, and it is inadvisable to use it directly.

The optional PASSEDBYVALUE flag indicates that values of this data type should be passed by value rather than by reference. Pass-by-value types must be fixed-length, and their internal representation must not exceed the size of the Datum type (4 bytes on some machines, 8 bytes on others).

The alignment parameter specifies the storage alignment requirement for the data type. Allowed values correspond to alignment on 1, 2, 4, or 8 byte boundaries. Note that the alignment parameter for variable-length types must be at least 4, because they need to include an int4 as their first component.

The storage parameter allows the selection of a storage strategy for variable-length data types (only plain is allowed for fixed-length types). plain specifies that the type's data will always be stored inline and will not be compressed. extended specifies that the system will first try to compress a long data value, and if the data is still too long, the value will be moved out of the main table row. external allows values to be moved out of the main table, but the system will not attempt to compress them. main allows compression but discourages moving values out of the main table (if there is no other way to fit the row size, data items with this storage strategy will still be moved out of the main table, but they will be given priority to remain in the main table compared to extended and external items).

All storage values other than plain imply that the data type's functions can handle TOASTed values. The specified value merely determines the default TOAST storage strategy for a column of a TOASTable data type; users can select other strategies for columns using ALTER TABLE SET STORAGE.

The like_type parameter provides another way to specify the basic representation properties of a data type: by copying from an existing type. The values of internallength, passedbyvalue, alignment, and storage are copied from the specified type (these copied values can also be overridden by specifying values for these attributes in the LIKE clause, though this is not commonly done). Specifying representation in this way is particularly useful when the underlying implementation of the new type uses an existing type as a "carrier".

The category and preferred parameters can be used to help control which implicit casts are applied in ambiguous situations. Each data type belongs to a category named by a single ASCII character, and each type can be the "preferred" type within its category. The parser will prefer casting to the preferred type when it helps resolve overloaded functions or operators (but only from other types in the same category). For types that have no implicit casts to or from any other type, leaving these settings at their defaults is fine. However, for a group of related types with implicit casts, it is usually useful to mark them all as belonging to the same category and select one or two "most commonly used" types as the preferred types for that category. The category parameter is particularly useful when adding a user-defined type to an existing built-in category (such as numeric or string types). However, you can also create entirely new categories of user-defined types. For such categories, you may choose any ASCII character other than uppercase letters.

If you want columns of the data type to default to a non-null value, you can specify a default value. The default value can be specified with the DEFAULT keyword (such a default value can be overridden by an explicit DEFAULT clause attached to a particular column).

To specify that a type is an array type, use the ELEMENT keyword to specify the element type of the array. For example, to define an array of 4-byte integers (int4), you would specify ELEMENT = int4. See below for more details about array types.

To specify the delimiter for values in the external representation of arrays of this type, set delimiter to a specific character. The default delimiter is a comma (,). Note that the delimiter is associated with the array element type, not the array type itself.

If the optional boolean parameter collatable is true, column definitions and expressions of this type may carry collation information using the COLLATE clause. The implementation of functions operating on the type is responsible for actually using this information; simply marking the type as collatable does not automatically cause this information to be used.

Array Types

Whenever a user-defined type is created, the system automatically creates an associated array type, whose name consists of the element type's name prefixed with an underscore, and is automatically truncated if it exceeds NAMEDATALEN bytes (if the generated name conflicts with an existing type name, the process is repeated until a non-conflicting name is found). This implicitly created array type is variable-length and uses the built-in input and output functions (array_in and array_out). The array type follows any changes to its element type's owner or schema, and is dropped when the element type is dropped.

If the system automatically creates the correct array type, you might reasonably ask why there is an ELEMENT option. The only useful case for using ELEMENT is when you are creating a fixed-length type that internally is an array of multiple identical items, and you want to allow direct subscript access to these items in addition to the overall operations you plan to provide for the type. For example, the type point is represented as two floating-point numbers that can be accessed using point[0] and point[1]. Note that this capability only applies to fixed-length types whose internal form is exactly a sequence of identical fixed-length fields. Subscriptable variable-length types must have the generalized internal representation used by array_in and array_out. For historical reasons (which are clearly wrong but too late to change), subscripts for fixed-length array types start at zero rather than one, unlike variable-length arrays.

Parameters

name

The name (optionally schema-qualified) of the type to be created.

attribute_name

The name of an attribute (column) of the composite type.

data_type

The name of an existing data type to be a column of the composite type.

collation

The name of an existing collation to associate with a column of the composite type or with a range type.

label

A string representing the textual label associated with a value of the enum type.

subtype

The name of the element type of the range type; the ranges represented by the range type belong to this type.

subtype_operator_class

The name of the B-tree operator class for the subtype.

canonical_function

The name of the canonicalization function for the range type.

subtype_diff_function

The name of the difference function for the subtype.

input_function

The name of the function that converts data from the type's external text form to internal form.

output_function

The name of the function that converts data from the type's internal form to external text form.

receive_function

The name of the function that converts data from the type's external binary form to internal form.

send_function

The name of the function that converts data from the type's internal form to external binary form.

type_modifier_input_function

The name of the function that converts the type's modifier array to internal form.

type_modifier_output_function

The name of the function that converts the type's modifier's internal form to external text form.

analyze_function

The name of the function that performs statistical analysis for this data type.

internallength

A numeric constant specifying the byte length of the new type's internal representation. The default assumption is that it is variable-length.

alignment

The storage alignment requirement for this data type. If specified, it must be char, int2, int4, or double.

The default is int4.

storage

The storage strategy for this data type. If specified, it must be plain, external, extended, or main.

The default is plain.

like_type

The name of an existing data type with the same representation as the new type. The values of internallength, passedbyvalue, alignment, and storage are copied from this type (unless overridden by explicit specifications elsewhere in this CREATE TYPE command).

category

The category code (a single ASCII character) for this type. The default is 'U' for "user-defined type". Other standard category codes can be found in Table 51.63. To create a custom category, you can choose any other ASCII character.

preferred

True if this type is the preferred type within its type category, false otherwise. The default is false. Be very careful when creating a new preferred type within an existing type category, as this may cause surprising changes in behavior.

default

The default value for the data type. If omitted, the default is null.

element

The type being created is an array; this specifies the element type of the array.

delimiter

The delimiter between values in arrays composed of this type.

collatable

True if operations on this type can use collation information. The default is false.

Notes

Since there are no restrictions on the use of a data type once it has been created, creating a base type or range type is equivalent to granting public execute privileges on the functions mentioned in the type definition. This is usually not a problem for functions useful in type definitions. However, if you are designing a type that requires "secret" information for conversion to or from external forms, you should think twice. Therefore, the length limit for type names is one character less than for other names. While this is still generally the case, array type names may differ from this rule if the name reaches the maximum length or conflicts with another user type name starting with an underscore. Therefore, relying on this convention in code is no longer applicable. Now, you can use pg_type.typarray to locate the array type associated with a given type.

It is recommended to avoid using type names and table names that begin with an underscore. Although the server will change generated array type names to avoid conflicts with user-specified names, there is still a risk of confusion, especially for older client software that may assume type names starting with an underscore always represent arrays.

The method for creating a new base type is to first create its input function. The system first treats the new data type's name as the return type of the input function. In this case, a shell type is implicitly created and can be referenced in the remaining I/O function definitions. This approach is still valid but has been deprecated and may be disallowed in a future release. Also, to avoid polluting the system catalog with shell types due to typos in function definitions, this method of creating a shell type can only be used when the input function is written in C.

Examples

# This example creates a composite type and uses it in a function definition:

CREATE TYPE compfoo AS (f1 int, f2 text);

CREATE FUNCTION getfoo() RETURNS SETOF compfoo AS $$

SELECT fooid, fooname FROM foo

$$ LANGUAGE SQL;

# This example creates an enum type and uses it in a table definition:

CREATE TYPE bug_status AS ENUM ('new', 'open', 'closed');

CREATE TABLE bug (

id serial,

description text,

status bug_status

);

# This example creates a range type:

CREATE TYPE float8_range AS RANGE (subtype = float8, subtype_diff = float8mi);

# This example creates the base data type box and uses it in a table definition:

CREATE TYPE box;

CREATE FUNCTION my_box_in_function(cstring) RETURNS box AS ... ;

CREATE FUNCTION my_box_out_function(box) RETURNS cstring AS ... ;

CREATE TYPE box (

INTERNALLENGTH = 16,

INPUT = my_box_in_function,

OUTPUT = my_box_out_function

);

CREATE TABLE myboxes (

id integer,

description box

);

# If the internal structure of box is an array of four float4 elements, we might use:

CREATE TYPE box (

INTERNALLENGTH = 16,

INPUT = my_box_in_function,

OUTPUT = my_box_out_function,

ELEMENT = float4

);

# This allows subscript access to the component numbers of a box value. Otherwise the type behaves the same as before.

# This example creates a large object type and uses it in a table definition:

CREATE TYPE bigobj (

INPUT = lo_filein, OUTPUT = lo_fileout,

INTERNALLENGTH = VARIABLE

);

CREATE TABLE big_objs (

id integer,

obj bigobj

);

See Also

ALTER TYPE, CREATE DOMAIN, CREATE FUNCTION, DROP TYPE