pgcrypto
The pgcrypto module provides cryptographic functions for Halo.
This module is considered "trusted", that is, it can be installed by non-superusers who have CREATE privilege on the current database.
1. General Hashing Functions
1.1. digest()
digest(data text, type text) returns bytea
digest(data bytea, type text) returns bytea
Computes a binary hash of the given data. type is the algorithm to use. Standard algorithms are md5, sha1, sha224, sha256, sha384, and sha512. If pgcrypto was compiled with OpenSSL, more algorithms are available as described in Table C.19.
If you want the digest as a hex string, you can use encode() on the result. For example:
test=# CREATE OR REPLACE FUNCTION sha1(bytea) returns text AS $$
test$# SELECT encode(digest($1, 'sha1'), 'hex')
test$# $$ LANGUAGE SQL STRICT IMMUTABLE;
CREATE FUNCTION
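For a one-off check, the digest can also be hex-encoded directly, without a wrapper function. The following is a minimal sketch, assuming pgcrypto is already installed in the current database; the input string 'abc' is only an illustration.

```sql
-- Hex-encoded SHA-256 digest of the text 'abc'.
-- The explicit ::text cast selects the digest(text, text) variant.
SELECT encode(digest('abc'::text, 'sha256'), 'hex') AS sha256_hex;
-- ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```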
1.2. hmac()
hmac(data text, key text, type text) returns bytea
hmac(data bytea, key bytea, type text) returns bytea
Computes a hashed MAC for data with key key. type is the same as in digest().
This is similar to digest(), but the hash can only be recalculated if the key is known. This prevents someone from modifying data and also changing the hash to match.
If the key is larger than the hash block size, it will first be hashed and the result will be used as the key.
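As an illustration, the sketch below computes a MAC and then checks the long-key rule described above. It assumes pgcrypto is installed; the message, the key values, and the 64-byte block size of SHA-256 used for the comparison are choices made for this example.

```sql
-- Keyed hash of a message; 'message' and 'secret' are placeholder values.
SELECT encode(hmac('message'::text, 'secret'::text, 'sha256'), 'hex') AS mac_hex;

-- A key longer than the hash block size (64 bytes for SHA-256) is hashed first,
-- so both expressions below yield the same MAC and the comparison is true.
WITH k AS (SELECT convert_to(repeat('k', 100), 'UTF8') AS long_key)
SELECT hmac('message'::bytea, long_key, 'sha256')
     = hmac('message'::bytea, digest(long_key, 'sha256'), 'sha256') AS same_mac
FROM k;
```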
2. Password Hashing Functions
The functions crypt() and gen_salt() are specifically designed for hashing passwords. crypt() does the hashing, while gen_salt() prepares algorithm parameters for the former.
The algorithms in crypt() differ from ordinary MD5 or SHA1 hashing algorithms in the following ways:
- They are slow. Since the data volume is small, this is the only way to make brute-force password cracking more difficult.
- They use a random value (called salt), so that users with the same password will get different encrypted passwords. This also provides additional protection against reversing the algorithm.
- They include the algorithm type in the result, so passwords hashed with different algorithms can coexist.
- Some of them are adaptive, meaning that when computers become faster, you can adjust the algorithm to be slower without creating incompatibility with existing passwords.
Table C.16 lists the algorithms supported by the crypt() function.
Table C.16. Algorithms Supported by crypt()
| Algorithm | Max Password Length | Adaptive? | Salt Bits | Output Length | Description |
|---|---|---|---|---|---|
| bf | 72 | yes | 128 | 60 | Blowfish-based, variant 2a |
| md5 | unlimited | no | 48 | 34 | MD5-based crypt |
| xdes | 8 | yes | 24 | 20 | Extended DES |
| des | 8 | no | 12 | 13 | Original UNIX crypt |
2.1. crypt()
crypt(password text, salt text) returns text
Computes a crypt(3)-style hash of password. When storing a new password, you need to use gen_salt() to generate a new salt value. To verify a password, pass the stored hash value as the salt and test whether the result matches the stored value.
Example of setting a new password:
UPDATE ... SET pswhash = crypt('new password', gen_salt('md5'));
Example of authentication:
SELECT (pswhash = crypt('entered password', pswhash)) AS pswmatch FROM ... ;
This returns true if the entered password is correct.
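The two statements above can be combined into a complete round trip. This is a minimal sketch, assuming pgcrypto is installed; the table app_user, its columns, and the sample passwords are hypothetical names introduced only for illustration.

```sql
-- Hypothetical table for the sketch; not part of pgcrypto.
CREATE TEMPORARY TABLE app_user (
    username text PRIMARY KEY,
    pswhash  text NOT NULL
);

-- Store a new password: gen_salt() chooses the algorithm (Blowfish here)
-- and produces a fresh random salt.
INSERT INTO app_user (username, pswhash)
VALUES ('alice', crypt('correct horse', gen_salt('bf')));

-- Verify a login attempt: the stored hash doubles as the salt argument.
SELECT username,
       pswhash = crypt('correct horse', pswhash) AS right_password,  -- true
       pswhash = crypt('wrong guess', pswhash)   AS wrong_password   -- false
FROM app_user
WHERE username = 'alice';
```

Because the stored hash embeds both the algorithm and the salt, the verification query does not need to know which algorithm was used when the password was set.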
2.2. gen_salt()
gen_salt(type text [, iter_count integer ]) returns text
Generates a new random salt string for use in crypt(). The salt string also tells crypt() which algorithm to use.
The type parameter specifies the hashing algorithm. Acceptable types are: des, xdes, md5, and bf.
The iter_count parameter allows the user to specify an iteration count for algorithms that use one. The higher the count, the longer it takes to hash the password and consequently the more time it takes to crack it. However, using too high a count could result in taking years to compute a single hash — this is not useful. If the iter_count parameter is omitted, the default iteration count will be used. Allowed iter_count values are algorithm-dependent, as shown in Table C.17.
Table C.17. Iteration Counts for crypt()
| Algorithm | Default | Minimum | Maximum |
|---|---|---|---|
| xdes | 725 | 1 | 16777215 |
| bf | 6 | 4 | 31 |
There is an additional restriction for the xdes algorithm: the iteration count must be an odd number.
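A short sketch of how iter_count is passed, assuming pgcrypto is installed. The specific counts are arbitrary examples within the ranges of Table C.17.

```sql
-- Blowfish salt with iteration count 8; the count appears in the salt prefix.
SELECT gen_salt('bf', 8) AS bf_salt;          -- starts with $2a$08$

-- Extended DES requires an odd iteration count.
SELECT gen_salt('xdes', 725) AS xdes_salt;

-- Omitting iter_count uses the default from Table C.17 (6 for bf).
SELECT gen_salt('bf') AS bf_default_salt;     -- starts with $2a$06$
```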
To choose an appropriate iteration count, consider that the original DES crypt was designed to run at 4 hashes per second on the hardware of its time. Speeds below 4 hashes per second are likely to impair usability. Speeds above 100 hashes per second are probably too fast.
Table C.18 gives an overview of the relative slowness of different hashing algorithms. The table shows the time required to try all character combinations for an 8-character password, assuming the password contains only lowercase letters or a mix of uppercase and lowercase letters and digits.
For the crypt-bf entries, the number after the slash is the iter_count parameter of gen_salt.
Table C.18. Hashing Algorithm Speed
| Algorithm | Hashes/sec | For [a-z] | For [A-Za-z0-9] | Duration relative to md5 hash |
|---|---|---|---|---|
| crypt-bf/8 | 1792 | 4 years | 3927 years | 100k |
| crypt-bf/7 | 3648 | 2 years | 1929 years | 50k |
| crypt-bf/6 | 7168 | 1 year | 982 years | 25k |
| crypt-bf/5 | 13504 | 188 days | 521 years | 12.5k |
| crypt-md5 | 171584 | 15 days | 41 years | 1k |
| crypt-des | 23221568 | 157.5 minutes | 108 days | 7 |
| sha1 | 37774272 | 90 minutes | 68 days | 4 |
| md5(hash) | 150085504 | 22.5 minutes | 17 days | 1 |
Note:
• The machine used is an Intel Mobile Core i3.
• The numbers for crypt-des and crypt-md5 algorithms are taken from John the Ripper v1.6.38 -test output.
• The numbers for md5 hash are from mdcrack 1.2.
• The numbers for sha1 are from lcrack-20031130-beta.
• The numbers for crypt-bf were collected using a simple program that loops over 1000 8-character passwords. This method makes it possible to show the speed for different iteration counts. For reference: john -test shows 13506 cycles/sec for crypt-bf/5. (The slight difference in results is consistent with the crypt-bf implementation in pgcrypto being the same one used in John the Ripper.)
Note: "Trying all combinations" is not how things are typically done in practice. Password cracking is usually accomplished with the help of dictionaries that contain common words and their various transformations. Therefore, even somewhat word-like passwords may be cracked in significantly less time than the numbers suggested above, while a 6-character non-word-like password might or might not survive cracking.
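The brute-force durations in Table C.18 follow from dividing the number of possible passwords by the hash rate. The query below is a rough sketch of that arithmetic for the crypt-bf/8 row, assuming a 365-day year. It yields roughly 3.7 and 3860 years, in line with the rounded figures in the table.

```sql
-- Time to try every 8-character password over an alphabet of a given size,
-- at the crypt-bf/8 rate of 1792 hashes/sec from Table C.18.
SELECT alphabet_size,
       round(power(alphabet_size::numeric, 8) / 1792 / (86400 * 365), 1) AS years
FROM (VALUES (26), (62)) AS t(alphabet_size);
```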
3. PGP Encryption Functions
The functions here implement the encryption portion of the OpenPGP (RFC 4880) standard. Both symmetric-key and public-key encryption are supported.
An encrypted PGP message consists of two parts or packets:
• A packet containing a session key, encrypted with either a symmetric key or a public key.
• A packet containing data encrypted with the session key.
When encrypting with a symmetric key (i.e., a password):
- The given password is hashed using a String2Key (S2K) algorithm. This is similar to the crypt() algorithms (intentionally slow and using a random salt), but it produces a full-length binary key.
- If a separate session key is requested, a new random key is generated. Otherwise, the S2K key is used directly as the session key.
- If the S2K key is used directly, then only the S2K settings are placed in the session key packet. Otherwise, the session key is encrypted with the S2K key and placed in the session key packet.
When encrypting with a public key:
- A new random session key is generated.
- It is encrypted with the public key and placed in the session key packet.
In both cases, the data to be encrypted is processed as follows:
- Optional data manipulation: compression, conversion to UTF-8, or line-ending conversion.
- The data is prefixed with a random block of bytes. This is equivalent to using a random IV.
- A SHA1 hash of the random prefix and data is appended.
- All of this is encrypted with the session key and placed in the data packet.
3.1. pgp_sym_encrypt()
pgp_sym_encrypt(data text, psw text [, options text ]) returns bytea
pgp_sym_encrypt_bytea(data bytea, psw text [, options text ]) returns bytea
Encrypts data using a symmetric PGP key psw. The options parameter can contain option settings as described below.
3.2. pgp_sym_decrypt()
pgp_sym_decrypt(msg bytea, psw text [, options text ]) returns text
pgp_sym_decrypt_bytea(msg bytea, psw text [, options text ]) returns bytea
Decrypts a PGP message that was encrypted with a symmetric key.
Decrypting bytea data with pgp_sym_decrypt is not allowed, to avoid outputting invalid character data. Decrypting originally textual data with pgp_sym_decrypt_bytea is fine.
The options parameter can contain option settings as described below.
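A minimal round trip through the two symmetric functions, assuming pgcrypto is installed. The plaintext and passphrase are placeholder values.

```sql
-- Encrypt with a passphrase, then decrypt with the same passphrase.
SELECT pgp_sym_decrypt(
           pgp_sym_encrypt('my secret', 'passphrase'),
           'passphrase'
       ) AS plaintext;   -- my secret
```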
3.3. pgp_pub_encrypt()
pgp_pub_encrypt(data text, key bytea [, options text ]) returns bytea
pgp_pub_encrypt_bytea(data bytea, key bytea [, options text ]) returns bytea
Encrypts data using a public PGP key key. Supplying a private key to this function will produce an error. The options parameter can contain option settings as described below.
3.4. pgp_pub_decrypt()
pgp_pub_decrypt(msg bytea, key bytea [, psw text [, options text ]]) returns text
pgp_pub_decrypt_bytea(msg bytea, key bytea [, psw text [, options text ]]) returns bytea
Decrypts a public-key encrypted message. key must be the private key corresponding to the public key used for encryption. If the private key is password-protected, you must provide the password in psw. If there is no password but you want to specify options, you need to provide an empty password.
Decrypting bytea data with pgp_pub_decrypt is not allowed, to avoid outputting invalid character data. Decrypting originally textual data with pgp_pub_decrypt_bytea is fine.
The options parameter can contain option settings as described below.
3.5. pgp_key_id()
pgp_key_id(bytea) returns text
pgp_key_id extracts the key ID of a PGP public or private key. If given an encrypted message instead, it returns the ID of the key that was used to encrypt the data.
It can return 2 special key IDs:
• SYMKEY
The message was encrypted with a symmetric key.
• ANYKEY
The message was encrypted with a public key, but the key ID has been removed. This means you will need to try all your keys to see which one can decrypt the message. pgcrypto itself does not produce such messages.
Note: Different keys may have the same ID. This is rare but a normal occurrence. Client applications should try to decrypt with each one to see which fits — the same as handling ANYKEY.
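For example, a message produced by pgp_sym_encrypt() reports the special ID SYMKEY. A minimal sketch, assuming pgcrypto is installed and using placeholder values:

```sql
-- A symmetrically encrypted message has no public key ID.
SELECT pgp_key_id(pgp_sym_encrypt('my secret', 'passphrase')) AS key_id;  -- SYMKEY
```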
3.6. armor(), dearmor()
armor(data bytea [ , keys text[], values text[] ]) returns text
dearmor(data text) returns bytea
These functions wrap/unwrap binary data into PGP ASCII-armor format, which is essentially Base64 with CRC and additional formatting.
If keys and values arrays are specified, an armor header is added for each key/value pair to the armored format. Both arrays must be single-dimensional and of the same length. Keys and values cannot contain any non-ASCII characters.
3.7. pgp_armor_headers
pgp_armor_headers(data text, key out text, value out text) returns setof record
pgp_armor_headers() extracts armor headers from data. The return value is a set of rows with two columns: key and value. If keys or values contain any non-ASCII characters, they are treated as UTF-8.
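The armor functions can be exercised together as follows. This is a sketch assuming pgcrypto is installed; the payload, the header name Comment, and its value are illustrative.

```sql
-- Wrap binary data in ASCII armor with one header, then read the header back.
WITH a AS (
    SELECT armor('binary payload'::bytea,
                 ARRAY['Comment'], ARRAY['example header']) AS armored
)
SELECT h.key, h.value
FROM a, pgp_armor_headers(a.armored) AS h;
-- key = Comment, value = example header

-- dearmor() restores the original bytes.
SELECT dearmor(armor('binary payload'::bytea)) = 'binary payload'::bytea AS round_trip;  -- true
```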
3.8. Options for PGP Functions
Options are named similarly to GnuPG. An option's value should be given after an equals sign, with options separated by commas. For example:
pgp_sym_encrypt(data, psw, 'compress-algo=1, cipher-algo=aes256')
All of these options except convert-crlf apply only to encryption functions. Decryption functions get these parameters from the PGP data.
The most interesting options are likely compress-algo and unicode-mode. The rest should work with reasonable defaults.
3.8.1. cipher-algo
Which cipher algorithm to use.
Values: bf, aes128, aes192, aes256 (OpenSSL only: 3des, cast5)
Default: aes128
Applies to: pgp_sym_encrypt, pgp_pub_encrypt
3.8.2. compress-algo
Which compression algorithm to use. Only available when Halo was compiled with zlib.
Values:
0 - no compression
1 - ZIP compression
2 - ZLIB compression (= ZIP plus metadata and block CRC)
Default: 0
Applies to: pgp_sym_encrypt, pgp_pub_encrypt
3.8.3. compress-level
How much to compress. Higher levels produce smaller output but are slower.
0 disables compression.
Values: 0, 1-9
Default: 6
Applies to: pgp_sym_encrypt, pgp_pub_encrypt
3.8.4. convert-crlf
Whether to convert \n to \r\n when encrypting and \r\n to \n when decrypting.
RFC 4880 specifies that text data should be stored with \r\n line endings.
Using this option provides fully RFC-compliant behavior.
Values: 0, 1
Default: 0
Applies to: pgp_sym_encrypt, pgp_pub_encrypt, pgp_sym_decrypt, pgp_pub_decrypt
3.8.5. disable-mdc
Do not protect data with SHA-1. The only good reason to use this option is for compatibility with antique PGP products that existed before SHA-1-protected packets were added to RFC 4880. Recent gnupg.org and pgp.com software supports it well.
Values: 0, 1
Default: 0
Applies to: pgp_sym_encrypt, pgp_pub_encrypt
3.8.6. sess-key
Use a separate session key. Public-key encryption always uses a separate session key. This option is for symmetric-key encryption, which by default uses the S2K key directly.
Values: 0, 1
Default: 0
Applies to: pgp_sym_encrypt
3.8.7. s2k-mode
Which S2K algorithm to use.
Values:
0 - No salt. Dangerous!
1 - Use salt but with a fixed iteration count.
3 - Variable iteration count.
Default: 3
Applies to: pgp_sym_encrypt
3.8.8. s2k-count
The number of iterations for the S2K algorithm. It must be a value between 1024 and 65011712, inclusive.
Default: A random value between 65536 and 253952
Applies to: pgp_sym_encrypt, only for s2k-mode=3
3.8.9. s2k-digest-algo
Which digest algorithm to use in S2K calculations.
Values: md5, sha1
Default: sha1
Applies to: pgp_sym_encrypt
3.8.10. s2k-cipher-algo
Which cipher to use for encrypting the separate session key.
Values: bf, aes, aes128, aes192, aes256
Default: use cipher-algo
Applies to: pgp_sym_encrypt
3.8.11. unicode-mode
Whether to convert text data between the database's internal encoding and UTF-8. If your database is already UTF-8, no conversion occurs, but the message will be tagged as UTF-8. Without this option, it will not be tagged.
Values: 0, 1
Default: 0
Applies to: pgp_sym_encrypt, pgp_pub_encrypt
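Several of the options described in this section can be combined in one call. The sketch below assumes pgcrypto is installed; the option values are arbitrary examples, and compression is left out because it depends on zlib support.

```sql
-- Encrypt with AES-256, a separate session key, and a fixed S2K iteration count.
-- Decryption needs no options: it reads these settings from the PGP message.
SELECT pgp_sym_decrypt(
           pgp_sym_encrypt('my secret', 'passphrase',
                           'cipher-algo=aes256, sess-key=1, s2k-mode=3, s2k-count=65536'),
           'passphrase'
       ) AS plaintext;   -- my secret

-- convert-crlf is the one option that also applies to decryption.
SELECT pgp_sym_decrypt(
           pgp_sym_encrypt(E'line one\nline two', 'passphrase', 'convert-crlf=1'),
           'passphrase', 'convert-crlf=1'
       ) = E'line one\nline two' AS same_text;   -- true
```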
3.9. Generating PGP Keys with GnuPG
To generate a new key:
gpg --gen-key
The preferred key type is "DSA and Elgamal".
For RSA keys, you must create a DSA or RSA key as the master key for signing only, then add an RSA encryption subkey using gpg --edit-key.
To list keys:
gpg --list-secret-keys
To export a public key in ASCII-armor format:
gpg -a --export KEYID > public.key
To export a private key in ASCII-armor format:
gpg -a --export-secret-keys KEYID > secret.key
Before passing these keys to the PGP functions, you need to apply dearmor() to them. Or, if you can handle binary data, you can remove the -a flag from the commands.
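One way to apply dearmor() to an exported key is sketched below. It assumes pgcrypto is installed and that the armored public key has been loaded into a table. The table pgp_keyring, its columns, and the statement name encrypt_for are hypothetical names introduced for this example.

```sql
-- Hypothetical table holding ASCII-armored keys exported with gpg -a.
CREATE TEMPORARY TABLE pgp_keyring (
    name               text PRIMARY KEY,
    public_key_armored text NOT NULL
);

-- dearmor() converts the armored text to the binary form pgp_pub_encrypt() expects.
PREPARE encrypt_for(text, text) AS
    SELECT pgp_pub_encrypt($2, dearmor(k.public_key_armored)) AS ciphertext
    FROM pgp_keyring AS k
    WHERE k.name = $1;

-- After loading a real key into pgp_keyring:
--   EXECUTE encrypt_for('alice', 'my secret');
```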
3.10. Limitations of the PGP Code
• Signatures are not supported. This also means it does not check whether an encryption subkey belongs to the master key.
• Encryption keys as master keys are not supported. Since this usage is generally discouraged, this should not be a problem.
• Multiple subkeys are not supported. This might seem like a problem since multiple subkeys are commonly needed in practice. On the other hand, you should not use your regular GPG/PGP keys with pgcrypto but rather create new ones, since the use cases are quite different.
4. Raw Encryption Functions
These functions run only a single pass of encryption over the data; they do not have any of the advanced features of PGP encryption. As a result, they have some significant issues:
- They use the user's key directly as the encryption key.
- They do not provide any integrity check to see if the encrypted data has been modified.
- They expect the user to manage all encryption parameters, including the IV.
- They cannot handle text.
Therefore, with the introduction of PGP encryption, the use of raw encryption functions is discouraged.
encrypt(data bytea, key bytea, type text) returns bytea
decrypt(data bytea, key bytea, type text) returns bytea
encrypt_iv(data bytea, key bytea, iv bytea, type text) returns bytea
decrypt_iv(data bytea, key bytea, iv bytea, type text) returns bytea
Encrypt/decrypt data using the cipher method specified by type. The syntax of the type string is:
algorithm [ - mode ] [ /pad: padding ]
where algorithm is one of:
• bf — Blowfish
• aes — AES (Rijndael-128, -192, or -256)
and mode is one of:
• cbc — next block depends on previous (default)
• ecb — each block is encrypted independently (only for testing)
and padding is one of:
• pkcs — data can be any length (default)
• none — data must be a multiple of the cipher block size
So, for example, these are equivalent:
encrypt(data, 'fooz', 'bf')
encrypt(data, 'fooz', 'bf-cbc/pad:pkcs')
In encrypt_iv and decrypt_iv, the iv parameter is the initial value for CBC mode; ECB ignores it. If it is not exactly the block size, it is trimmed or zero-padded. In functions without this parameter, its value defaults to all zeros.
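If the raw functions must be used despite the caveats above, the caller supplies the key and IV explicitly. This is a minimal sketch, assuming pgcrypto is installed; the 32-byte key length (AES-256), the random IV, and the column names enc_key and enc_iv are choices made for this example.

```sql
-- Generate a random 256-bit AES key and a 16-byte IV (the AES block size),
-- then encrypt and decrypt with the same parameters.
WITH p AS (
    SELECT gen_random_bytes(32) AS enc_key,
           gen_random_bytes(16) AS enc_iv
)
SELECT convert_from(
           decrypt_iv(
               encrypt_iv(convert_to('my secret', 'UTF8'),
                          p.enc_key, p.enc_iv, 'aes-cbc/pad:pkcs'),
               p.enc_key, p.enc_iv, 'aes-cbc/pad:pkcs'),
           'UTF8') AS plaintext   -- my secret
FROM p;
```

Since these functions operate on bytea only, text is converted with convert_to() before encryption and with convert_from() after decryption.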
5. Random Data Functions
gen_random_bytes(count integer) returns bytea
Returns count cryptographically strong random bytes. At most 1024 bytes can be extracted at a time. This is to avoid depleting the random number generator pool.
gen_random_uuid() returns uuid
Returns a version 4 (random) UUID.
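Both functions can be called directly, as in this short sketch (assuming pgcrypto is installed). The 16-byte length and the column names are arbitrary.

```sql
-- 16 random bytes shown as hex; the value differs on every call.
SELECT encode(gen_random_bytes(16), 'hex') AS token;

-- A random (version 4) UUID.
SELECT gen_random_uuid() AS id;
```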
6. Notes
6.1. Configuration
pgcrypto configures itself according to the main Halo configure script. The options that affect it are --with-zlib and --with-openssl.
When compiled with zlib, the PGP encryption functions can compress data before encrypting.
When compiled with OpenSSL, more algorithms are available. Public-key encryption functions will also be faster because OpenSSL has more optimized BIGNUM functions.
Table C.19. Feature Summary With and Without OpenSSL
| Feature | Built-in | With OpenSSL |
|---|---|---|
| MD5 | yes | yes |
| SHA1 | yes | yes |
| SHA224/256/384/512 | yes | yes |
| Other digest algorithms | no | yes (note 1) |
| Blowfish | yes | yes |
| AES | yes | yes |
| DES/3DES/CAST5 | no | yes |
| Raw encryption | yes | yes |
| PGP symmetric encryption | yes | yes |
| PGP public-key encryption | yes | yes |
Note:
- Any digest algorithm that OpenSSL supports is automatically available. This is not possible for ciphers, which need to be explicitly supported.
6.2. NULL Handling
All functions return NULL whenever any argument is NULL. This may pose security risks when used improperly.
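The sketch below shows how a NULL argument propagates and one possible way to guard a password check against it. It assumes pgcrypto is installed. The sample password and the use of coalesce() are illustrative choices, not a prescribed pattern.

```sql
-- A NULL argument makes the whole expression NULL, not false.
SELECT crypt(NULL, gen_salt('bf')) IS NULL AS result_is_null;   -- true

-- In a password check, map NULL to false explicitly so that a missing
-- password or hash can never be mistaken for a match.
SELECT coalesce(stored.pswhash = crypt(NULL, stored.pswhash), false) AS pswmatch   -- false
FROM (SELECT crypt('correct horse', gen_salt('bf')) AS pswhash) AS stored;
```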
6.3. Security Limitations
All pgcrypto functions run inside the database server. This means that all data and passwords moving between pgcrypto and the client application are in plaintext. Therefore, you must:
- Connect locally or use SSL connections.
- Trust the system administrator and database administrator.
If you cannot do this, it is better to perform encryption in the client application.
The implementation is not resistant to side-channel attacks. For example, the time required for a pgcrypto decryption function to complete varies among ciphertexts of a given size.