Version: 1.0.16

pg_stat_statements

The pg_stat_statements module provides a way to track planning and execution statistics for all SQL statements executed by the server.

This module must be loaded by adding pg_stat_statements to shared_preload_libraries in postgresql.conf, because it requires additional shared memory. This means that adding or removing the module requires a server restart.

When pg_stat_statements is loaded, it tracks statistics across all databases of the server.

The module provides a view pg_stat_statements and functions pg_stat_statements_reset and pg_stat_statements for accessing and manipulating these statistics. These views and functions are not globally available, but can be enabled for a specific database using CREATE EXTENSION pg_stat_statements.
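As a minimal sketch of the setup described above, the following statements check that the library is preloaded and then make the view and functions available in the current database. It assumes a session with sufficient privileges to create extensions, and that the server has already been restarted with the module listed in shared_preload_libraries.

```sql
-- Confirm that the module is preloaded (takes effect only after a server restart).
SHOW shared_preload_libraries;

-- Make the view and functions available in the current database.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Quick check that the view can be queried.
SELECT count(*) AS tracked_statements FROM pg_stat_statements;
```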

1. pg_stat_statements View

The statistics collected by this module can be accessed through a view called pg_stat_statements. This view contains one row for each distinct combination of database ID, user ID, and query ID (up to the maximum number of distinct statements the module can track). The columns of the view are shown in Table C.21.

Table C.21. pg_stat_statements Columns

Column Type/Description
userid oid (references pg_authid.oid) — OID of the user who executed the statement
dbid oid (references pg_database.oid) — OID of the database in which the statement was executed
queryid bigint — Internal hash code, computed from the statement's parse tree
query text — Text of the statement
plans bigint — Number of times the statement was planned (if pg_stat_statements.track_planning is enabled, otherwise zero)
total_plan_time double precision — Total time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero)
min_plan_time double precision — Minimum time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero)
max_plan_time double precision — Maximum time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero)
mean_plan_time double precision — Mean time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero)
stddev_plan_time double precision — Population standard deviation of time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero)
calls bigint — Number of times the statement was executed
total_exec_time double precision — Total time spent executing the statement, in milliseconds
min_exec_time double precision — Minimum time spent executing the statement, in milliseconds
max_exec_time double precision — Maximum time spent executing the statement, in milliseconds
mean_exec_time double precision — Mean time spent executing the statement, in milliseconds
stddev_exec_time double precision — Population standard deviation of time spent executing the statement, in milliseconds
rows bigint — Total number of rows retrieved or affected by the statement
shared_blks_hit bigint — Total number of shared block cache hits by the statement
shared_blks_read bigint — Total number of shared blocks read by the statement
shared_blks_dirtied bigint — Total number of shared blocks dirtied by the statement
shared_blks_written bigint — Total number of shared blocks written by the statement
local_blks_hit bigint — Total number of local block cache hits by the statement
local_blks_read bigint — Total number of local blocks read by the statement
local_blks_dirtied bigint — Total number of local blocks dirtied by the statement
local_blks_written bigint — Total number of local blocks written by the statement
temp_blks_read bigint — Total number of temporary blocks read by the statement
temp_blks_written bigint — Total number of temporary blocks written by the statement
blk_read_time double precision — Total time spent reading blocks by the statement, in milliseconds (if track_io_timing is enabled, otherwise zero)
blk_write_time double precision — Total time spent writing blocks by the statement, in milliseconds (if track_io_timing is enabled, otherwise zero)
wal_records bigint — Total number of WAL records generated by the statement
wal_fpi bigint — Total number of WAL full page images generated by the statement
wal_bytes numeric — Total number of WAL bytes generated by the statement
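
To illustrate how these columns might be combined, here is a sketch of a query that lists the statements with the highest mean execution time, together with their I/O time and WAL volume. The column aliases (io_ms) and the LIMIT of 10 are arbitrary choices for this example. The I/O columns are zero unless track_io_timing is enabled.

```sql
-- Statements with the highest mean execution time, with I/O and WAL figures.
SELECT queryid,
       calls,
       mean_exec_time,
       stddev_exec_time,
       blk_read_time + blk_write_time AS io_ms,  -- zero unless track_io_timing is on
       wal_bytes
  FROM pg_stat_statements
 ORDER BY mean_exec_time DESC
 LIMIT 10;
```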

For security reasons, only superusers and members of the pg_read_all_stats role are allowed to see the SQL text or queryid of queries executed by other users. However, if the view is installed in a database accessible to other users, they can see the statistics.
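
As a sketch of how a non-superuser could be allowed to see query text and queryid values for all users, membership in pg_read_all_stats can be granted. The role name monitoring_role is hypothetical and is assumed to exist already.

```sql
-- monitoring_role is a hypothetical, pre-existing role used only for illustration.
GRANT pg_read_all_stats TO monitoring_role;
```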

Whenever plannable queries (i.e., SELECT, INSERT, UPDATE, and DELETE) have the same query structure according to an internal hash calculation, they are grouped into a single pg_stat_statements entry. Typically, two queries are considered the same for this purpose if they are semantically equivalent except for the values of literal constants appearing in the query.

However, utility commands (i.e., all other commands) are compared strictly on the basis of their textual query strings.

When matching a query against other queries, constant values are ignored, and in the pg_stat_statements display they are replaced by a parameter symbol such as $1. The remaining query text is that of the first query that has the specific queryid hash value associated with the pg_stat_statements entry.
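
The grouping described above can be observed with a small experiment. The sketch below assumes the extension is installed in the current database and that the session is allowed to call pg_stat_statements_reset. The two SELECT statements differ only in a literal constant, so they would be expected to share one entry.

```sql
-- Start from a clean slate (discards all collected statistics).
SELECT pg_stat_statements_reset();

-- Two statements that differ only in a literal constant.
SELECT count(*) FROM pg_class WHERE relname = 'pg_class';
SELECT count(*) FROM pg_class WHERE relname = 'pg_type';

-- Expected: a single row whose text shows $1 in place of the constant,
-- with calls counting both executions.
SELECT query, calls
  FROM pg_stat_statements
 WHERE query LIKE 'SELECT count(*) FROM pg_class WHERE relname%';
```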

In some cases, queries with obviously different text may be merged into a single pg_stat_statements entry. Typically this only happens with semantically equivalent queries, but there is also a small chance that unrelated queries may be merged into an entry due to hash collisions (however, this does not happen for queries belonging to different users or databases).

Since the queryid hash value is computed from the post-parse-analysis representation of the query, the opposite can also be true: if textually identical queries have different meanings due to factors such as different search_path settings, they may appear as separate entries.
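
A sketch of the search_path case follows. The schema and table names (demo_a, demo_b, t) are hypothetical, and the example assumes the extension is installed in a schema that is on the default search_path, such as public.

```sql
-- Two tables with the same name in different (hypothetical) schemas.
CREATE SCHEMA demo_a;
CREATE SCHEMA demo_b;
CREATE TABLE demo_a.t (id int);
CREATE TABLE demo_b.t (id int);

-- The same query text resolves to a different table each time.
SET search_path = demo_a;
SELECT count(*) FROM t;
SET search_path = demo_b;
SELECT count(*) FROM t;
RESET search_path;

-- Expected: two rows with identical text but different queryid values.
SELECT queryid, calls, query
  FROM pg_stat_statements
 WHERE query = 'SELECT count(*) FROM t';
```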

Users of pg_stat_statements may wish to use queryid (perhaps combined with dbid and userid) as a more stable and reliable identifier for an entry than the query text. However, it is important to note that there are only limited guarantees regarding the stability of the queryid hash value. Because this identifier is derived from the post-parse-analysis tree, its value is a function of, among other things, the internal object identifiers appearing in this representation. This has some counterintuitive implications. For example, pg_stat_statements will consider two apparently identical queries to be distinct if they reference a table that was dropped and recreated between their executions. The hashing is also sensitive to machine architecture and other platform differences.

Two servers participating in replication based on physical WAL replay can be expected to produce the same queryid values for the same query. However, logical replication schemes do not promise to keep replicas identical in all relevant details, so queryid is not a useful identifier for accumulating costs across a set of logical replicas.

The parameter symbols used to replace constants in the representative query text start from the next number after the highest $n parameter in the original query text, or $1 if there are none. Note that in some cases, hidden parameter symbols may exist that affect the numbering.
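
The numbering rule can be seen with a prepared statement that already uses $1. In the sketch below, the statement name pgss_demo is hypothetical. The literal 10 would be expected to appear as $2 in the stored text, because $1 is already taken by the statement's own parameter.

```sql
-- The original text already contains $1, so the constant 10 should be shown as $2.
PREPARE pgss_demo (int) AS SELECT $1 + 10;
EXECUTE pgss_demo(1);

-- Inspect the representative text recorded for the prepared query.
SELECT query, calls
  FROM pg_stat_statements
 WHERE query LIKE 'PREPARE pgss_demo%';

DEALLOCATE pgss_demo;
```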

The representative query text is stored in an external disk file and does not consume shared memory. Therefore, even very long query texts can be stored successfully. However, if many long query texts accumulate, the external file can grow very large. As a recovery measure, if this happens, pg_stat_statements may choose to discard the query texts, in which case all existing entries in the pg_stat_statements view will show a null query column, but the statistics associated with each queryid will be preserved. If this occurs, consider reducing pg_stat_statements.max to prevent recurrence.

plans and calls do not always match, because planning and execution statistics are updated at their respective completion stages and only apply to successful operations. For example, if a statement is planned successfully but fails during the execution phase, only its planning statistics will be updated. If planning is skipped because a cached plan is used, only its execution statistics will be updated.
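
One way to look for such differences is sketched below. It is only meaningful while pg_stat_statements.track_planning is enabled, since plans is otherwise zero for every entry. The ordering and the LIMIT are arbitrary choices for this example.

```sql
-- Entries whose planning and execution counts differ, largest gap first.
SELECT queryid, plans, calls, query
  FROM pg_stat_statements
 WHERE plans <> calls
 ORDER BY abs(plans - calls) DESC
 LIMIT 10;
```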

2. Functions

pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint) returns void

pg_stat_statements_reset discards the statistics collected so far by pg_stat_statements corresponding to the specified userid, dbid, and queryid. If any parameter is not specified, the default value 0 (invalid) is used for that parameter, and statistics matching the other parameters will be reset. If no parameters are specified, or all specified parameters are 0 (invalid), it will discard all statistics. By default, this function can only be executed by superusers. Access can be granted to others using GRANT.
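
The following sketch shows a few ways the parameters might be combined. The role name monitoring_role is hypothetical and is assumed to exist. The other calls rely only on the signature given above.

```sql
-- Reset statistics for the current database only (any user, any query).
SELECT pg_stat_statements_reset(
         0,
         (SELECT oid FROM pg_database WHERE datname = current_database()),
         0);

-- Reset statistics for statements run by the current user, in all databases.
SELECT pg_stat_statements_reset(
         (SELECT oid FROM pg_roles WHERE rolname = current_user),
         0,
         0);

-- Allow a hypothetical non-superuser role to call the function.
GRANT EXECUTE ON FUNCTION pg_stat_statements_reset(oid, oid, bigint)
  TO monitoring_role;
```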

pg_stat_statements(showtext boolean) returns setof record

The pg_stat_statements view is defined in terms of a function also called pg_stat_statements. Clients can call the pg_stat_statements function directly and specify showtext := false to omit the query text (i.e., the OUT parameter corresponding to the view's query column will return null values). This feature is designed to support external tools that do not wish to repeatedly receive variable-length query text. Such tools can instead cache the first observed query text on their own, since that is all that pg_stat_statements itself does, and only retrieve the query text when needed. Since the server stores query text in a file, this approach can reduce the physical I/O of repeatedly examining pg_stat_statements data.
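
A monitoring tool that polls frequently might use the function form as sketched here. The selected columns and the ordering are arbitrary choices for this example.

```sql
-- Poll the counters without fetching query text; the query column is returned as null.
SELECT userid, dbid, queryid, calls, total_exec_time
  FROM pg_stat_statements(showtext := false)
 ORDER BY total_exec_time DESC;

-- Fetch the text only for a queryid that has not been seen before.
-- Replace 0 with the queryid of interest.
SELECT query
  FROM pg_stat_statements
 WHERE queryid = 0;
```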

3. Configuration Parameters

pg_stat_statements.max (integer)

pg_stat_statements.max is the maximum number of statements tracked by the module (i.e., the maximum number of rows in the pg_stat_statements view). If the number of distinct statements observed exceeds this number, information about the least-executed statements will be discarded. The default value is 5000. This parameter can only be set at server startup.
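
To judge whether the limit is close to being reached, the number of entries can be compared with the configured maximum. This is a sketch only. The column aliases are arbitrary, and the count covers all databases and users tracked by the server.

```sql
-- Compare the number of tracked entries with the configured maximum.
SELECT count(*) AS entries,
       current_setting('pg_stat_statements.max')::int AS max_entries
  FROM pg_stat_statements;
```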

pg_stat_statements.track (enum)

pg_stat_statements.track controls which statements are counted by the module. Specify top to track top-level statements (those issued directly by clients), all to also track nested statements (such as statements called within functions), or none to disable statement statistics collection. The default value is top. Only superusers can change this setting.

pg_stat_statements.track_utility (boolean)

pg_stat_statements.track_utility controls whether the module tracks utility commands. Utility commands are all commands other than SELECT, INSERT, UPDATE, and DELETE. The default value is on. Only superusers can change this setting.

pg_stat_statements.track_planning (boolean)

pg_stat_statements.track_planning controls whether the module tracks planning operations and duration. Enabling this parameter may cause noticeable performance overhead, especially when a small number of distinct queries are executed across many concurrent connections. The default value is off. Only superusers can change this setting.

pg_stat_statements.save (boolean)

pg_stat_statements.save specifies whether statement statistics should be preserved after the server shuts down. If set to off, statistics are not saved on shutdown and are not reloaded when the server starts up. The default value is on. This parameter can only be set in the postgresql.conf file or on the server command line.

This module requires additional shared memory proportional to pg_stat_statements.max. Note that this memory is consumed whenever the module is loaded, even if pg_stat_statements.track is set to none.

These parameters must be set in postgresql.conf. A typical usage might be:

# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'

pg_stat_statements.max = 10000
pg_stat_statements.track = all
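
Once the server has been restarted with the module loaded, the effective values can be checked from SQL. The sketch below also changes one of the superuser-settable parameters for the current session only, which assumes a superuser connection.

```sql
-- List the module's parameters, their current values, and when each can be changed.
SELECT name, setting, context
  FROM pg_settings
 WHERE name LIKE 'pg_stat_statements.%'
 ORDER BY name;

-- Enable planning statistics for this session only (superuser required).
SET pg_stat_statements.track_planning = on;
```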

4. Sample Output

bench=# SELECT pg_stat_statements_reset();

$ pgbench -i bench
$ pgbench -c10 -t300 bench

bench=# \x
bench=# SELECT query, calls, total_exec_time, rows, 100.0 * shared_blks_hit /
               nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
          FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5;
-[ RECORD 1 ]---+--------------------------------------------------
query           | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2
calls           | 3000
total_exec_time | 25565.855387
rows            | 3000
hit_percent     | 100.0000000000000000
-[ RECORD 2 ]---+--------------------------------------------------
query           | UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2
calls           | 3000
total_exec_time | 20756.669379
rows            | 3000
hit_percent     | 100.0000000000000000
-[ RECORD 3 ]---+--------------------------------------------------
query           | copy pgbench_accounts from stdin
calls           | 1
total_exec_time | 291.865911
rows            | 100000
hit_percent     | 100.0000000000000000
-[ RECORD 4 ]---+--------------------------------------------------
query           | UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2
calls           | 3000
total_exec_time | 271.232977
rows            | 3000
hit_percent     | 98.8454011741682975
-[ RECORD 5 ]---+--------------------------------------------------
query           | alter table pgbench_accounts add primary key (aid)
calls           | 1
total_exec_time | 160.588563
rows            | 0
hit_percent     | 100.0000000000000000

bench=# SELECT pg_stat_statements_reset(0,0,s.queryid) FROM pg_stat_statements AS s
            WHERE s.query = 'UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2';

bench=# SELECT query, calls, total_exec_time, rows, 100.0 * shared_blks_hit /
               nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
          FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5;
-[ RECORD 1 ]---+--------------------------------------------------
query           | UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2
calls           | 3000
total_exec_time | 20756.669379
rows            | 3000
hit_percent     | 100.0000000000000000
-[ RECORD 2 ]---+--------------------------------------------------
query           | copy pgbench_accounts from stdin
calls           | 1
total_exec_time | 291.865911
rows            | 100000
hit_percent     | 100.0000000000000000
-[ RECORD 3 ]---+--------------------------------------------------
query           | UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2
calls           | 3000
total_exec_time | 271.232977
rows            | 3000
hit_percent     | 98.8454011741682975
-[ RECORD 4 ]---+--------------------------------------------------
query           | alter table pgbench_accounts add primary key (aid)
calls           | 1
total_exec_time | 160.588563
rows            | 0
hit_percent     | 100.0000000000000000
-[ RECORD 5 ]---+--------------------------------------------------
query           | vacuum analyze pgbench_accounts
calls           | 1
total_exec_time | 136.448116
rows            | 0
hit_percent     | 99.9201915403032721

bench=# SELECT pg_stat_statements_reset(0,0,0);

bench=# SELECT query, calls, total_exec_time, rows, 100.0 * shared_blks_hit /
               nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
          FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5;
-[ RECORD 1 ]---+--------------------------------------------------
query           | SELECT pg_stat_statements_reset(0,0,0)
calls           | 1
total_exec_time | 0.189497
rows            | 1
hit_percent     |
-[ RECORD 2 ]---+--------------------------------------------------
query           | SELECT query, calls, total_exec_time, rows, $1 * shared_blks_hit /          +
                |                nullif(shared_blks_hit + shared_blks_read, $2) AS hit_percent+
                |           FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT $3
calls           | 0
total_exec_time | 0
rows            | 0
hit_percent     |