Helpful Admin Queries

You have a running Redshift cluster. You've loaded data and run queries. Now you want to know what's going on inside the cluster. Which queries are running? Which nodes are getting full? Are some queries taking way too long because they're not using the correct sort or dist keys?

The AWS Redshift console is helpful, but it doesn't have all the answers. The system tables in your Redshift cluster have data about actively running and recently completed queries, table sizes and node capacity, and overall performance metrics.

In this section we'll show you how to query Redshift's system tables to answer these kinds of questions.

Redshift-Specific System Tables

It is easy to treat Redshift as a black box — queries go in, answers come out. When something goes wrong, though, you’ll want to open the hood and see what Redshift is actually doing.

To dig into any issues, each Redshift cluster provides virtual system tables you can query. Like Postgres, Redshift has the information_schema and pg_catalog tables, but it also has plenty of Redshift-specific system tables.

All Redshift system tables are prefixed with stl_, stv_, svl_ or svv_.

  • The stl_ prefix denotes system table logs. stl_ tables contain logs about operations that happened on the cluster in the past few days.

  • The stv_ prefix denotes system table snapshots. stv_ tables contain a snapshot of the current state of the cluster.

  • The svl_ prefix denotes system view logs. svl_ views join some number of system tables to provide more descriptive info.

  • The svv_ prefix denotes system view snapshots. Like svl_, the svv_ views join some system tables to provide more descriptive info.

Current Cluster Status

One of the most common reasons to log into the Redshift console is to kill a misbehaving query. To find which queries are currently in progress, check the stv_inflight table.

  , query
  , pid
  , starttime
  , left(text, 50) as text
from stv_inflight

You'll end up with a table like this:

To kill a query, use the cancel <pid> <msg> command. Be sure to use the process id — pid in the table above — and not the query id. You can supply an optional message which will be returned to the issuer of the query and logged.

Redshift also stores the past few days of queries in svl_qlog if you need to go back further. The stv_recents view has all recently queries with their status, duration, and pid for currently-running queries.

All of these tables only store the first 200 characters of each query. The full query is stored in chunks in stl_querytext. Join this table in by query, and sort by query_id and sequence to get each 200 character chunk in order.

from stl_query 
join stl_querytext using (query) 
order by query,sequence 
limit 5;


If your cluster has a suspiciously long-running update, it may be in a deadlocked transaction. The stv_locks table will indicate any transactions that have locks along with the process id of the relevant sessions. This pid can be passed to pg_terminate_backend(pid) to kill the offending session.

To inspect the locks, order them by oldest first.

from stv_locks 
order by last_update asc

To terminate the session, run select pg_terminate_backend(lock_owner_pid), using the value from stl_locks.

Connection Issues

Debugging connection issues is never fun. Luckily, Redshift has a few tables that make up for the lack of a network debugging tool.

The stv_sessions table lists all the current connection, similar to Postgres's pg_stat_activity. While useful, it doesn’t have the actual connection information for host and port. That can be found in stl_connection_log. This table has a list of all connects, authenticates, and disconnects on your cluster. Joining these tables returns a list of sessions and remote host information.

select distinct
  , process
  , user_name
  , '' remotehost
  , remoteport
from stv_sessions
left join stl_connection_log on pid = process
  and starttime > recordtime - interval '1 second'
order by starttime desc

You'll get a table like this:

Query Performance


The stl_alert_event_log table is important for optimizing queries. When the cluster executes your query, it records problems found by the query planner into stl_alert_event_log along with suggested fixes. Some problems can be fixed by running analyze or vacuum , while others might require rewriting the query or changing your schema.


svv_table_info returns extended information about state on disk of your tables. This table can help troubleshoot low-performing tables. While we recommend regular vacuuming and other maintenance, you can also use this table as a guide for when to vacuum.

Here are the column names you'll see in the svv_table_info table:

  • empty shows how many blocks are waiting to be freed by a vacuum.
  • unsorted shows the percent of the table that is unsorted. The cluster will need to scan this entire section for every query. You need to vacuum to re-sort and bring this back to 0.
  • sortkey1_enc lists the encoding of the first sortkey. This can sometimes affect lookup performance.
  • skew_sortkey1 shows the ratio of the size of the first column of the sortkey to the size of the largest non-sortkey column, if a sortkey is defined. You can use this value to evaluate the effectiveness of the sortkey.
  • skew_rows shows the ratio of rows from most on a slice to least on a slice. Use it to evaluate distkey.
  • max_varchar show the size of the largest varchars. While varchars compress well, they can force a temporary result which otherwise fits in ram to be stored on disk, reducing query performance.

For help on changing the sort and dist keys of your giant tables, check out this post.

Copying Tables

If you want to copy or split a table, Redshift supports both create table like, and create table as syntax.

create table like copies the structure, compression, distribution and sortkey. This is great for archiving tables as it keeps the compression settings.

create table as creates a table and fills it with the given query. You can supply optional sort and distkeys. Note that this won’t compress the table, even if the source tables are compressed. create table as is best for small temporary tables, as compression helps with performance for an upfront cost.

create table events_201404 as (
  select * 
  from events 
  where created_at >= '2014-04-01' and created_at < '2014-05-01'

create table events_201404 like events;
insert into events_201404 (
  select * 
  from events 
  where created_at >= '2014-04-01' and created_at < '2014-05-01'

To create a compressed table from a query after using create table as, run analyze compression, create a new table with those encodings, and then copy the data into the new table. Alternatively, unload the data somewhere, and load it back with copy.

Further Reading

For more tips on Redshift maintenance, take a look at our Maintenance Section. We hope this gives you a solid base for troubleshooting Redshift issues!

Next Section: Managing Query Load