Monitoring Amazon Redshift Environments with System Tables

Amazon Redshift offers a comprehensive suite of database monitoring tables and performance views that provide valuable insights into database activity and performance. Database administrators (DBAs) can leverage these resources to monitor various aspects of their Redshift environment, such as query performance, storage usage, workload management (WLM) queue efficiency, and security. This article presents a comprehensive guide to using these tools for effective environment monitoring.

Query Performance Monitoring

Efficient query execution is crucial for any data warehouse. Redshift provides several database monitoring tables and views to help DBAs analyze and optimize query performance.

STL_QUERY 1

This database monitoring table captures execution details for every query run on the Redshift cluster. DBAs can use this table to identify long-running queries, track query execution time, and analyze query patterns. It’s important to note that query ID values in this and other system tables may be reused over time.

Column Name Data Type Description
query integer The unique identifier for the query.
starttime timestamp The time when the query started.
endtime timestamp The time when the query ended.
elapsed bigint The total execution time of the query in milliseconds.
aborted boolean Indicates whether the query was aborted.

Example Queries:

— Find long-running queries that took more than 1 hour to complete
SELECT query, starttime, endtime, elapsed
FROM stl_query
WHERE elapsed > 3600000; — Time in milliseconds

— Find all queries run by a specific user in the last 24 hours
SELECT *
FROM stl_query
WHERE userid = 100 — Replace with the actual user ID
  AND starttime >= CURRENT_TIMESTAMPINTERVAL ‘1 day’;

STL_EXPLAI

This table stores the execution plan for each query. By examining the execution plan, DBAs can identify potential bottlenecks, such as full table scans or nested loop joins, and optimize query performance.

Column Name Data Type Description
query integer The unique identifier for the query.
nodeid integer The identifier for the node in the execution plan.
parent integer The identifier for the parent node in the execution plan.
plannode varchar The type of operation performed by the node (e.g., Seq Scan, Hash Join).

Example Queries:

— Find queries with full table scans
SELECT query, plannode
FROM stl_explain
WHERE plannode LIKE ‘%Seq Scan%’;

— Find queries with nested loop joins
SELECT query, plannode
FROM stl_explain
WHERE plannode LIKE ‘%Nested Loop%’;

SVL_QUERY_SUMMARY

This system view provides a summary of query execution details at the step level. DBAs can use this view to analyze the time taken and resources consumed by each step of a query.

Column Name Data Type Description
query integer The unique identifier for the query.
stm integer The statement number within the query.
seg integer The segment number within the statement.
step integer The step number within the segment.
elapsed numeric(12,3) The elapsed time for the step in milliseconds.

Example Query:

— Analyze the execution time of each step for a specific query
SELECT query, stm, seg, step, elapsed
FROM svl_query_summary
WHERE query = <query_id>
ORDER BY stm, seg, step;


STL_ALERT_EVENT_LOG

This table records alerts generated by the query optimizer when potential performance issues are detected. DBAs can use this table to identify opportunities for query optimization.

Column Name Data Type Description
query integer The unique identifier for the query.
event_time timestamp The time when the alert was generated.
alert_id smallint The identifier for the type of alert.
solution varchar The suggested solution to address the performance issue.

Example Queries:

— Find alerts related to missing statistics
SELECT query, event_time, alert_id, solution
FROM stl_alert_event_log
WHERE solution LIKE ‘%ANALYZE command%’;— Find alerts related to data skew
SELECT query, event_time, alert_id, solution
FROM stl_alert_event_log
WHERE solution LIKE ‘%data skew%’;

stv_exec_state and SVV_TRANSACTIONS

The stv_exec_state system table provides a snapshot of the current execution state of all queries running on the Redshift cluster. DBAs can use this table to monitor the progress of queries and identify any queries that might be stalled or experiencing delays. The SVV_TRANSACTIONS system view provides information about active transactions, including transaction ID, status, and start time. DBAs can use this view to monitor transaction activity and identify any long-running or blocked transactions.

It’s important to note that when querying these and other system tables, DBAs should be aware of the consistency model. System tables are not included in automated backups, and system views do not use the same consistency model as regular tables. This can lead to inconsistencies, especially for STV tables and SVV views.

Example Queries:

— Monitor the progress of all running queries
SELECT query, pid, slice, segment, step, rows
FROM stv_exec_state;

— Identify long-running transactions
SELECT xid, pid, status, start_time
FROM SVV_TRANSACTIONS
WHERE start_time < CURRENT_TIMESTAMPINTERVAL ‘1 hour’;

Storage Monitoring

Efficient storage utilization is essential for managing costs and ensuring optimal query performance. Redshift provides several tools to help DBAs monitor and manage storage.

SVV_DISKUSAGE

This system view provides information about disk space usage for each table in the database. DBAs can use this view to identify tables consuming excessive disk space and optimize storage utilization.

Column Name Data Type Description
database name The name of the database.
schema name The name of the schema.
name name The name of the table.
size bigint The size of the table in MB.
tbl_rows bigint The total number of rows in the table.

Example Queries:

— Find the top 10 largest tables in the database
SELECT name, size, tbl_rows
FROM svv_diskusage
ORDER BY size DESC
LIMIT 10;

— Find tables with more than 50% deleted rows
SELECT name, size, deleted_rows
FROM svv_diskusage
WHERE deleted_rows > (tbl_rows / 2);

SVV_TABLE_INF

This system view provides detailed information about tables, including table size, sort keys, distribution styles, and statistics. DBAs can use this view to identify tables with data skew or unsorted rows, which can impact query performance. It can also be used to identify tables with inefficient sort keys (skew_sortkey1) and outdated statistics (stats_off) .

Column Name Data Type Description
“table” name The name of the table.
size bigint The size of the table in 1 MB data blocks.
encoded varchar(3) Indicates whether any column has defined encoding compression (Y/N).
diststyle varchar(15) The distribution style of the table (e.g., EVEN, KEY).
sortkey1 varchar(128) The first column in the sort key.
sortkey_num smallint The number of columns defined as sort keys.
pct_used real The percentage of available space used by the table.
unsorted real The percentage of unsorted rows in the table.
stats_off integer Number indicating how outdated the table statistics are (0 is up-to-date, 100 is outdated).
tbl_rows bigint The total number of rows in the table.
skew_sortkey1 real Ratio between the size of the largest non-sort key column and the size of the first column in the sort key.
skew_rows real Ratio between the number of rows in the slice with the most rows and the number of rows in the slice with the fewest rows.

Example Queries:

— Find tables with high data skew
SELECT “table”, skew_sortkey1, skew_rows
FROM svv_table_info
WHERE skew_sortkey1 > 10 OR skew_rows > 10;

— Find tables with outdated statistics
SELECT “table”, stats_off
FROM svv_table_info
WHERE stats_off > 10;


— Find tables with inefficient sort keys
SELECT “table”, sortkey1, skew_sortkey1
FROM svv_table_info
WHERE skew_sortkey1 > 5;

Workload Management (WLM) Monitoring

Workload Management (WLM) allows DBAs to define separate queues for different types of queries and control resource allocation. Redshift provides system tables and views to monitor WLM queue performance.

STV_WLM_QUERY_STATE

This system view provides real-time information about the state of queries in WLM queues. DBAs can use this view to monitor queue wait times and identify bottlenecks in query execution.

Column Name Data Type Description
query integer The unique identifier for the query.
service_class smallint The service class (queue) where the query is running.
state varchar(15) The current state of the query (e.g., running, queued).
queue_start_time timestamp The time when the query entered the queue.

Example Queries:

— Find queries waiting in the queue for more than 5 minutes
SELECT query, service_class, state, queue_start_time
FROM stv_wlm_query_state
WHERE state = ‘queued’ AND queue_start_time < CURRENT_TIMESTAMPINTERVAL ‘5 minutes’;

— Find the number of running and queued queries in each service class
SELECT service_class, state, COUNT(*) AS query_count
FROM stv_wlm_query_state
GROUP BY service_class, state;

STL_WLM_QUERY

This system table stores historical information about query execution in WLM queues. DBAs can use this table to analyze queue performance and identify trends in query execution.

Column Name Data Type Description
query integer The unique identifier for the query.
service_class smallint The service class (queue) where the query ran.
queue_start_time timestamp The time when the query entered the queue.
queue_wait_time bigint The time the query spent waiting in the queue (in milliseconds).
execution_time bigint The total execution time of the query (in milliseconds).

Example Queries:

— Analyze the average queue wait time for queries in a specific service class
SELECT service_class, AVG(queue_wait_time) AS avg_queue_wait_time
FROM stl_wlm_query
WHERE service_class = <service_class_id>
GROUP BY service_class;

— Find queries with the longest queue wait times in the last hour
SELECT query, service_class, queue_wait_time
FROM stl_wlm_query
WHERE queue_start_time >= CURRENT_TIMESTAMPINTERVAL ‘1 hour’
ORDER BY queue_wait_time DESC;

Troubleshooting Performance Issues

Performance issues in Redshift can manifest in various ways, such as slow queries, high CPU usage, or excessive disk I/O. DBAs can use the system tables and views discussed earlier to diagnose these issues and identify potential solutions.

Example Scenarios:

  • Slow Queries: If a query is running slower than expected, DBAs can use STL_EXPLAIN to analyze the query plan and identify potential bottlenecks like full table scans or nested loop joins. They can also use SVL_QUERY_SUMMARY to analyze the execution time of each step in the query and pinpoint the steps that are causing delays.
  • High CPU Usage: If the CPU utilization of the Redshift cluster is consistently high, DBAs can use STL_QUERY to identify long-running queries or queries that consume excessive CPU resources. They can then optimize these queries or adjust WLM queue configurations to manage resource allocation.
  • Excessive Disk I/O: If the disk I/O is high, DBAs can use SVV_DISKUSAGE and SVV_TABLE_INFO to identify tables that are consuming excessive disk space or have a high percentage of unsorted rows. They can then optimize table design, run VACUUM operations, or adjust data distribution strategies to improve disk I/O performance.

By combining information from different system tables and views, DBAs can gain a comprehensive understanding of the factors contributing to performance issues and take appropriate actions to resolve them.

Security Monitoring

Maintaining the security of the Redshift environment is paramount. Redshift provides tools to help DBAs monitor user activity and identify potential security threats.

STL_CONNECTION_LOG

This system table logs all connection attempts to the Redshift cluster. DBAs can use this table to monitor user login activity and identify any suspicious or unauthorized access attempts.

Column Name Data Type Description
userid integer The ID of the user who attempted to connect.
recordtime timestamp The time of the connection attempt.
event varchar(30) The type of connection event (e.g., successful, failed).
remotehost varchar(256) The hostname or IP address of the client.

Example Queries:

— Find failed connection attempts
SELECT userid, recordtime, event, remotehost
FROM stl_connection_log
WHERE event = ‘failed’;

— Find connection attempts from suspicious IP addresses
SELECT userid, recordtime, event, remotehost
FROM stl_connection_log
WHERE remotehost IN (‘192.168.1.100’, ‘10.0.0.10’); — Replace with actual suspicious IP addresses

PG_USER

This system catalog table stores information about database users. DBAs can use this table to review user permissions and ensure that users have appropriate access privileges.

Column Name Data Type Description
usename name The name of the user.
usecreatedb boolean Indicates whether the user has permission to create databases.
usesuper boolean Indicates whether the user is a superuser.

Example Query:

— List all users and their permissions
SELECT usename, usecreatedb, usesuper
FROM pg_user;

STL_QUERYTEXT

This system table stores the text of SQL queries executed on the cluster. DBAs can use this table to audit user activity and identify any potentially harmful or unauthorized queries.

Column Name Data Type Description
query integer The unique identifier for the query.
querytxt varchar The text of the SQL query.

Example Queries:

— Find queries that attempt to drop tables
SELECT query, querytxt
FROM stl_querytext
WHERE querytxt LIKE ‘%DROP TABLE%’;

— Find queries that access sensitive data
SELECT query, querytxt
FROM stl_querytext
WHERE querytxt LIKE ‘%customer_pii%’; — Replace with the actual sensitive data identifier

Other System Tables and Views

In addition to the tables and views discussed above, Redshift provides several other system tables and views that can be useful for monitoring and managing the environment.

STL_WLM_ Tables

The system tables with the STL_WLM_ prefix provide detailed information about WLM activities, such as queue configurations, resource usage, and query routing. DBAs can use these tables to analyze WLM performance and troubleshoot any issues related to workload management.

STV_WLM_ Tables

The system tables with the STV_WLM_ prefix provide real-time snapshots of WLM activities, such as the current state of queues and the queries running in each queue. DBAs can use these tables to monitor WLM performance and identify any bottlenecks in real-time.

PG_TABLE_DEF

This system catalog table provides information about the definitions of tables, including column names, data types, and constraints. DBAs can use this table to understand the structure of tables and identify potential issues with table design.

PG_NAMESPACE

This system catalog table stores information about schemas (namespaces) in the database. DBAs can use this table to manage schemas and understand the organization of database objects.

SVV_REDSHIFT_TABLES

This system view provides information about tables in the database, including table type (e.g., table, view) and any associated comments. DBAs can use this view to get an overview of the tables in the database and understand their purpose.

Limitations and Considerations

While system tables and views are valuable resources for monitoring, DBAs should be aware of their limitations:

  • Backups: System tables are not included in automated backups 2.
  • Consistency: System views may not always reflect the most up-to-date information due to the consistency model used 2.
  • Resource Usage: Querying system tables can consume system resources, so it’s important to use appropriate filters and limit the amount of data retrieved.

Synthesis

Amazon Redshift system tables and views offer a powerful mechanism for DBAs to monitor and manage their database environments effectively. By leveraging the information available in these resources, DBAs can gain valuable insights into query performance, storage usage, WLM queue efficiency, and security. Regular monitoring and analysis of these tables can help DBAs proactively identify and address potential issues, ensuring optimal performance and security of their Redshift environments.

DBAs should incorporate system table monitoring into their regular Redshift management practices. This includes:

  • Establishing Baseline Performance: Monitor key metrics over time to understand the typical behavior of the Redshift environment.
  • Setting Alerts: Configure alerts for critical metrics to be notified of potential issues proactively.
  • Analyzing Trends: Regularly analyze historical data from system tables to identify trends and potential areas for optimization.
  • Using Automation: Automate the collection and analysis of data from system tables to streamline monitoring efforts.

By following these recommendations, DBAs can effectively leverage the power of Redshift system tables and views to ensure the health, performance, and security of their data warehouse environments.

Works cited

  1. Redshift STL Views for Logging – DataSunrise, accessed February 20, 2025, https://www.datasunrise.com/knowledge-center/redshift-stl-views-for-logging/
  2. System tables and views reference – Amazon Redshift, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/cm_chap_system-tables.html
  3. Amazon-Redshift Guide For Data Analyst Cluster Management – RudderStack, accessed February 20, 2025, https://www.rudderstack.com/guides/amazon-redshift-guide-for-data-analyst-cluster-management-system-tables-views/
  4. Using the SVL_QUERY_SUMMARY view – Amazon Redshift – AWS Documentation, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/using-SVL-Query-Summary.html
  5. SVV_DISKUSAGE – Amazon Redshift, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_DISKUSAGE.html
  6. SVV_TABLE_INFO – Amazon Redshift – AWS Documentation, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html
  7. Catalog Tables in Redshift: Enhancing Governance of Your Data Warehouse. – Medium, accessed February 20, 2025, https://medium.com/@alice_thomaz/catalog-tables-in-redshift-enhancing-governance-of-your-data-warehouse-ee03daf5bcad
  8. STV_WLM_QUERY_STATE – Amazon Redshift – AWS Documentation, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/r_STV_WLM_QUERY_STATE.html
  9. STL_WLM_QUERY – Amazon Redshift – AWS Documentation, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/r_STL_WLM_QUERY.html
  10. STL_QUERYTEXT – Amazon Redshift, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/r_STL_QUERYTEXT.html
  11. PG_TABLE_DEF – Amazon Redshift, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/r_PG_TABLE_DEF.html
  12. SVV_REDSHIFT_TABLES – Amazon Redshift – AWS Documentation, accessed February 20, 2025, https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_REDSHIFT_TABLES.html
Author: Maninder