PostgreSQL includes a number of system catalog tables that you can use to find out information about your database cluster. I've always referred to them as Data Dictionary tables because that's what they would be called in Oracle and Oracle is the first database system I learned. PostgreSQL also includes a number of stored procedures or functions that combine the information in some of these tables and present the results in a more easily readable form. Monitoring PostgreSQL is done by querying these tables or running these functions. Here are some of the common ones that I have used to help monitor PostgreSQL:
Monitoring Current Database Commands
When things start slowing down and traffic backs up it is good to be able to see what queries or commands are currently running. Perhaps there is a query that really slows everything down or a large number of statements are being executed all at once. Looking at the pg_stat_activity view will show you what is running along with some important information about the statements running. If you describe the pg_stat_activity view using the \d command in psql, you get the following output:View "pg_catalog.pg_stat_activity"
Column | Type | Modifiers
------------------+--------------------------+-----------
datid | oid |
datname | name |
pid | integer |
usesysid | oid |
usename | name |
application_name | text |
client_addr | inet |
client_hostname | text |
client_port | integer |
backend_start | timestamp with time zone |
xact_start | timestamp with time zone |
query_start | timestamp with time zone |
state_change | timestamp with time zone |
waiting | boolean |
state | text |
query | text |
The last column in the view is named query but will also show other running SQL statements like INSERTS or DELETES. Other columns of note include the pid column and is the process id of the statement. This allows you to go out to the operating system and use tools like ps to find more information about a specific statement that is running.
Running the command on a development server with no current traffic gives the following results:
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | clie
nt_port | backend_start | xact_start | query_start
| state_change | waiting | state | query
-------+----------+------+----------+----------+------------------+-------------+-----------------+-----
--------+-------------------------------+-------------------------------+-------------------------------
+------------------------------+---------+--------+---------------------------------
12914 | postgres | 3389 | 16391 | mbennett | psql | | |
-1 | 2014-02-19 09:05:51.377371-07 | 2014-02-19 09:10:28.021888-07 | 2014-02-19 09:10:28.021888-07
| 2014-02-19 09:10:28.02189-07 | f | active | select * from pg_stat_activity;
(1 row)
Listing everything in a single line makes it rather difficult to read but this gives you an idea of what your output will look like. For a larger server running in a high-traffic production environment, you will get a number of rows output and so you may only want to see those columns to help you diagnose problems. They may include pid, application_name, query_start, and query.
Calculating the Size of Your Database
Unix and Linux do not behave very well when you fill the disk. Therefore it is very important to make sure you keep a close eye on how large your database is to ensure you don't run out of space. There is a database function named pg_database_size to help see the size of your database. You just need to run the following query:SELECT pg_database_size('database_name');
PostgreSQL can have multiple databases running on the same server. You can get a list of all database by doing a \l in psql. Be sure to replace the string database_name with the name of your actual database. The result is output in bytes.
Calculating the Size of a Specific Table
Knowing the size of your database is helpful to keep from running out of disk space. Knowing the size of a specific table will help with increasing performance. Once a table grows above a specific size, indexes will help reduce query times. You may also want to consider partitioning for larger tables. Finding the table size can be done using the pg_total_relation_size function and can be done using the follow command:SELECT pg_total_relation_size('table_name');
Be sure to replace table_name with the actual name of the table for which you are interested. Once again the result is in total bytes.
Finding the Largest Tables in Your Database
The pg_total_relation_size function is good if you know which of your tables are large. What do you do if you are new to a project and have not had time to have someone explain which tables are your largest? How do you find that out? There is a view called pg_class that contains that information and finding out the top 10 largest tables is as simple as using the following query:SELECT relname, relpages
FROM pg_class
ORDER BY relpages DESC
LIMIT 10;
Changing the LIMIT clause will allow you to see more or fewer tables. Leaving it off completely shows all of the tables in your database including the system catalog or data dictionary tables and views. The relpages column shows how many database pages are used by each of the tables and so it is a relative size as opposed to the number of bytes. You can use the pg_total_relation_size function to see the number of bytes used by the table, as shown above.
Checking for Uncommitted Transaction Locks
In high-volume transaction environments you there is a chance that you will run into deadlocks. Being able to look at the pg_locks view will help you figure out locking issues. Describing the view with the \d command in psql gives the following:View "pg_catalog.pg_locks"
Column | Type | Modifiers
--------------------+----------+-----------
locktype | text |
database | oid |
relation | oid |
page | integer |
tuple | smallint |
virtualxid | text |
transactionid | xid |
classid | oid |
objid | oid |
objsubid | smallint |
virtualtransaction | text |
pid | integer |
mode | text |
granted | boolean |
fastpath | boolean |
The column we are most interested in viewing is granted. If the value of granted is true, the lock has been granted. If the value is false, the query is waiting for a lock and provides a clue as to what might be causing a problem.
No comments:
Post a Comment