PostgreSQL #7 - Tuning Part 1 - Where to start?
It's been a while since my last post (to be accurate, half a year), but I'm back and I have a lot of things to talk about. In the next few posts I've decided to focus on tuning. Performance issues are something that everyone has encountered in some way: a very slow application, a query that runs for a long time, or even a very slow server.
In the next chapters I will focus on some advanced topics. If you are asking yourself some of the following questions, then this post and the next ones might be useful for you:
-Where should we start?
-Is it a database issue or an application issue?
-On what part of the database should we focus?
-What is an execution plan and how can I understand it?
-What are indexes and when should we use them? What types of indexes are there?
-Why doesn't the database use an index that I created?
-Does the order of the columns in an index have any meaning?
-What parameters should we change?
-Why is the table so huge?
-Why does the planner decide to use this plan? How can I force it to use another?
-Why does this query take so much time?
-What are locks? Deadlocks? How should we handle them?
In this part I'm going to explain how I think we should start investigating a performance issue in our database. The first step is gathering data: we should investigate and learn the database's behavior and its content.
When I access a database for the first time, the first place I look is the log directory of the database. If you didn't change the log_directory parameter, its default value is 'pg_log' and the full path should be /var/lib/pgsql/data/pg_log/. In that directory you should see logs from the last 7 days (unless you changed the rotation settings - see the documentation). Those logs can be super useful for investigating performance issues. However, by default these logs don't contain a lot of interesting data. Therefore, I recommend changing the following parameters (collected in the configuration snippet after this list):
1)log_min_duration_statement=200
The database will log every query that took more than 200 milliseconds to finish. If you think that 200 ms is a very low threshold, think about a query that runs very often. Sometimes queries that run very often are the root cause behind a weird performance problem.
2)log_checkpoints
Postgres will write a message in the log files every time a checkpoint occurs. It is important to know how often data is written to our data files; if checkpoints happen very often, that means a lot of I/O operations.
3)log_autovacuum_min_duration=0
Setting this parameter to 0 logs every autovacuum run, which allows us to keep track of the autovacuum process.
4)log_temp_files=1024
Connections that need more memory than they have available for their queries will spill to temporary files on disk. Every temporary file bigger than 1024 KB will be logged in the database's log files.
5)If downtime is possible at this moment, I recommend installing the extension pg_stat_statements. Connect to one of your databases and run: create extension pg_stat_statements; . You can read about it in the documentation, but basically it tracks and gathers statistics on SQL statements. In addition, you need to set the following two parameters in the postgresql.conf file:
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 32764
track_activity_query_size - "Specifies the number of bytes reserved to track the currently executing command for each active session, for the pg_stat_activity.query field. The default value is 1024. This parameter can only be set at server start." (copied from the documentation). I set this value pretty high in order to keep track of the query in case Hibernate is used. Some applications use Hibernate as their ORM, and in Hibernate the queries can be very long (they specify all columns instead of *, and a lot of parameters might be used in one query).
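For convenience, here is what the relevant block of postgresql.conf might look like after these changes - a sketch using the values discussed above; adjust the thresholds to your own workload:

# log every statement slower than 200 ms
log_min_duration_statement = 200
# log every checkpoint
log_checkpoints = on
# log every autovacuum run
log_autovacuum_min_duration = 0
# log temporary files bigger than 1024 KB
log_temp_files = 1024
# load pg_stat_statements (requires a restart)
shared_preload_libraries = 'pg_stat_statements'
# keep long queries (e.g. from Hibernate) untruncated in pg_stat_activity
track_activity_query_size = 32764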
After all those changes, you should reload/restart your database (a restart is required if you are planning to use pg_stat_statements, since shared_preload_libraries only takes effect at server start):
service postgresql reload/restart
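If you prefer to reload the configuration from within the database instead of from the shell, there is a built-in function for that (note: this does not replace the restart needed for shared_preload_libraries):

select pg_reload_conf();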
Let the database work for a while and fill those logs for you. If you can let it run for a day or two, that would be great: this time will allow the pg_stat_statements extension to gather more data, and then we can get a better picture of the queries. Either way, there is still a lot of data that we can gather.
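Once pg_stat_statements has collected some data, a query like the following can show the statements that consumed the most time overall (a sketch; note that in PostgreSQL 13 and later the total_time column was renamed total_exec_time):

select query, calls, total_time, rows from pg_stat_statements order by total_time desc limit 10;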
Run the following query to see the databases in the cluster and their sizes:
select datname,pg_size_pretty(pg_database_size(datname)) as db_size from pg_database;
In my case, mariel_db is the biggest, so it's worth focusing on it first - but that doesn't necessarily mean we have performance issues there.
In the biggest, main, or most active databases we can check for the biggest relations (I recommend running it on all your databases):
select relname,
       pg_relation_size(oid)/1024/1024 as table_size_MB,
       (pg_total_relation_size(oid)-pg_relation_size(oid))/1024/1024 as indexes_size_MB
from pg_class
where relkind != 'i'
order by table_size_MB desc;
Try to see the structure of the biggest tables (\d). The purpose here is to learn the database's objects and to see which objects can potentially cause performance issues. Check if there are foreign keys referencing those tables, indexes, constraints and so on.
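For example, foreign keys that reference one of the big tables can be found in pg_constraint (a sketch; replace the hypothetical my_big_table with one of your own tables):

select conname as constraint_name,
       conrelid::regclass as referencing_table
from pg_constraint
where confrelid = 'my_big_table'::regclass  -- hypothetical table name
  and contype = 'f';                        -- 'f' = foreign key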
The next thing we can check is the active sessions. The view that will help you here is pg_stat_activity. You can monitor it for a few minutes (with a script or manually) and see how many sessions are open to the database and how many of them are active:
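A simple way to get that picture is to group the sessions by their state (a sketch; the state column exists since PostgreSQL 9.2):

select state, count(*) from pg_stat_activity group by state;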
In addition, you can check which queries are waiting for something, or whether there are long-running queries:
select * from pg_stat_activity where wait_event is not null;
select * from pg_stat_activity where now()-query_start > interval '2 minutes';
Another thing you can check is whether any sessions are waiting for locks in the database:
select * from pg_locks where not granted;
(in my case I didn't have any...)
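If you do find ungranted locks, PostgreSQL 9.6 and later can tell you directly which sessions are being blocked and by whom (a sketch):

select pid,
       pg_blocking_pids(pid) as blocked_by,
       query
from pg_stat_activity
where cardinality(pg_blocking_pids(pid)) > 0;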
The last thing I recommend checking is the statistics gathered for the biggest tables in your database. The pg_stat_all_tables view contains a lot of data (statistics) on tables. In other words, you can see how the database sees the tables and try to understand some of the planner's decisions. The following query might be useful for you:
select *
from pg_stat_all_tables as stats,
     (select relname,
             pg_relation_size(oid)/1024/1024 as table_size_MB,
             (pg_total_relation_size(oid)-pg_relation_size(oid))/1024/1024 as indexes_size_MB
      from pg_class
      where relkind != 'i'
      order by table_size_MB desc) as big_tables
where stats.relname = big_tables.relname;
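If the full rows are too noisy, a few columns are usually the most telling - sequential vs. index scans, dead tuples, and the last autovacuum time (a sketch):

select relname,
       seq_scan,
       idx_scan,
       n_live_tup,
       n_dead_tup,
       last_autovacuum
from pg_stat_all_tables
order by n_dead_tup desc
limit 10;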
To sum up, I didn't focus in this post on any advanced topics, common performance issues or anything like that. The purpose of this post was to show you how to approach, for the first time, a database that has performance issues. In the next posts I will focus on more advanced topics like indexes, joins and memory parameters. If you have any questions, feel free to send them to my email or comment on this post.