PostgreSQL settings for Nuxeo DM

Your PostgreSQL database needs to be configured to work properly with Nuxeo DM.

Note that to make Nuxeo DM work with PostgreSQL you have to:

CREATE LANGUAGE 'plpgsql';

This FAQ will give you some hints to configure your database, please refer to your DBA or the PostgreSQL documentation for more information (http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server).

Of course the first advice is to use the latest stable PostgreSQL 8.3 version. You can use PostgreSQL 8.4 if you have a Nuxeo DM >= 5.3.1 with a VCS backend.

Two phase commit

Nuxeo uses two phase commit and needs to have the max_prepared_transactions settings updated.

This change is mandatory for PostgreSQL 8.4 since it is disabled by default. If you don't change this option you will have javax.transaction.HeuristicMixedException exceptions.

max_prepared_transactions = 64

Shared buffers and system cache

One of the most important thing for PostgreSQL is to have lots of shared buffers along with free memory that can be used by the system cache.

If you plan to use 1g of shared buffers, update the following property in your postgresql.conf file:

shared_buffers = 1GB

The shared memory is dedicated to PostgreSQL and must be available on the system side using sysctl. You need to enable a little bit more at the os level, for instance try 1g + 128m:

sysctl kernel.shmmax=1207959552

Then restart the PostgreSQL, if there is no enough shared memory you will have an explicit error message and you should try with a bigger kernel.shmmax value.

Once PostgreSQL is started the retained shmmax value, should be registered in the /etc/sysctl.conf file by adding the following line.

kernel.shmmax = <SHMMAX_VALUE>

PostgreSQL need to know how much memory the system will use for disk caching, this is used as a hint when executing queries, this memory is not allocated by PostgreSQL.

To set effective_cache_size value you need to run your application once and check how much memory is used by system cache. This can be done using the free command and by summing buffers and chached values. The value is shared_buffers plus the amount of OS cache.

effective_cache_size = 2GB

Memory for workers

Increasing the work_mem parameter allows PostgreSQL to do larger in-memory sorts wich is much faster than disk sorts. Have in mind that work_mem size will be taken by each connection (a pool of 20 connections will take 20 * work_mem).

work_mem = 32MB

Increasing the maintenance_work_mem will speed up the vacuum procedure.

maintenance_work_mem = 128MB

Buffering writes

The default wal_buffers can be increase to improve write access time:

wal_buffers = 8MB

Mass import specific tuning

When doing mass import you can disable the fulltext trigger and fulltext index. They must be dropped after a successful login on a running Nuxeo DM because DDL SQL commands are executed on the first access.

ALTER TABLE fulltext DISABLE TRIGGER nx_trig_ft_update;
DROP INDEX IF EXISTS fulltext_fulltext_idx;

After the import you can update the fulltext column like this:

ALTER TABLE fulltext ENABLE TRIGGER nx_trig_ft_update;
-- Let the trigger update the fulltext column
UPDATE fulltext SET fulltext = ''::TSVECTOR WHERE length(fulltext) is NULL;
CREATE INDEX fulltext_fulltext_idx ON fulltext USING gin (fulltext);

Changing temporary the PostgreSQL configuration during the import provide performance benefits:

checkpoint_segments = 16
full_page_writes = off
fsync = off
synchronous_commit = off

Please refer to the PostgreSQL reference manual http://www.postgresql.org/docs/8.3/interactive/runtime-config-wal.html

Index vs table scan

The ``random_page_cost`` parameter influences this query planner's choice. The default value 4 is too high and can result in a wrong bet on large table. you can lower the cost to 2.

random_page_cost = 2;

Updating the planner statistics

PostgreSQL computes statistics on table content in order to plan for the best performance when executing queries with joins and complex filters. The default configuration in PostgreSQL <= 8.3 is 'default_statistics_target' set to the value 10 which can lead to not accurate enough estimates. In 8.4 this value is now set to 100 by default. To set it to 100 in 8.3 instances, just use:

set default_statistics_target = 100;

On a running instance you can check that settings with:

show default_statistics_target;

And then re-execute 'ANALYZE' to update the statistics.

Vacuumming

The autovacuum is enable by default since PostgreSQL 8.3. Exceptionally a full vacuum can be done at downtime to recover disk space, it should be followed with a reindexdb command.

Monitoring

See the PostgreSQL section of the Monitoring Nuxeo DM FAQ.

Reporting problems

If you have a database configuration problem and you want to fill a jira ticket, there are some information to report:

  • The PostgreSQL server state: is it dedicated or shared, which OS, virtual machine ...
  • How much memory is available on the database server (free -m output)
  • Amount of nuxeo documents and PostgreSQL configuration. Using the "psql" command line tool connect to your nuxeo database and execute the following commands:
\o /tmp/pgconf.txt
\timing
SELECT now(), Version();
SELECT current_database() AS db_name,  pg_size_pretty(pg_database_size(current_database())) AS db_size, pg_size_pretty(SUM(pg_relation_size(indexrelid))::int8) AS index_size FROM pg_index;
SELECT COUNT(*) AS documents_count FROM hierarchy WHERE NOT isproperty;
SELECT primarytype, COUNT(*) AS count FROM hierarchy WHERE NOT isproperty GROUP BY primarytype ORDER BY count DESC;
SELECT COUNT(*) AS hierarchy_count FROM hierarchy;
SELECT COUNT(*) AS aces_count FROM acls;
SELECT COUNT(DISTINCT(id)) AS acls_count FROM acls;
SELECT COUNT(*) AS read_acls_count FROM read_acls;
SELECT  stat.relname AS "Table",
    pg_size_pretty(pg_total_relation_size(stat.relid)) AS "Total size",
    pg_size_pretty(pg_relation_size(stat.relid)) AS "Table size",
    CASE WHEN cl.reltoastrelid = 0 THEN 'None' ELSE
        pg_size_pretty(pg_relation_size(cl.reltoastrelid)+
        COALESCE((SELECT SUM(pg_relation_size(indexrelid)) FROM pg_index WHERE indrelid=cl.reltoastrelid)::int8, 0)) END AS "TOAST table size",
    pg_size_pretty(COALESCE((SELECT SUM(pg_relation_size(indexrelid)) FROM pg_index WHERE indrelid=stat.relid)::int8, 0)) AS "Index size"
FROM pg_stat_all_tables stat
  JOIN pg_statio_all_tables statio ON stat.relid = statio.relid
  JOIN pg_class cl ON cl.oid=stat.relid AND stat.schemaname='public' 
ORDER BY pg_total_relation_size(stat.relid) DESC
LIMIT 20;
SELECT name, unit, current_setting(name), source FROM pg_settings WHERE source!='default';
SHOW ALL;
\q

and attach the output file located in "/tmp/pgconf.txt" into the jira ticket.

Version 28.1 last modified by Benoit Delbosc on 09/06/2010 at 13:09

Comments 0

No comments for this document

Attachments 0

No attachments for this document

Creator: Benoit Delbosc on 2009/07/10 14:34
© 2008-2010 Nuxeo
1.3.8295