.. _configuration:

Configuration
=============

The configuration file is in the JSON format. It consists of nested
key-value pairs. For example::

    {
        "json_state_file_path": "/var/lib/pghoard/pghoard_state.json",
        "backup_sites": {
            "mycluster": {
                "nodes": [
                    {
                        "host": "127.0.0.1",
                        "password": "secret",
                        "port": 5432,
                        "user": "backup",
                        "slot": "pghoard"
                    }
                ],
                "basebackup_count": 5,
                "basebackup_mode": "delta",
                "object_storage": {
                    "storage_type": "local",
                    "directory": "/tmp/pghoard/backups"
                }
            }
        }
    }

Global Configuration
--------------------

Global configuration options are specified at the top level. In this
documentation we group them by category.

Generic Configuration
~~~~~~~~~~~~~~~~~~~~~

active (default ``true``)
    Can also be set at the ``backup_site`` level to disable the taking of
    new backups and to stop the deletion of old ones.

backup_location
    Where ``pghoard`` will create its internal data structures for local
    state data.

hash_algorithm (default ``"sha1"``)
    The hash algorithm used for calculating checksums for WAL and other
    files. Must be one of the algorithms supported by Python's
    `hashlib <https://docs.python.org/3/library/hashlib.html>`_.

json_state_file_path (default ``"/var/lib/pghoard/pghoard_state.json"``)
    Location of the JSON state file which describes the state of the
    ``pghoard`` process.

maintenance_mode_file (default ``"/var/lib/pghoard/maintenance_mode_file"``)
    Trigger file for maintenance mode: if a file exists at this location,
    no new backup actions will be started. FIXME: define "new backup
    actions"

transfer (default: see below)
    A JSON object defining the WAL/basebackup transfer parameters.
    Example::

        {
            "transfer": {
                "thread_count": 4,
                "upload_retries_warning_limit": 3
            }
        }

    thread_count (default ``min(cpu_count + 3, 20)``)
        Number of parallel uploads / downloads.

    upload_retries_warning_limit (default ``3``)
        Create an alert file ``upload_retries_warning`` after this many
        failed upload attempts.
        See (FIXME: link to alert system)

tar_executable (default ``"pghoard_gnutaremu"``)
    The tar command to use for restoring basebackups. This must be GNU tar
    because some advanced switches such as ``--transform`` are needed. If
    this value is not defined (or is explicitly set to
    ``"pghoard_gnutaremu"``), Python's internal tarfile implementation is
    used. The Python implementation is somewhat slower than the actual tar
    command; in environments with fast disk IO (compared to available CPU
    capacity) it is recommended to set this to ``"tar"``.

restore_prefetch (default ``transfer.thread_count``)
    Number of files to prefetch when performing archive recovery. The
    default is the number of transfer agent threads, so as to utilize them
    all.

.. _configuration_logging:

Logging configuration
~~~~~~~~~~~~~~~~~~~~~

log_level (default ``"INFO"``)
    Determines the log level of ``pghoard``.

syslog (default ``false``)
    Enable / disable syslog logging.

syslog_address (default ``"/dev/log"``)
    Determines the syslog address to use in logging (requires ``syslog``
    to be true as well).

syslog_facility (default ``"local2"``)
    Determines the syslog log facility (requires ``syslog`` to be true as
    well).

.. _configuration_monitoring:

Monitoring
~~~~~~~~~~

alert_file_dir (default ``backup_location`` if set, else ``os.getcwd()``)
    Directory in which alert files for replication warnings and failover
    are created.

stats (default ``null``)
    When set, enables sending metrics to a statsd daemon that supports the
    Telegraf or DataDog syntax with tags. The value is a JSON object, for
    example::

        {
            "host": "<statsd host>",
            "port": <statsd port>,
            "format": "<statsd message format>",
            "tags": {
                "<tag>": "<value>"
            }
        }

    host
        The statsd host address.

    port
        The statsd listening port.

    format (default ``"telegraf"``)
        Determines the statsd message format.
        The following formats are supported:

        - ``telegraf``, following the Telegraf spec
        - ``datadog``, following the DataDog spec

    tags (default ``null``)
        The tags key can be used to enter optional tag values for the
        metrics.

push_gateway (default ``null``)
    When set, enables sending metrics to a Prometheus Pushgateway with
    tags. The value is a JSON object, for example::

        {
            "endpoint": "<pushgateway address>",
            "tags": {
                "<tag>": "<value>"
            }
        }

    endpoint
        The Pushgateway address.

    tags
        An object mapping tags to their values.

.. _configuration_http:

HTTP Server configuration
~~~~~~~~~~~~~~~~~~~~~~~~~

The pghoard daemon needs to listen on an HTTP port for the archive command
and for fetching basebackups/WAL when restoring, if not using an object
store.

http_address (default ``"127.0.0.1"``)
    Address to bind the PGHoard HTTP server to. Set to an empty string to
    listen on all available IPv4 addresses. Set it to the IPv6 ``::``
    wildcard address to bind to all available IPv4 and IPv6 addresses.

http_port (default ``16000``)
    HTTP webserver port. Used for the archive command and for fetching
    basebackups/WAL when restoring, if not using an object store.

.. _configuration_compression:

Compression
~~~~~~~~~~~

The PostgreSQL write-ahead log (WAL) and basebackups are compressed with
Snappy (the default), Zstandard (configurable, level 3 by default) or LZMA
(configurable, level 0 by default) in order to ensure good compression
speed and relatively small backup size. For performance-critical
applications it is recommended to test the compression algorithms to find
the most suitable trade-off for the particular use case. For example,
Snappy is fast but yields larger compressed files, while Zstandard (zstd)
offers a very wide range of compression/speed trade-offs.

The top-level ``compression`` key defines the compression options::

    {
        "compression": {
            "algorithm": "snappy",
            "level": 3,
            "thread_count": 4
        }
    }

algorithm (default ``snappy``)
    The compression algorithm to use.
    Available algorithms are ``snappy``, ``zstd``, and ``lzma``.

level (default ``0`` for ``lzma`` and ``zstd``, ``3`` for ``snappy``)
    The compression level to use. Depends on the algorithm used.

thread_count (default ``cpu_count + 1``)
    The number of threads used for parallel compression. Contrary to
    ``basebackup_compression_threads``, this is the number of compression
    threads started by ``pghoard``, not the number of internal compression
    threads for libraries supporting them, and it is therefore applicable
    to any compression algorithm.

Backup sites
------------

The key ``backup_sites`` contains configuration for groups of PostgreSQL
clusters (here called ``sites``). Each backup site configures how to back
up the different nodes it comprises. Each site can be configured
separately, under an identifying site name (example: ``mysite``).

A backup site contains an array of at least one node. For each node, the
connection information is required. The keys for a node are libpq
parameters, for example::

    {
        "backup_sites": {
            "mysite": {
                "nodes": [
                    {
                        "host": "127.0.0.1",
                        "password": "secret",
                        "port": 5432,
                        "user": "backup",
                        "slot": "pghoard",
                        "sslmode": "require"
                    }
                ]
            }
        }
    }

It is advised to use a replication slot when using one of the WAL
streaming archiving modes (``pg_receivexlog`` or ``walreceiver``).

nodes (no default)
    A node can be described as an object of libpq key-value connection
    info pairs, a libpq connection string, or a ``postgres://`` connection
    URI. If, for example, you'd like to use a streaming replication slot,
    use the syntax ``{..., "slot": "slotname"}``.

pg_data_directory (no default)
    Used when the ``local-tar`` or ``delta`` ``basebackup_mode`` is in
    use. The data directory must point to PostgreSQL's ``$PGDATA`` and
    must be readable by the ``pghoard`` daemon.

prefix (default: the site name)
    Path prefix to use for all backups related to this site.
pg_bin_directory (default: find binaries from well-known directories)
    Where to find ``pg_basebackup`` and ``pg_receivewal``
    (``pg_receivexlog`` for PG < 10). If a value is not supplied,
    ``pghoard`` will attempt to find matching binaries in various
    well-known locations. If ``pg_data_directory`` is set and points to a
    valid data directory, the lookup is restricted to the PostgreSQL
    version of that data directory.

.. _configuration_basebackup:

Basebackup configuration
~~~~~~~~~~~~~~~~~~~~~~~~

The following options all concern various aspects of the basebackup
process and the basebackup retention policy.

basebackup_mode (default ``"basic"``)
    The way basebackups should be created. Four different modes are
    supported; the first two use ``pg_basebackup`` while the others read
    the files directly from the cluster. Neither the ``basic`` nor the
    ``pipe`` mode supports multiple tablespaces.

    ``basic``
        Runs ``pg_basebackup`` and waits for it to write an uncompressed
        tar file on disk before compressing and optionally encrypting it.

    ``pipe``
        Pipes the data directly from ``pg_basebackup`` to PGHoard's
        compression and encryption processing, reducing the amount of
        temporary disk space required.

    ``local-tar``
        Can be used only when running on the same host as the PostgreSQL
        cluster. In this mode PGHoard reads the files directly from
        ``$PGDATA`` instead of using ``pg_basebackup``, then compresses
        and optionally encrypts them. This mode allows backing up user
        tablespaces. Note that the ``local-tar`` backup mode cannot be
        used on replica servers prior to PostgreSQL 9.6 unless the
        pgespresso extension is installed.

    ``delta``
        Similar to ``local-tar``, but only changed files are uploaded to
        the storage. On every backup a snapshot of the data files is
        taken, resulting in a manifest file describing the hashes of all
        the files that need to be backed up. New hashes are uploaded to
        the storage and used, together with the complementary manifest
        from the control file, for restoration.
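As a sketch, a site using the ``delta`` mode could be configured as
follows (the site name and paths here are hypothetical, not defaults)::

    {
        "backup_sites": {
            "mysite": {
                "basebackup_mode": "delta",
                "pg_data_directory": "/var/lib/postgresql/data",
                "nodes": [
                    {
                        "host": "127.0.0.1",
                        "port": 5432,
                        "user": "backup"
                    }
                ]
            }
        }
    }

Note that ``pg_data_directory`` is included because the ``delta`` (and
``local-tar``) modes read files directly from ``$PGDATA``.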
In order to properly assess the efficiency of the ``delta`` mode compared
with ``local-tar``, one can use the ``local-tar-delta-stats`` mode, which
behaves the same as ``local-tar`` but also collects the metrics as if it
were ``delta`` mode. This can help when deciding whether to switch to
``delta`` mode.

basebackup_thread (default ``1``)
    How many threads to use for the tar, compress and encrypt tasks. Only
    applies to the ``local-tar`` basebackup mode. Only values 1 and 2 are
    likely to be sensible; with higher thread counts the speed improvement
    is negligible and CPU time is lost switching between threads.

The following options define how to schedule basebackups.

basebackup_interval_hours (default ``24``)
    How often to take a new basebackup of a cluster. The shorter the
    interval, the faster your recovery will be, but the more CPU/IO usage
    is required from the servers the basebackup is taken from. If set to a
    null value, basebackups are not taken automatically at all.

basebackup_hour (default undefined)
    The hour of day during which to start a new basebackup. If the backup
    interval is less than 24 hours, this is the base hour used to
    calculate the hours at which a backup should be taken. E.g. if the
    backup interval is 6 hours and this value is set to 1, backups will be
    taken at hours 1, 7, 13 and 19. This value is only effective if
    ``basebackup_interval_hours`` and ``basebackup_minute`` are also set.

basebackup_minute (default undefined)
    The minute of the hour during which to start a new basebackup. This
    value is only effective if ``basebackup_interval_hours`` and
    ``basebackup_hour`` are also set.

basebackup_chunks_in_progress (default ``5``)
    How many basebackup chunks may exist on disk simultaneously while the
    basebackup is being taken. For chunk size configuration see
    ``basebackup_chunk_size``.

basebackup_chunk_size (default ``2147483648``)
    The chunk size, in bytes, in which to take a ``local-tar`` basebackup.
    Disk space needed for a successful backup is
    ``basebackup_chunk_size * basebackup_chunks_in_progress``.

basebackup_compression_threads (default ``0``)
    Number of threads to use within the compression library during
    basebackup. Only applicable when using a compression library that
    supports internal multithreading, namely zstd at the moment. The
    default value ``0`` means multithreading is not used.

The following options manage the retention policy.

basebackup_age_days_max (default ``null``)
    Maximum age for basebackups. Basebackups older than this will be
    removed. By default this value is not defined and basebackups are
    deleted based on total count instead.

basebackup_count (default ``2``)
    How many basebackups should be kept around for restoration purposes.
    The more there are, the more disk space is used. If
    ``basebackup_age_days_max`` is defined, this controls the maximum
    number of basebackups to keep; if the backup interval is less than 24
    hours or extra backups are created, there can be more than one
    basebackup per day, and it is often desirable to set
    ``basebackup_count`` to something slightly higher than the maximum age
    in days.

basebackup_count_min (default ``2``)
    Minimum number of basebackups to keep. This is only effective when
    ``basebackup_age_days_max`` has been defined. If, for example, the
    server is powered off and then back on a month later, all existing
    backups would be very old; in that case it is usually not desirable to
    immediately delete all old backups. This setting allows specifying a
    minimum number of backups that should always be preserved regardless
    of their age.

.. _configuration_archiving:

Archiving configuration
~~~~~~~~~~~~~~~~~~~~~~~

active_backup_mode (default ``pg_receivexlog``)
    Can be either ``pg_receivexlog`` or ``archive_command``. If set to
    ``pg_receivexlog``, ``pghoard`` will start up a ``pg_receivexlog``
    process to be run against the database server.
    If ``archive_command`` is set, we rely on the user setting the correct
    ``archive_command`` in ``postgresql.conf``. You can also set this to
    the experimental ``walreceiver`` mode, whereby pghoard communicates
    directly with PostgreSQL through the replication protocol. (Note:
    requires psycopg2 >= 2.7.)

pg_receivexlog
    When the active backup mode is set to ``"pg_receivexlog"``, this
    object may optionally specify additional configuration options. The
    currently available options are all related to monitoring disk space
    availability and optionally pausing xlog/WAL receiving when disk space
    goes below a configured threshold. This is useful when PGHoard is
    configured to create its temporary files on a different volume than
    the one where the main PostgreSQL data directory resides. By default
    this logic is disabled; the minimum free bytes must be configured to
    enable it. Example::

        {
            "backup_sites": {
                "mysite": {
                    "pg_receivexlog": {
                        "disk_space_check_interval": 10,
                        "min_disk_free_bytes": null,
                        "resume_multiplier": 1.5
                    }
                }
            }
        }

    disk_space_check_interval (default ``10``)
        How often (in seconds) to check available disk space.

    min_disk_free_bytes (default ``null``)
        Minimum number of bytes (an integer) that must be available in
        order to keep receiving xlog/WAL from PostgreSQL. If available
        disk space goes below this limit, a ``STOP`` signal is sent to the
        ``pg_receivexlog`` / ``pg_receivewal`` process.

    resume_multiplier (default ``1.5``)
        The multiple of ``min_disk_free_bytes`` of disk space that must be
        available to start receiving xlog/WAL again (i.e. to send the
        ``CONT`` signal to the ``pg_receivexlog`` / ``pg_receivewal``
        process). A multiplier above 1 should be used to avoid constantly
        stopping and resuming the process.

.. _configuration_restore:

Restore configuration
---------------------

.. _configuration_storage:

Storage configuration
~~~~~~~~~~~~~~~~~~~~~

object_storage (no default)
    Configured in ``backup_sites`` under a specific site. If set, it must
    be an object describing a remote object storage. The object must
    contain the key ``storage_type`` describing the type of the store;
    the other keys and values are specific to the storage type.

proxy_info (no default)
    Dictionary specifying proxy information. The dictionary must contain
    the keys ``type``, ``host`` and ``port``. Type can be either
    ``socks5`` or ``http``. Optionally, ``user`` and ``pass`` can be
    specified for proxy authentication. Supported by the Azure, Google and
    S3 drivers.

The following object storage types are supported:

* ``local`` makes backups to a local directory, see
  ``pghoard-local-minimal.json`` for an example. Required keys:

  * ``directory`` for the path to the backup target (local) storage
    directory

* ``sftp`` makes backups to an SFTP server, required keys:

  * ``server``
  * ``port``
  * ``username``
  * ``password`` or ``private_key``

* ``google`` for Google Cloud Storage, required configuration keys:

  * ``project_id`` containing the Google Storage project identifier
  * ``bucket_name`` bucket where you want to store the files
  * ``credential_file`` for the path to the Google JSON credential file

* ``s3`` for Amazon Web Services S3, required configuration keys:

  * ``aws_access_key_id`` for the AWS access key id
  * ``aws_secret_access_key`` for the AWS secret access key
  * ``region`` S3 region of the bucket
  * ``bucket_name`` name of the S3 bucket

  Optional keys for Amazon Web Services S3:

  * ``encrypted`` if true, use server-side encryption. Default is false.
* ``s3`` for other S3-compatible services such as Ceph, required
  configuration keys:

  * ``aws_access_key_id`` for the AWS access key id
  * ``aws_secret_access_key`` for the AWS secret access key
  * ``bucket_name`` name of the S3 bucket
  * ``host`` for overriding the host for non-AWS S3 implementations
  * ``port`` for overriding the port for non-AWS S3 implementations
  * ``is_secure`` for overriding the https requirement for non-AWS S3
  * ``is_verify_tls`` for configuring TLS verification for non-AWS S3
    implementations

* ``azure`` for Microsoft Azure Storage, required configuration keys:

  * ``account_name`` for the name of the Azure Storage account
  * ``account_key`` for the secret key of the Azure Storage account
  * ``bucket_name`` for the name of the Azure Storage container used to
    store objects
  * ``azure_cloud`` Azure cloud selector, ``"public"`` (default) or
    ``"germany"``

* ``swift`` for OpenStack Swift, required configuration keys:

  * ``user`` for the Swift user ('subuser' in Ceph RadosGW)
  * ``key`` for the Swift secret key
  * ``auth_url`` for the Swift authentication URL
  * ``container_name`` name of the data container

  Optional configuration keys for Swift:

  * ``auth_version`` - ``2.0`` (default) or ``3.0`` for Keystone; use
    ``1.0`` with Ceph Rados GW.
  * ``segment_size`` - defaults to ``1024**3`` (1 gigabyte). Objects
    larger than this will be split into multiple segments on upload. Many
    Swift installations require large files (usually 5 gigabytes) to be
    segmented.
  * ``tenant_name``
  * ``region_name``
  * ``user_id`` - for auth_version 3.0
  * ``user_domain_id`` - for auth_version 3.0
  * ``user_domain_name`` - for auth_version 3.0
  * ``tenant_id`` - for auth_version 3.0
  * ``project_id`` - for auth_version 3.0
  * ``project_name`` - for auth_version 3.0
  * ``project_domain_id`` - for auth_version 3.0
  * ``project_domain_name`` - for auth_version 3.0
  * ``service_type`` - for auth_version 3.0
  * ``endpoint_type`` - for auth_version 3.0

.. _configuration_encryption:

Encryption
~~~~~~~~~~

It is possible to set up encryption on a per-site basis. To generate the
configuration, you can use ``pghoard_create_keys``, which generates and
outputs encryption keys in the ``pghoard`` configuration format.

encryption_key_id (no default)
    Specifies the encryption key used when storing encrypted backups. If
    this configuration directive is specified, you must also define the
    public key for storing backups, as well as the private key for
    retrieving them. These keys are specified in the ``encryption_keys``
    mapping.

encryption_keys (no default)
    A mapping from key id to keys. Each key is in turn a mapping from
    ``public`` and ``private`` to PEM-encoded RSA public and private keys
    respectively. The public key must be specified for storing backups;
    the private key must be in place for restoring encrypted backups.
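Putting the encryption options together, a per-site setup might look like
the following sketch (the key id is hypothetical and the PEM key material
is abbreviated; ``pghoard_create_keys`` produces the real values)::

    {
        "backup_sites": {
            "mysite": {
                "encryption_key_id": "example_key",
                "encryption_keys": {
                    "example_key": {
                        "public": "-----BEGIN PUBLIC KEY-----\n...",
                        "private": "-----BEGIN RSA PRIVATE KEY-----\n..."
                    }
                }
            }
        }
    }

The private key should be kept only on hosts that need to restore backups;
hosts that only store backups need just the public key.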