Disaster Recovery

Vertica customers use several options to maintain an environment for handling disaster recovery scenarios: Backup and Restore, Parallel Data Loads, Replication, and High Availability.

Backup and Restore


This option provides a logical active-passive configuration in which the DR database cluster is unavailable while its data is being maintained. The Vertica backup and restore process is powered by rsync: an initial full backup, taken online, copies the database to the DR database cluster, and incremental online backups are then run at regular intervals (e.g. every 30 minutes, depending on the customer's RTO/RPO requirements) to keep the DR database cluster current. If the production cluster fails, the database on the DR cluster is mounted and users are rerouted to the DR cluster via a virtual IP or DNS alias. The switch is transparent to users; depending on the configuration, they may only need to reconnect or resubmit their query.
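The relationship between the backup interval and the RPO can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each incremental backup captures the database as of the moment it starts and becomes usable on the DR cluster only once it has finished.

```python
def worst_case_data_loss(interval_min, backup_duration_min):
    """Upper bound, in minutes, on changes missing from the DR copy.

    Assumption: a backup reflects the database as of its start time and
    is usable only after it completes. If production fails just before
    the next backup completes, the newest usable copy is one full
    interval plus one backup duration old.
    """
    return interval_min + backup_duration_min


def meets_rpo(interval_min, backup_duration_min, rpo_min):
    """True if the worst-case data loss stays within the RPO (minutes)."""
    return worst_case_data_loss(interval_min, backup_duration_min) <= rpo_min


# A 30-minute interval with 5-minute backups fits a 60-minute RPO
# but not a 30-minute one.
print(meets_rpo(30, 5, 60))
print(meets_rpo(30, 5, 30))
```

Under these assumptions, the interval has to be shorter than the RPO by at least the time an incremental backup takes to complete.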

Parallel Data Loads


Vertica recommends this option, which is widely used across the Vertica customer base. There are two separate environments (Production and DR), and data feeds are duplicated and loaded into both at the same time. Vertica recommends this option for several reasons:
1. Both environments are available at the same time (active-active), and users/sessions can be distributed between the two environments while DR is not required.
2. The hardware in the Production and DR environments does not have to be identical.
3. The second environment can be used to avoid downtime during a Vertica cluster upgrade: DR serves users while Production is being upgraded, and service is then switched back to Production while DR is being upgraded (or vice versa).
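The idea of duplicating a data feed can be sketched as follows. This is a minimal illustration, not Vertica tooling: `load_to_both` and the two loader callables are hypothetical placeholders for whatever load mechanism each environment uses. The same batch is submitted to both targets concurrently, and the outcome is reported per target so that a failed load on one side can be retried without affecting the other.

```python
from concurrent.futures import ThreadPoolExecutor


def load_to_both(batch, load_production, load_dr):
    """Send the same batch to both environments at the same time.

    load_production and load_dr are hypothetical callables that take a
    batch and raise an exception if the load fails.
    Returns a dict mapping each target name to True (loaded) or False.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            "production": pool.submit(load_production, batch),
            "dr": pool.submit(load_dr, batch),
        }
        results = {}
        for name, future in futures.items():
            try:
                future.result()  # re-raises any exception from the loader
                results[name] = True
            except Exception:
                results[name] = False
    return results


# In-memory lists stand in for the two database clusters.
production_rows, dr_rows = [], []
status = load_to_both([1, 2, 3], production_rows.extend, dr_rows.extend)
print(status)
```

A real feed would also need a policy for the case where only one side succeeds, for example queueing the batch for replay, so that the two environments do not drift apart.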

Replication (CDC)
The Vertica database supports the use of replication (Change Data Capture) to synchronize the production and DR database clusters. Vertica does not provide this functionality itself; it is available through third-party tools such as:
Informatica DatabaseSync (previously known as WisdomForce): http://www.wisdomforce.com/
Talend Integrated Suite: http://www.talend.com/products/enterprise-di.php

This option provides a warm standby solution where both the Production and DR database clusters are available.

High Availability and Recovery


Vertica's unique approach to recovery of a failed node is based on the distributed nature of a database. A Vertica database is said to be *K-safe if any node can fail at any given time without causing the database to shut down. When the lost node comes back online and rejoins the database, it recovers its lost objects by querying the other nodes.

*K-safety is a measure of fault tolerance in the database cluster. The value K represents the number of replicas of the data in the database that exist in the database cluster. These replicas allow other nodes to take over for failed nodes, allowing the database to continue running while still ensuring data integrity. If more than K nodes in the database fail, some of the data in the database may become unavailable. In that case, the database is considered unsafe and automatically shuts down.

However, it is possible for a Vertica database to have more than K nodes fail and still safely continue running. The database continues to run as long as every data segment is available on at least one functioning node in the cluster. Potentially, nearly half the nodes in a database with a K-safety level of 1 could fail without causing the database to shut down, as long as the data on each failed node is available from another active node.

Note: If half or more of the nodes in the database cluster fail, the database will automatically shut down even if all of the data in the database is technically available from replicas. This behavior prevents issues due to network partitioning.
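The two shutdown rules described above can be expressed as a small check. This is a simplified model rather than Vertica's actual implementation. It assumes one data segment per node and a ring layout in which segment i is stored on node i and replicated on the next K nodes; both the layout and the function names are assumptions made for illustration.

```python
def replica_nodes(segment, node_count, k):
    """Nodes holding a copy of `segment` under the assumed ring layout:
    the home node plus the next k nodes (k + 1 copies in total)."""
    return {(segment + offset) % node_count for offset in range(k + 1)}


def database_stays_up(node_count, k, failed):
    """Apply the two rules from the text to a set of failed node ids."""
    failed = set(failed)
    # Rule 1: if half or more of the nodes fail, the database shuts down,
    # even if every segment is still available somewhere.
    if 2 * len(failed) >= node_count:
        return False
    # Rule 2: every segment must survive on at least one live node.
    return all(
        replica_nodes(segment, node_count, k) - failed
        for segment in range(node_count)
    )


# Six nodes, K-safety level 1.
print(database_stays_up(6, 1, {0}))        # one failure
print(database_stays_up(6, 1, {0, 2}))     # two non-adjacent failures
print(database_stays_up(6, 1, {0, 1}))     # two adjacent failures
print(database_stays_up(6, 1, {0, 2, 4}))  # half the nodes
```

In this model, two non-adjacent failures are survivable at K=1 even though they exceed K, two adjacent failures are not because segment 0 loses both of its copies, and three failures always trigger a shutdown because half of the cluster is down.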
