.. index:: High availability, Replication, Transactions, Availability

.. _#ug-ha:

25. Replication and High Availability
=================================================

This chapter discusses the support that AMPS provides for replication,
and how AMPS features help to build systems that provide high
availability.

.. include:: ./ha-overview.inc

High Availability Scenarios
---------------------------------

You design your high availability strategy to meet the needs of your
application, your business, and your network. This section describes
commonly deployed scenarios for high availability.

Failover Scenario
^^^^^^^^^^^^^^^^^^^^^^

One of the most common scenarios is for two AMPS instances to replicate
to each other. This replication is synchronous, so that both instances
persist a message before AMPS acknowledges the message to the publisher.
This makes a hot-hot pair. In the figure below, any messages published
to ``important_topic`` are replicated across instances, so both
instances have the messages for ``important_topic``.

.. image:: ../png/Hot-Hot-Replication.png

Two connections are shown in the diagram to demonstrate the
required configuration. However, because these instances
replicate to each other, AMPS can optimize this replication
topology to use a single network connection (although AMPS
treats this as two one-way connections that happen to share
a single network connection).

Because AMPS replication is peer-to-peer, clients can
connect to either instance of AMPS when both are running.
In this configuration, clients treat Instance 1 and Instance 2 as
equivalent server addresses. If a client cannot connect to one instance,
it tries the other. Because each instance contains the same messages for
replicated topics such as ``important_topic``, from the point of view of
a publisher or subscriber there is no functional difference in which
instance a client connects to. Messages can be published to either
instance of AMPS at any time, and those messages are replicated to the
other instance.

Because these instances are intended to be equivalent message sources
(that is -- a client may fail over from one instance to another instance),
these instances are configured to use ``sync`` acknowledgment to publishers.
This means that when a message is published to one of these instances,
that instance does not acknowledge the message to the publisher as
persisted until both instances have written the message to disk
(although the message can be delivered to subscribers as soon as it is
persisted locally). As a result, a publisher using a publish store can
fail over to either of these servers without risk of message loss.
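
As a sketch (element names follow the AMPS configuration reference; host
names, ports, and the message type here are illustrative), the
replication section for Instance 1 in this scenario might look like the
following. Instance 2 would define a mirror-image ``Destination``
pointing back at Instance 1:

.. code-block:: xml

   <!-- Illustrative sketch for Instance 1: replicate important_topic
        to Instance 2 with sync acknowledgment. Assumes a transaction
        log covering important_topic is configured (not shown). -->
   <Replication>
     <Destination>
       <Name>instance-2</Name>
       <SyncType>sync</SyncType>
       <Topic>
         <Name>important_topic</Name>
         <MessageType>json</MessageType>
       </Topic>
       <Transport>
         <Type>amps-replication</Type>
         <InetAddr>instance-2.example.com:19000</InetAddr>
       </Transport>
     </Destination>
   </Replication>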

Geographic Replication
^^^^^^^^^^^^^^^^^^^^^^^^^^^

AMPS is well suited for replicating messages to different regions, so
clients in those regions are able to quickly receive and publish
messages to a local instance. In this case, each region replicates all
messages on the topic of interest to the other two regions. A variation
on this strategy is to include a region tag in the message content and
use content filtering, so that each instance replicates only the
messages intended for use in the other regions or worldwide.

.. image:: ../png/GeoRepl.png

For this scenario, an AMPS instance in each region replicates to an
instance in the two other regions. To reduce the memory and storage
required for publishers, replication between the regions uses
``async`` acknowledgment, so that once an instance in one region
has persisted the message, the message is acknowledged back to the
publisher.

In this case, clients in each region connect *only* to the AMPS instance in
that region. Bandwidth within regions is conserved, because each message
is replicated once to the region, regardless of how many subscribers in
that region will receive the message. Further, publishers are able to
publish the message once to a local instance over a relatively fast
network connection rather than having to publish messages multiple times
to multiple regions.

To configure this scenario, the AMPS instances in each region are
configured to forward messages to known instances in the other two
regions.
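
For instance, the New York instance's destination for Chicago might be
sketched as follows (host names and the ``/region`` field are
hypothetical; the optional ``Filter`` element shows the content-filtering
variation described above, and can be omitted to replicate every
message):

.. code-block:: xml

   <!-- Illustrative sketch for the New York instance: replicate
        important_topic to Chicago with async acknowledgment. The
        Filter shows the region-tag variation; /region is a
        hypothetical field in the message content. -->
   <Replication>
     <Destination>
       <Name>chicago</Name>
       <SyncType>async</SyncType>
       <Topic>
         <Name>important_topic</Name>
         <MessageType>json</MessageType>
         <Filter>/region = 'CHI' OR /region = 'ALL'</Filter>
       </Topic>
       <Transport>
         <Type>amps-replication</Type>
         <InetAddr>chicago.example.com:19000</InetAddr>
       </Transport>
     </Destination>
     <!-- A second Destination for London follows the same pattern. -->
   </Replication>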

Geographic Replication with High Availability
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Combining the first two scenarios allows your application to distribute
messages as required and to have high availability in each region. This
involves having two or more servers in each region, as shown in the
figure below.

.. image:: ../png/replication_scenario_with_ha.png

Each region is configured as a group, indicating that the instances
within that region should be treated as equivalent, and are intended
to have the same topics and messages. Within each group, the instances
replicate to each other using ``sync`` acknowledgments, to ensure that
publishers can fail over between the instances. Because a client in a
given region does not connect to a server outside the region, we can
configure the replication links between the regions to use ``async``
acknowledgment, which can reduce the amount of time
that an application publishing to AMPS must store outgoing messages
before receiving an acknowledgment that a given message is persisted.
(Setting these links to use ``async`` acknowledgment does not affect
the speed of replication or change the behavior of replication in
any other way -- this setting only specifies when an instance of AMPS
acknowledges the message as persisted.)

The figure below shows the expanded detail of the configuration for these servers.

.. image:: ../png/chicago_group.png

The instances in each region are configured to be part of a group for
that region, since these instances are intended to have the same topics
and messages. Within a region, the instances replicate to each other
using ``sync`` acknowledgment. Replication connections to instances
at the remote site use ``async`` acknowledgment.
The instances use the replication downgrade action to ensure that
publishers do not retain an unworkably large number of messages
in the event that one of the instances goes offline. As with all
connections where instances replicate to each other, this replication
is configured as one connection in each direction, although AMPS may
optimize this to a single network connection.
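
As a sketch of the downgrade action mentioned above (module names per
the AMPS Actions reference; the schedule and grace periods are
illustrative values to adapt to your environment):

.. code-block:: xml

   <!-- Illustrative sketch: periodically downgrade sync destinations
        that have fallen behind, and upgrade them again once they
        catch up. Intervals and grace periods are example values. -->
   <Actions>
     <Action>
       <On>
         <Module>amps-action-on-schedule</Module>
         <Options>
           <Every>10s</Every>
           <Name>check-downgrade</Name>
         </Options>
       </On>
       <Do>
         <Module>amps-action-do-downgrade-replication</Module>
         <Options>
           <Grace>30s</Grace>
         </Options>
       </Do>
     </Action>
     <Action>
       <On>
         <Module>amps-action-on-schedule</Module>
         <Options>
           <Every>10s</Every>
           <Name>check-upgrade</Name>
         </Options>
       </On>
       <Do>
         <Module>amps-action-do-upgrade-replication</Module>
         <Options>
           <Grace>15s</Grace>
         </Options>
       </Do>
     </Action>
   </Actions>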

Each instance at a site provides passthrough replication to the other
instance at that site for both the local group and the remote groups.
This ensures that once a message arrives at either instance in the
local group (whether from a publisher or over replication from a
remote group), it is fully distributed within the local group. To
optimize bandwidth, at the risk of slightly increasing the chance of
message loss if an entire region goes offline, each instance at a site
passes through only messages from the local group to the remote sites.
This configuration balances fault tolerance and performance, and
minimizes the bandwidth consumed between the sites.

Each instance at a site replicates to the remote sites. The instance
specifies one ``Destination`` for each remote site, with the servers at
the remote site listed as failover equivalents for the remote site. With
the passthrough configuration, this ensures that each message is
delivered to each remote site exactly once. Whichever server at the
remote site receives the message distributes it to the other server at
that site using passthrough replication. Notice that some features of AMPS,
such as distributed queues (though not LocalQueue or GroupLocalQueue),
require full passthrough to ensure correct delivery of messages.

With this configuration, publishers at each site publish to a
local AMPS instance, and subscribers subscribe to messages from their
local AMPS instances. Both publishers and subscribers use the high
availability features of the AMPS client libraries to ensure that if the
primary local AMPS instance fails, they automatically fail over to the
other instance. Replication is used to deliver both high availability
and disaster recovery. In the table below, each row represents a
replication destination. Servers in brackets are represented as sets of
``InetAddr`` elements in the ``Destination`` definition.

+---------------+----------------------+--------------------------------------+---------------+
| Server        |  Group               | Destinations                         | PassThrough   |
|               |                      |                                      |               |
+===============+======================+======================================+===============+
| Chicago 1     |  Chicago             | * Chicago 2 / sync ack               |  ``.*``       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [NewYork 1, NewYork 2] / async ack |  Chicago      |    
|               |                      +--------------------------------------+---------------+
|               |                      | * [London 1, London 2] / async ack   |  Chicago      |
+---------------+----------------------+--------------------------------------+---------------+
| Chicago 2     | Chicago              | * Chicago 1  / sync ack              |  ``.*``       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [NewYork 1, NewYork 2] / async ack |  Chicago      |
|               |                      +--------------------------------------+---------------+
|               |                      | * [London 1, London 2] / async ack   |  Chicago      |
+---------------+----------------------+--------------------------------------+---------------+
| NewYork 1     | NewYork              | * NewYork 2 / sync ack               |  ``.*``       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [Chicago 1, Chicago 2] / async ack |  NewYork      |
|               |                      +--------------------------------------+---------------+
|               |                      | * [London 1, London 2] / async ack   |  NewYork      |
+---------------+----------------------+--------------------------------------+---------------+
| NewYork 2     | NewYork              | * NewYork 1  / sync ack              |  ``.*``       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [Chicago 1, Chicago 2] / async ack |  NewYork      |
|               |                      +--------------------------------------+---------------+
|               |                      | * [London 1, London 2] / async ack   |  NewYork      |
+---------------+----------------------+--------------------------------------+---------------+
| London 1      | London               | * London 2 / sync ack                |  ``.*``       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [Chicago 1, Chicago 2] / async ack |  London       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [NewYork 1, NewYork 2] / async ack |  London       |
+---------------+----------------------+--------------------------------------+---------------+
| London 2      | London               | *  London 1 / sync ack               |  ``.*``       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [Chicago 1, Chicago 2] / async ack |  London       |
|               |                      +--------------------------------------+---------------+
|               |                      | * [NewYork 1, NewYork 2] / async ack |  London       |
+---------------+----------------------+--------------------------------------+---------------+

**Table 25.1:** *Geographic Replication with HA Destinations*
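
As an illustrative sketch, the replication section for Chicago 1
corresponding to the first row of the table might look like the
following (host names and ports are hypothetical; the instance-level
``Group`` element, transaction log, and inbound replication transport
are not shown):

.. code-block:: xml

   <!-- Illustrative sketch for Chicago 1. Each bracketed server set
        from the table becomes multiple InetAddr elements in a single
        Destination. -->
   <Replication>
     <!-- Local peer: sync acknowledgment, pass through all groups. -->
     <Destination>
       <Name>chicago-2</Name>
       <Group>Chicago</Group>
       <SyncType>sync</SyncType>
       <PassThrough>.*</PassThrough>
       <Topic>
         <Name>important_topic</Name>
         <MessageType>json</MessageType>
       </Topic>
       <Transport>
         <Type>amps-replication</Type>
         <InetAddr>chicago-2.example.com:19000</InetAddr>
       </Transport>
     </Destination>
     <!-- Remote site: async acknowledgment, pass through only the
          local (Chicago) group. -->
     <Destination>
       <Name>new-york</Name>
       <Group>NewYork</Group>
       <SyncType>async</SyncType>
       <PassThrough>Chicago</PassThrough>
       <Topic>
         <Name>important_topic</Name>
         <MessageType>json</MessageType>
       </Topic>
       <Transport>
         <Type>amps-replication</Type>
         <InetAddr>ny-1.example.com:19000</InetAddr>
         <InetAddr>ny-2.example.com:19000</InetAddr>
       </Transport>
     </Destination>
     <!-- The London destination follows the same pattern as NewYork. -->
   </Replication>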

.. warning::

    In a configuration like the one above, a publisher must only be allowed
    to fail over to other instances in its own region. Because replication
    to other regions use ``async`` acknowledgment, a publisher may
    have received an acknowledgment that a given message is persisted
    before it is stored in instances in the other regions.



Complex Replication: Hub and Spoke Topology   
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For more complex replication topologies, or in a situation where an
installation may want to scale out to accommodate an ever-increasing
number of subscribers, consider using a "hub and spoke" topology.

This topology is particularly useful when a large number of
applications need to operate over the same data, but the applications
themselves are largely independent of each other; when data is
consumed in a different region or organization than the one where
it originates; or when some applications require intensive CPU or
memory resources to work with the data while other applications
using the same data do not. For example, if two applications
define different CPU-intensive views over the same data, isolating
those applications onto separate application instances can help
to reduce the resources required for any one instance.

In this topology, replication is handled by AMPS instances dedicated
to managing replication, as shown in the diagram below. In this
strategy, each instance has one of three distinct roles:

* An *ingestion* instance accepts messages from a publisher into
  the AMPS replication fabric. All ingestion instances replicate
  to each other (using ``sync`` acknowledgment) and replicate
  to the hub (also using ``sync`` acknowledgment).

  The ingestion instances do not define a state of the world.

* One or more *hub* instances that accept messages from the
  ingestion instances and replicate those messages to the
  application instances.

  The hub instances do not replicate back to the ingestion
  instances, and they do not define a state of the world.

* The *application* instances provide messages to
  applications that use the messages.

  These instances do not replicate back to the
  hub instances. If an application will use
  multiple instances, these instances replicate
  to each other using ``sync`` acknowledgment.

  The application instances define the state of the
  world as needed -- any Topics, Views,
  ConflatedTopics, LocalQueues, or GroupLocalQueues
  that the application will use.

  Different applications may use different
  application instances: each application instance
  only needs to define the state of the world
  that the applications that use that instance
  need.

This architecture provides decoupling between publishers and
applications, and decoupling between different applications that
use the same message stream.

This topology also reduces the risk and expense of adding
more instances for application use. Only the "hub" instances
need to be updated to add or remove application instances.
Since the "hub" maintains only messages for replication (no
state of the world is defined on the "hub" instances), adding
or removing a destination at the hub instance is very efficient.
Recovery times for the hub are very quick since the only
state that needs to be recovered is the state of the
transaction log itself.

In the simplest configuration, the "hub" instance or instances
simply pass through all messages and all topics to all
downstream instances, leaving the application
instances to determine what topics should be replicated.
In more sophisticated configurations, the "hub" instances can
direct topics for specific applications to a specific set of
instances.
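
In the simplest form described above, a hub instance's destination for
one group of application instances might be sketched as follows
(hypothetical names; whether the hub uses ``sync`` or ``async``
acknowledgment to the application instances depends on the deployment):

.. code-block:: xml

   <!-- Illustrative sketch for a hub instance: forward all topics of
        the given message type and pass through all groups to one
        group of application instances. -->
   <Replication>
     <Destination>
       <Name>app-group-a</Name>
       <Group>AppGroupA</Group>
       <SyncType>async</SyncType>
       <PassThrough>.*</PassThrough>
       <Topic>
         <Name>.*</Name>
         <MessageType>json</MessageType>
       </Topic>
       <Transport>
         <Type>amps-replication</Type>
         <InetAddr>app-a-1.example.com:19000</InetAddr>
         <InetAddr>app-a-2.example.com:19000</InetAddr>
       </Transport>
     </Destination>
   </Replication>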

The hub and spoke topology has the following advantages:

* Easy to add and remove instances in a replication fabric.
* Allows the creation of autonomous groups of instances
  servicing a given application.
* High resilience to failures within the application
  instances.
* In many situations, reduces the bandwidth required
  to keep a large number of instances up to date
  (as compared with direct replication between the
  instances).

The hub and spoke topology has the following limitations:

* In some topologies, may have higher latency for active
  publishes.
* Does *not* support fully distributed queues (use
  local queues or group local queues on a subset of
  instances instead).
* Requires an instance of AMPS (or two, for HA) that
  does not have client activity.
* Requires excluding the hub instance from replication
  validation.

The diagram below shows an example of the hub and spoke topology:

.. image:: ../png/replication_hub.png

.. include:: ./replication.inc
.. include:: ./ha.inc
.. include:: ./queue_replication.inc
.. include:: ./bootstrap.inc
.. include:: ./replication_best_practices.inc