AWS : Resiliency

  • Server-based Services such as
    • RDS
    • ElastiCache
    • Elasticsearch
    • DocumentDB
    • Elastic MapReduce (EMR)
  • DynamoDB
  • S3
  • Compute Services

For a resilient platform, all parts must be configured to be fault tolerant. But first, the required level of redundancy and resilience must be determined.

Fault Tolerant Server-based Services

Server-based services are those that are “instance” based. Services like RDS and ElastiCache are instance based in that you can run a single instance, but then you have no fault tolerance. To gain high availability, you need to tell the service to provision a second instance that the primary instance can fail over to, should there be an issue with it.

This model is similar to traditional data center environments. A good way to tell whether a service is server/instance based is to check whether it is built on a pre-existing product that AWS has turned into a service (MongoDB, Redis, MySQL, Postgres).

Most of the server-based services have similar concepts for handling a hardware failure automatically. The same functionality also handles the failure of a single availability zone. By creating active/standby pairs of servers that replicate, and by placing each member of the pair in a different availability zone, you create the infrastructure that handles both of these failure modes.


ElastiCache is one of these services. You will create an ElastiCache cluster that does not have a single point of failure, and that can handle an AZ outage.

  • First, create an ElastiCache subnet group in the default VPC using each available subnet, and
  • then create a Multi-AZ Redis cluster.

ElastiCache Subnet Groups

AWS > ElastiCache page > Subnet Groups

The Subnet Groups option is available on the ElastiCache service page.

Then create a new subnet group. The Availability Zones that are listed depend on the region of the selected VPC.

We can have only one subnet in each availability zone. As the figure above shows, I tried to add more, but it was not possible.
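The same one-subnet-per-zone selection can be scripted. This is a minimal sketch, assuming subnet records in the shape that the EC2 `describe_subnets` call in boto3 returns. The subnet IDs, the group name `resiliency-demo` and the helper names are made up for illustration, and the actual boto3 call is left as a comment so the example runs without AWS credentials.

```python
def one_subnet_per_az(subnets):
    """Keep the first subnet seen for each Availability Zone."""
    chosen = {}
    for subnet in subnets:
        chosen.setdefault(subnet["AvailabilityZone"], subnet["SubnetId"])
    return chosen


def subnet_group_request(name, subnets):
    """Build keyword arguments for ElastiCache create_cache_subnet_group."""
    per_az = one_subnet_per_az(subnets)
    return {
        "CacheSubnetGroupName": name,
        "CacheSubnetGroupDescription": "One subnet per Availability Zone",
        "SubnetIds": [per_az[az] for az in sorted(per_az)],
    }


# Hypothetical subnets of a default VPC (all IDs are placeholders).
SUBNETS = [
    {"SubnetId": "subnet-aaa", "AvailabilityZone": "eu-central-1a"},
    {"SubnetId": "subnet-bbb", "AvailabilityZone": "eu-central-1b"},
    {"SubnetId": "subnet-ccc", "AvailabilityZone": "eu-central-1a"},
    {"SubnetId": "subnet-ddd", "AvailabilityZone": "eu-central-1c"},
]

request = subnet_group_request("resiliency-demo", SUBNETS)
print(request["SubnetIds"])

# With credentials configured, the request could be sent like this:
# import boto3
# boto3.client("elasticache").create_cache_subnet_group(**request)
```

The second subnet in `eu-central-1a` is skipped, which mirrors what the console allowed me to select.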

Next, we create the Redis cluster. The subnet group created in the previous step is selected under Subnet Groups.
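For reference, the console settings roughly correspond to the parameters of the boto3 `create_replication_group` call. The sketch below only builds the parameters. The group ID, the subnet group name and the node type `cache.m5.large` are assumptions for illustration, not values from my setup.

```python
def multi_az_redis_request(group_id, subnet_group_name, replicas=1):
    """Build keyword arguments for ElastiCache create_replication_group."""
    if replicas < 1:
        raise ValueError("Multi-AZ needs at least one replica to fail over to")
    return {
        "ReplicationGroupId": group_id,
        "ReplicationGroupDescription": "Redis primary with standby replica(s)",
        "Engine": "redis",
        "CacheNodeType": "cache.m5.large",  # assumed node size
        "NumCacheClusters": 1 + replicas,   # primary plus replicas
        "AutomaticFailoverEnabled": True,   # promote a replica automatically
        "MultiAZEnabled": True,             # spread the nodes across AZs
        "CacheSubnetGroupName": subnet_group_name,
    }


redis_request = multi_az_redis_request("demo-redis", "resiliency-demo")
print(redis_request["NumCacheClusters"])

# With credentials configured, the request could be sent like this:
# import boto3
# boto3.client("elasticache").create_replication_group(**redis_request)
```

The guard on `replicas` reflects the point of this section: a single node leaves nothing to fail over to.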


  • DynamoDB is a native AWS service for non-relational databases
  • It is Multi-AZ by default and can be made Multi-Region with DynamoDB Streams and by creating a Global DynamoDB table
  • A global table is multi-region and active/active (any change to one replica table is propagated to all the other replica tables)
  • DynamoDB scales to extremely high loads with very fast response times
  • It also supports configuring a caching layer in front of the database.
  • DynamoDB Streams allow every change made to a DynamoDB table to be “streamed” into other services. These other services can then choose what actions to take on the different items or operations within the stream.
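  • DynamoDB has no separate database level: you create tables directly, and each table stands on its own
  • Each table has one primary key, made up of a partition key and an optional sort key
  • Further partition key / sort key combinations can be added to a table through secondary indexes
  • Billing is based either on provisioned capacity (read and write operations per second)
  • or on demand (pay per request)
  • Capacity can scale up and down automatically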
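To make the key and billing options concrete, here is a minimal sketch that builds the parameters for the boto3 `create_table` call. The table name `Orders`, the attribute names and the helper `table_request` are hypothetical, and all key attributes are assumed to be strings.

```python
def table_request(table_name, partition_key, sort_key=None, index_key=None):
    """Build keyword arguments for DynamoDB create_table (on-demand billing)."""
    keys = [(partition_key, "HASH")]
    if sort_key:
        keys.append((sort_key, "RANGE"))
    attribute_names = [name for name, _ in keys]

    request = {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": name, "KeyType": key_type}
            for name, key_type in keys
        ],
        "BillingMode": "PAY_PER_REQUEST",  # on demand, no capacity planning
    }
    if index_key:
        # A global secondary index offers an alternative key to query by.
        attribute_names.append(index_key)
        request["GlobalSecondaryIndexes"] = [{
            "IndexName": f"{index_key}-index",
            "KeySchema": [{"AttributeName": index_key, "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "ALL"},
        }]
    # Only attributes used in a key schema are declared; "S" means string.
    request["AttributeDefinitions"] = [
        {"AttributeName": name, "AttributeType": "S"}
        for name in attribute_names
    ]
    return request


orders = table_request("Orders", "customer_id", "order_date", index_key="status")
print([key["KeyType"] for key in orders["KeySchema"]])

# With credentials configured, the request could be sent like this:
# import boto3
# boto3.client("dynamodb").create_table(**orders)
```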

DynamoDB Streams And Global Tables

DynamoDB Streams capture all changes made to a DynamoDB Table. This includes only actions that modify the table, not actions that only read from the table.
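A consumer of the stream sees one record per change. The sketch below assumes the event layout that a Lambda function receives from a DynamoDB stream (`Records`, `eventName`, `dynamodb.Keys`). The sample event and the helper names are made up for illustration.

```python
from collections import Counter


def summarize_stream(event):
    """Count the stream records per change type (INSERT, MODIFY, REMOVE)."""
    return dict(Counter(record["eventName"] for record in event["Records"]))


def changed_keys(event, event_name):
    """Return the primary keys of all records with the given change type."""
    return [
        record["dynamodb"]["Keys"]
        for record in event["Records"]
        if record["eventName"] == event_name
    ]


# Hypothetical stream batch: one insert, one update, one delete.
SAMPLE_EVENT = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"Keys": {"customer_id": {"S": "c-1"}}}},
        {"eventName": "MODIFY",
         "dynamodb": {"Keys": {"customer_id": {"S": "c-1"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"customer_id": {"S": "c-2"}}}},
    ]
}

print(summarize_stream(SAMPLE_EVENT))
print(changed_keys(SAMPLE_EVENT, "REMOVE"))
```

Reads never show up in this batch, only the three kinds of modification.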

DynamoDB Global Tables take advantage of DynamoDB Streams to create Multi-Region active/active DynamoDB Tables. This allows you to modify a table in multiple regions and have those changes reflected in all regions.
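As a rough sketch, replicas can be added to an existing table through the `ReplicaUpdates` parameter of the boto3 `update_table` call. The table name and regions below are placeholders, and I assume here that the table already has a stream enabled and that replicas are added one region per request.

```python
def add_replica_requests(table_name, regions):
    """Build one update_table request per region that should get a replica."""
    return [
        {
            "TableName": table_name,
            "ReplicaUpdates": [{"Create": {"RegionName": region}}],
        }
        for region in regions
    ]


replica_requests = add_replica_requests("Orders", ["eu-west-1", "us-east-1"])
for replica_request in replica_requests:
    print(replica_request["ReplicaUpdates"][0]["Create"]["RegionName"])

# With credentials configured, each request could be sent in turn:
# import boto3
# client = boto3.client("dynamodb", region_name="eu-central-1")
# client.update_table(**replica_requests[0])
```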

Multi-Region, active/active data stores are a big deal and extremely useful for use cases that require it.

DynamoDB is a non-relational database. It is a fully managed service created by AWS. With DynamoDB you create tables, but unlike a relational database, each table is completely independent.

DynamoDB is not like the server-based services that AWS offers (RDS, ElastiCache, etc.). It is “serverless” in the sense that you do not have any control over the physical infrastructure that it runs on. With on-demand billing, you do not pay for the service when you are not using it (except for the storage space used). Because DynamoDB is different from server-based offerings, the mechanisms for redundancy are also different. DynamoDB offers multi-region, active/active service if you elect it. This is called DynamoDB Global Tables.

Amazon DynamoDB is a fast NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed database and supports both document and key-value data models. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and many other applications. [Source]

Fault Tolerant Operations

1- Subnet Groups

To get Multi-AZ availability, you need to configure a Subnet Group (which contains subnets in different AZs) within the service. A subnet is attached to an AZ, so creating a group of subnets within the service tells the service where it can place the primary and standby instances.
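Whether an existing subnet group actually spans more than one AZ can be checked from its description. A minimal sketch, assuming the subnet layout returned by the ElastiCache `describe_cache_subnet_groups` call in boto3. The two sample groups and the helper names are hypothetical.

```python
def availability_zones(subnet_group):
    """Return the distinct Availability Zones covered by a subnet group."""
    return sorted({
        subnet["SubnetAvailabilityZone"]["Name"]
        for subnet in subnet_group["Subnets"]
    })


def is_multi_az_capable(subnet_group):
    """A standby can only live in another AZ if the group covers two or more."""
    return len(availability_zones(subnet_group)) >= 2


# Hypothetical subnet group descriptions (IDs are placeholders).
GOOD_GROUP = {"Subnets": [
    {"SubnetIdentifier": "subnet-aaa",
     "SubnetAvailabilityZone": {"Name": "eu-central-1a"}},
    {"SubnetIdentifier": "subnet-bbb",
     "SubnetAvailabilityZone": {"Name": "eu-central-1b"}},
]}
SINGLE_AZ_GROUP = {"Subnets": [
    {"SubnetIdentifier": "subnet-ccc",
     "SubnetAvailabilityZone": {"Name": "eu-central-1a"}},
]}

print(availability_zones(GOOD_GROUP), is_multi_az_capable(GOOD_GROUP))
print(is_multi_az_capable(SINGLE_AZ_GROUP))
```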

Depending on the service and the data volume, creating a new instance in the secondary subnet can take varying amounts of time.

2- Multiple instances

To reduce downtime to seconds, multiple instances have to be created in advance, so that a standby is already running when the primary fails.

3- Multi-AZ checkbox

4- Multi Region Redundancy

Subnet Groups are key to creating Multi-AZ redundancy in server-based services. Subnet Groups define the different availability zones that your service will run in, and having multiple instances allows for fast failover if a single AZ were to go down.

Multi-Region redundancy is trickier. Depending on the service, it is harder, or not possible at all, to run a service with failover between regions.

Some services, e.g. Amazon RDS, allow you to keep a read replica in a second region, but after a failure you have to promote it to primary yourself.
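For RDS, the two steps map to the boto3 calls `create_db_instance_read_replica` and `promote_read_replica`. The sketch below only builds the parameters for both. The instance names, the account number in the ARN and the regions are placeholders.

```python
def replica_request(replica_id, source_arn, source_region):
    """Build keyword arguments for RDS create_db_instance_read_replica."""
    return {
        "DBInstanceIdentifier": replica_id,
        # A cross-region source has to be referenced by its ARN.
        "SourceDBInstanceIdentifier": source_arn,
        "SourceRegion": source_region,
    }


def promote_request(replica_id):
    """Build keyword arguments for RDS promote_read_replica."""
    return {"DBInstanceIdentifier": replica_id}


# Placeholder ARN of a primary instance in eu-central-1.
SOURCE_ARN = "arn:aws:rds:eu-central-1:123456789012:db:orders-primary"

create_args = replica_request("orders-replica", SOURCE_ARN, "eu-central-1")
promote_args = promote_request("orders-replica")
print(create_args["SourceRegion"], promote_args)

# With credentials configured, both calls go to the replica's region:
# import boto3
# rds = boto3.client("rds", region_name="eu-west-1")
# rds.create_db_instance_read_replica(**create_args)
# ...later, after the primary region has failed:
# rds.promote_read_replica(**promote_args)
```

The promotion is a manual step here, which is exactly the limitation described above.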

Some services don’t have multi-region support, e.g. Elasticsearch.

Redundancy is not cheap: each redundant copy roughly doubles the costs.

Advanced Redundancy for AWS native services

High Available Compute Services

Published by parisamoosavinezhad

- Software Engineer - Software Architect - Software and database specialist - Cloud solution architect
