DynamoDB

DynamoDB Simplified:

Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It’s a fully managed, multiregion, multimaster, durable non-SQL database. It comes with built-in security, backup and restore, and in-memory caching for internet-scale applications.

DynamoDB Key Details:

  • The main components of DynamoDB are:
    • a collection which serves as the foundational table
    • a document which is equivalent to a row in a SQL database
    • key-value pairs which are the fields within the document or row
  • The convenience of non-relational DBs is that each row can look entirely different based on your use case. There doesn’t need to be uniformity. For example, if you need a new column for a particular entry you don’t also need to ensure that that column exists for the other entries.
  • DynamoDB supports both document and key-value based models. It is a great fit for mobile, web, gaming, ad-tech, IoT, etc.
  • DynamoDB is stored via SSD which is why it is so fast.
  • It is spread across 3 geographically distinct data centers.
  • The default consistency model is Eventually Consistent Reads, but there are also Strongly Consistent Reads.
  • The difference between the two consistency models is the one second rule. With Eventual Consistent Reads, all copies of data are usually reached within one second. A repeated read after a short period of time should return the updated data. However, if you need to read updated data within or less than a second and this needs to be a guarantee, then strongly consistent reads are your best bet.
  • If you face a scenario that requires the schema, or the structure of your data, to change frequently, then you have to pick a database which provides a non-rigid and flexible way of adding or removing new types of data. This is a classic example of choosing between a relational database and non-relational (NoSQL) database. In this scenario, pick DynamoDB.
  • A relational database system does not scale well for the following reasons:
    • It normalizes data and stores it on multiple tables that require multiple queries to write to disk.
    • It generally incurs the performance costs of an ACID-compliant transaction system.
    • It uses expensive joins to reassemble required views of query results.
  • High cardinality is good for DynamoDB I/O performance. The more distinct your partition key values are, the better. It makes it so that the requests sent will be spread across the partitioned space.
  • DynamoDB makes use of parallel processing to achieve predictable performance. You can visualize each partition or node as an independent DB server of fixed size with each partition or node responsible for a defined block of data. In SQL terminology, this concept is known as sharding but of course DynamoDB is not a SQL-based DB. With DynamoDB, data is stored on Solid State Drives (SSD).

DynamoDB Accelerator (DAX):

  • Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache that can reduce Amazon DynamoDB response times from milliseconds to microseconds, even at millions of requests per second.
  • With DAX, your applications remain fast and responsive, even when unprecedented request volumes come your way. There is no tuning required.
  • DAX lets you scale on-demand out to a ten-node cluster, giving you millions of requests per second.
  • DAX does more than just increase read performance by having write through cache. This improves write performance as well.
  • Just like DynamoDB, DAX is fully managed. You no longer need to worry about management tasks such as hardware or software provisioning, setup and configuration, software patching, operating a reliable, distributed cache cluster, or replicating data over multiple instances as you scale.
  • This means there is no need for developers to manage the caching logic. DAX is completely compatible with existing DynamoDB API calls.
  • DAX enables you to provision one DAX cluster for multiple DynamoDB tables, multiple DAX clusters for a single DynamoDB table or somewhere in between giving you maximal flexibility.
  • DAX is designed for HA so in the event of a failure of one AZ, it will fail over to one of its replicas in another AZ. This is also managed automatically.

DynamoDB Streams:

  • A DynamoDB stream is an ordered flow of information about changes to items in an Amazon DynamoDB table. When you enable a stream on a table, DynamoDB captures information about every modification to data items in the table.
  • Amazon DynamoDB is integrated with AWS Lambda so that you can create triggers—pieces of code that automatically respond to events in DynamoDB Streams.
  • Immediately after an item in the table is modified, a new record appears in the table’s stream. AWS Lambda polls the stream and invokes your Lambda function synchronously when it detects new stream records. The Lambda function can perform any actions you specify, such as sending a notification or initiating a workflow.
  • With triggers, you can build applications that react to data modifications in DynamoDB tables.
  • Whenever an application creates, updates, or deletes items in the table, DynamoDB Streams writes a stream record with the primary key attribute(s) of the items that were modified. A stream record contains information about a data modification to a single item in a DynamoDB table. You can configure the stream so that the stream records capture additional information, such as the “before” and “after” images of modified items.

DynamoDB Global Tables

  • Global Tables is a multi-region, multi-master replication solution for fast local performance of globally distributed apps.
  • Global Tables replicates your Amazon DynamoDB tables automatically across your choice of AWS regions.
  • It is based on DynamoDB streams and is multi-region redundant for data recovery or high availability purposes. Application failover is as simple as redirecting your application’s DynamoDB calls to another AWS region.
  • Global Tables eliminates the difficult work of replicating data between regions and resolving update conflicts, enabling you to focus on your application’s business logic. You do not need to rewrite your applications to make use of Global Tables.
  • Replication latency with Global Tables is typically under one second.