Deploying large-scale distributed systems
Deploying services such as DynamoDB is challenging. DynamoDB is a distributed system with many servers serving mission-critical workloads for customers. Some of these nodes are stateless and some are stateful (they store customer data). Unlike a traditional relational database, DynamoDB handles deployments without maintenance windows. Deployments need to be safe: they must not compromise security, durability, availability, or performance. This post covers critical lessons that took days, months, and years to learn while deploying distributed services at Amazon DynamoDB.

Deployment challenges

Rollbacks

Distributed system deployments are non-atomic. A deployment takes the software from one state to another. It’s not just the start and end states of the software that matter; at times the newly deployed software doesn’t work and needs a rollback. The rolled-back state might be different from the initial state of the software. The rollback procedu