Amazon Aurora Storage

Jan 10th, 2021 12:05 pm

Two fundamental concepts enable Amazon Aurora to meet the requirements of any cloud-based database, such as seamless scalability, high availability, fault tolerance, and quick recovery, without compromising performance or increasing maintenance effort:

A monotonically increasing Log Sequence Number (LSN) attached to each log record written for a change
A multi-tenant distributed storage system built for databases, to which multiple database instances can be attached. The storage system performs the persistence functions of a traditional database, such as writing logs to disk and creating and persisting data pages; that is, the custom storage system understands log records and data pages. The storage system also lets Aurora separate the compute components of a database, namely the SQL layer, transaction management, and caching, from the storage layer.
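
To make the two ideas concrete, here is a minimal sketch of a compute layer that only emits LSN-stamped log records and a storage node that persists them and builds pages from them. Every name in it (`LogRecord`, `StorageNode`, `DatabaseInstance`) is hypothetical and not taken from Aurora, and it assumes a single writer and a single storage node for simplicity.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class LogRecord:
    lsn: int       # monotonically increasing log sequence number
    page_id: int   # page the change applies to
    key: str
    value: str


@dataclass
class StorageNode:
    """Hypothetical storage node: persists log records and builds pages from them."""
    log: list = field(default_factory=list)
    pages: dict = field(default_factory=dict)  # page_id -> {key: value}
    applied_lsn: int = 0

    def append(self, record):
        # Reject any record that is not newer than the current log tail.
        if self.log and record.lsn <= self.log[-1].lsn:
            raise ValueError("LSNs must increase monotonically")
        self.log.append(record)

    def materialize(self):
        # Apply not-yet-applied records in LSN order to produce current pages.
        for record in self.log:
            if record.lsn > self.applied_lsn:
                self.pages.setdefault(record.page_id, {})[record.key] = record.value
                self.applied_lsn = record.lsn


class DatabaseInstance:
    """Hypothetical compute layer: generates log records, never writes pages."""

    def __init__(self, storage):
        self.storage = storage
        self._next_lsn = count(1)

    def write(self, page_id, key, value):
        record = LogRecord(next(self._next_lsn), page_id, key, value)
        self.storage.append(record)
        return record.lsn


storage = StorageNode()
instance = DatabaseInstance(storage)
instance.write(1, "a", "x")
instance.write(1, "a", "y")
storage.materialize()
print(storage.pages, storage.applied_lsn)
```

Because the storage side understands log records, the compute side never ships whole pages; the page contents are derived from the log on the storage node.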

Tales of the Tail

Mar 7th, 2020 2:39 pm

Spikes in tail latency are a common challenge, especially in large-scale, parallel, and interactive applications, and this paper looks at the sources of these spikes at the hardware, OS, and application layers. The study performs tests and collects fine-grained measurements on three servers: a custom null RPC service, Memcached, and Nginx on Linux. The measurements are compared against the best achievable latency distribution, obtained by modeling these services as a queueing system. This comparison identifies the major sources of tail latency beyond that caused by workload bursts, namely

Active DBMS

Feb 15th, 2020 11:55 am

Passive database management systems (DBMS) are program driven, i.e. users query the database and retrieve the information currently available in it. An active database is one that automatically executes user-specified actions when specified conditions arise. The first paper details an architecture for an active database that uses Event-Condition-Action (ECA) rules as a formalism for active database capabilities. The second paper details an architecture for transforming a passive DBMS into an active DBMS.
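
As a rough illustration of an ECA rule, the sketch below fires an action when an update event occurs and a condition on that event holds. `ActiveDB`, `Rule`, and the `"update"` event name are made up for this example and do not come from either paper.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    event: str                         # Event: what happened
    condition: Callable[[dict], bool]  # Condition: checked against the event payload
    action: Callable[[dict], None]     # Action: run when the condition holds


class ActiveDB:
    """Hypothetical in-memory store that evaluates ECA rules on every update."""

    def __init__(self):
        self.rows = {}
        self.rules = []

    def on(self, event, condition, action):
        self.rules.append(Rule(event, condition, action))

    def update(self, key, value):
        self.rows[key] = value
        self._signal("update", {"key": key, "value": value})

    def _signal(self, event, payload):
        # Run the action of every rule whose event matches and whose condition holds.
        for rule in self.rules:
            if rule.event == event and rule.condition(payload):
                rule.action(payload)


alerts = []
db = ActiveDB()
db.on(
    "update",
    lambda e: e["key"] == "stock" and e["value"] < 10,
    lambda e: alerts.append(f"reorder: stock at {e['value']}"),
)
db.update("stock", 50)  # condition is false, nothing happens
db.update("stock", 5)   # condition is true, the action runs
print(alerts)
```

In a passive DBMS the application would have to poll `rows` to notice the low stock level; here the database itself reacts.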

The RUM Conjecture

May 10th, 2016 7:19 am

Data access methods need to be modified or newly invented to adapt to ever-changing workload requirements and hardware. This paper looks at the challenges in designing new access methods, which increasingly need to be application and hardware aware. The fundamental challenges are to minimize a) read time (R), b) update cost (U), and c) memory overhead (M), and the conjecture is that when optimizing the read-update-memory (RUM) overheads, optimizing any two negatively impacts the third. Deciding which overheads to optimize for, and to what extent, has always been and remains the prominent part of designing access methods.
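
The trade-off can be felt with three toy access methods, each of which gives up one of the three overheads. This is only an illustrative sketch: the classes and the cost counters (cells written, probes, index entries) are assumptions made for this example, not the paper's cost model.

```python
import bisect


class AppendLog:
    """Cheap updates, no auxiliary memory, expensive reads (full scan)."""

    def __init__(self):
        self.items = []

    def put(self, key, value):
        self.items.append((key, value))
        return 1  # cells written

    def get(self, key):
        for probes, (k, v) in enumerate(reversed(self.items), start=1):
            if k == key:
                return v, probes
        return None, len(self.items)

    def extra_memory(self):
        return 0


class SortedArray(AppendLog):
    """Cheap reads, no auxiliary memory, expensive updates (shifting)."""

    def put(self, key, value):
        pos = bisect.bisect_left(self.items, (key,))
        self.items.insert(pos, (key, value))
        return len(self.items) - pos  # new cell plus every shifted cell

    def get(self, key):
        lo, hi, probes = 0, len(self.items), 0
        while lo < hi:
            mid = (lo + hi) // 2
            probes += 1
            k, v = self.items[mid]
            if k == key:
                return v, probes
            if k < key:
                lo = mid + 1
            else:
                hi = mid
        return None, probes


class HashIndexedLog(AppendLog):
    """Cheap reads and updates, paid for with an auxiliary index."""

    def __init__(self):
        super().__init__()
        self.index = {}

    def put(self, key, value):
        self.index[key] = len(self.items)
        self.items.append((key, value))
        return 2  # one data cell plus one index entry

    def get(self, key):
        pos = self.index.get(key)
        return (self.items[pos][1], 1) if pos is not None else (None, 1)

    def extra_memory(self):
        return len(self.index)


def measure(structure, keys, lookup):
    """Return (total update cost, read probes for `lookup`, auxiliary entries)."""
    update_cost = sum(structure.put(k, str(k)) for k in keys)
    _, probes = structure.get(lookup)
    return update_cost, probes, structure.extra_memory()


keys = list(range(1000, 0, -1))  # distinct keys, descending order
for cls in (AppendLog, SortedArray, HashIndexedLog):
    print(cls.__name__, measure(cls(), keys, lookup=1000))
```

The descending insert order is deliberately the worst case for the sorted array, and looking up the first inserted key is the worst case for the log scan.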

Amazon Dynamo

Apr 18th, 2016 2:45 pm

Requirements Dynamo tries to satisfy

Data read and written are identified uniquely by a key
Data is small and stored as raw bytes, so it doesn’t require a relational schema
Queries don’t span multiple data items, i.e. user queries deal with only one row at a time
Use cases that can tolerate weaker consistency for high availability and require no isolation guarantees
Can be deployed on commodity hardware in a trusted environment that doesn’t require authentication or authorization
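
The first three requirements amount to a very small interface, sketched below. `KeyValueStore` is a hypothetical in-memory stand-in written for this note, not Dynamo's actual API; it ignores replication, consistency, and versioning entirely.

```python
import json


class KeyValueStore:
    """Hypothetical single-node stand-in: opaque byte values addressed by key."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value; it only requires raw bytes.
        if not isinstance(value, (bytes, bytearray)):
            raise TypeError("values are opaque bytes")
        self._data[key] = bytes(value)

    def get(self, key: str):
        # Every operation touches exactly one key; there are no multi-item queries.
        return self._data.get(key)


store = KeyValueStore()
# Any structure in the value is the client's business, not the store's.
payload = json.dumps({"items": ["book", "pen"]}).encode("utf-8")
store.put("cart:42", payload)
print(json.loads(store.get("cart:42")))
```

Since the store sees only bytes, schema decisions stay with the application that serializes and deserializes the value.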


Quick Notes

Things that came on the way