Aurora

http://nil.csail.mit.edu/6.824/2020/papers/aurora.pdf

  • "We believe the central constraint in high throughput data processing has moved from compute and storage to the network."

  • "Aurora brings a novel architecture to the relational database to address this constraint, most notably by pushing redo processing to a multi-tenant scale-out storage service, purpose-built for Aurora."

  • "Instance lifetime does not correlate well with storage lifetime. Instances fail. Customers shut them down. They resize them up and down based on load. For these reasons, it helps to decouple the storage tier from the compute tier."

  • "In Aurora, we have chosen a design point of tolerating (a) losing an entire AZ and one additional node (AZ+1) without losing data, and (b) losing an entire AZ without impacting the ability to write data. We achieve this by replicating each data item 6 ways across 3 AZs with 2 copies of each item in each AZ."

Not very easy to follow.

The FAQ explains it well, though.


Mirrored MySQL involves an ordinary MySQL database server that thinks
it is writing to a local disk. Each transaction involves a bunch of
large writes to update B-Tree pages; even if a transaction only
modifies a few bytes of data, the resulting disk writes are entire
file system blocks, perhaps 8192 bytes. The mirroring arrangement
sends those 8192-byte blocks over the network to four different EBS
storage servers, and waits for them all to write the data to their
disks.

Aurora, in contrast, only sends little log records over the network to
its storage servers -- the log records aren't much bigger than the
actual bytes modified. So Aurora sends dramatically less data over the
network, and is correspondingly faster.
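
As a rough back-of-the-envelope sketch of that difference: the 8192-byte block size and the four EBS servers come from the description above, and the six Aurora replicas come from the paper. The number of pages touched and the 100-byte log record size are made-up illustrative values, not measurements, and the sketch ignores the other writes (logs, metadata) a real MySQL server also issues.

```python
# Rough comparison of bytes sent over the network for one transaction.
BLOCK_BYTES = 8192      # file system block size mentioned above
MIRROR_COPIES = 4       # EBS storage servers in the mirrored MySQL setup
AURORA_REPLICAS = 6     # Aurora storage replicas (from the paper)


def mirrored_mysql_bytes(pages_touched: int) -> int:
    """Each dirty page is shipped as a whole block to every mirror."""
    return pages_touched * BLOCK_BYTES * MIRROR_COPIES


def aurora_bytes(log_records: int, log_record_bytes: int) -> int:
    """Only the log records are shipped, once to every replica."""
    return log_records * log_record_bytes * AURORA_REPLICAS


pages = 3               # hypothetical: pages modified by the transaction
record_bytes = 100      # hypothetical: size of one log record

mysql_total = mirrored_mysql_bytes(pages)
aurora_total = aurora_bytes(pages, record_bytes)
print(f"mirrored MySQL: {mysql_total} bytes, Aurora: {aurora_total} bytes")
```

Even though Aurora writes to more replicas, the per-write payload is so much smaller that the total network traffic drops sharply under these assumptions.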

FT Goals:

  • Write with one dead AZ.

  • Read with one dead AZ plus one more dead replica (AZ+1).

  • Resilient to transient slowness.

  • Fast re-replication.

Quorum replication:

  • N replicas.

  • W for writes.

  • R for reads.

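  • R + W = N + 1 (more generally R + W > N), so every read quorum overlaps every write quorum. In Aurora: N = 6, W = 4, R = 3.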
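
A minimal sketch of how the quorum rule lines up with the FT goals above, using the values from the Aurora paper (N = 6, W = 4, R = 3, two replicas per AZ). The function names are hypothetical and only count live replicas; this is not how Aurora's storage layer is implemented.

```python
from itertools import combinations

N, W, R = 6, 4, 3        # Aurora's quorum sizes (from the paper)
REPLICAS_PER_AZ = 2      # 6 replicas spread over 3 AZs


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True if every read quorum shares a replica with every write quorum."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def can_write(alive: int) -> bool:
    """A write needs acknowledgements from W live replicas."""
    return alive >= W


def can_read(alive: int) -> bool:
    """A quorum read needs responses from R live replicas."""
    return alive >= R


alive_after_az_loss = N - REPLICAS_PER_AZ           # one whole AZ is down
alive_after_az_plus_one = alive_after_az_loss - 1   # AZ+1: one more replica down

print("quorums overlap:", quorums_overlap(N, W, R))
print("write with one dead AZ:", can_write(alive_after_az_loss))
print("read with AZ+1 dead:", can_read(alive_after_az_plus_one))
print("write with AZ+1 dead:", can_write(alive_after_az_plus_one))
```

With R + W = N + 1 the overlap check passes by brute force; dropping W to 3 would allow a read quorum and a write quorum with no replica in common, so a read could miss the latest write.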
