📓
ainzzorl's wiki
  • Intro
  • Pet Projects
    • Good Code
    • LangFlipFlop
    • Soccer Adjectives
    • Word Highlighter
    • News Digest
    • Algorithms in Kotlin
    • rustorcli
  • Tech
    • Tech Articles
    • Algorithms
      • Distributed Hash Table (DHT)
      • RSA
      • Seam Carving
      • Fibonacci Heaps
      • Suffix trees, suffix arrays, etc.
    • Technologies
      • Threads vs Events
      • TLS
      • GPU
      • Web Sockets
      • OSI Model
    • Open Source
      • Homebrew
      • Standard Notes
      • LiChess
    • Bash
    • Raft
    • Quantum Computing
    • GFS
    • MapReduce
    • ZooKeeper
  • Courses
    • MIT 6.824: Distributed Systems
      • Primary-Backup Replication
      • Object Storage on CRAQ
      • Aurora
      • Cache Consistency: Frangipani
      • Takeways
      • Distributed Transactions
      • Midterm
      • Spanner
      • FaRM
      • Spark
      • Cache Consistency: Memcached at Facebook
    • Page 2
    • MIT 6.854/18.415: Advanced Algorithms + Stanford CS168: The Modern Algorithmic Toolbox
  • Math
    • Fourier Transform
    • Probabilities & Statistics
  • Places
    • Moscow
  • Books
    • Page 1
    • Page 1
    • Page 1
    • Page 1
    • Page 1
    • Tobol Mnogo Zvannyh - Ivanov
    • The Twelve Chairs/12 стульев - Ilf, Petrov
    • Beauty is a Wound - Eka Kurniawan
    • The Queen of Spades/Пиковая Дама - Pushkin
    • The Sirens of Titan - Kurt Vonnegut
    • Обитель - Захар Прилепин
    • The Faithful Executioner - Joel Harrington
    • City of Lies: Love, Sex, Death, and the Search for Truth in Tehran - Ramita Navai
    • June/Июль - Dmitry Bykov/Дмитрий Быков
    • East of Eden - John Steinbeck
    • Como Agua Para Chocolate/Like Water for Chocolate - Laura Esquivel Valdés
    • The Kukotski Enigma/Казус Кукоцкого - Lyudmila Ulitskaya/Людмила Улицкая
    • Ancillary Justice - Ann Leckie
    • Career of Evil - JK Rowling
    • The Signal and the Noise - Nate Silver
    • Don't Sleep, There are Snakes - Daniel Everett
    • Оправдание Острова - Eugene Vodolazkin
    • A Place Called Winter - Patrick Gale
    • 1491: New Revelations of the Americas Before Columbus - Charles C. Mann
    • Трудно Отпускает Антарктида - Vladimir Sanin
    • Klara and the Sun - Kazuo Ishiguro
    • The History of My Contemporary/История Моего Современника - Vladimir Korolenko
    • Life at the Speed of Light - Craig Venter
    • Misery - Stephen King
    • And Then There Were None - Agatha Christie
    • A Kim Jong-Il Production - Paul Fischer
  • Cooking
    • Marinated Mushrooms
    • Pea Soup
    • Fried Potato
    • Chimichurri
    • Komendantsky Sauce
    • Spicy mushroom marinara
    • Bruschetta
    • Solyanka with Mushrooms
    • Tofu Scramble
    • Bean Spaghetti
    • Salsa
    • Baked Mushrooms
    • Lentil Soup
    • Веганство в Москве
  • Misc Research
    • Recycling
    • Sailing Upwind
    • Viruses
  • Misc Reading
    • Harry Potter - Rowling Writing
  • Fitness
  • Languages
    • Spanish
    • Language Classification
  • Juggling
  • Life Advice
Powered by GitBook
On this page

Was this helpful?

  1. Tech

Raft

PreviousBashNextQuantum Computing

Last updated 3 years ago

Was this helpful?

Raft is a consensus algorithm for managing a replicated log.

  • Strong leader. Raft uses a stronger form of leader-ship than other consensus algorithms. For example,log entries only flow from the leader to other servers.This simplifies the management of the replicated log and makes Raft easier to understand.

  • Leader election: Raft uses randomized timers to elect leaders. This adds only a small amount of mechanism to the heartbeats already required for any consensus algorithm, while resolving conflicts simply and rapidly.

  • Membership changes: Raft’s mechanism for changing the set of servers in the cluster uses a new joint consensus approach where the majorities of two different configurations overlap during transitions. This allows the cluster to continue operating normally during configuration changes.

Consensus algorithms for practical systems typically have the following properties:

  • They ensure safety(never returning an incorrect result) under all non-Byzantine conditions, including network delays, partitions, and packet loss, duplication, and reordering.

  • They are fully functional (available) as long as any majority of the servers are operational and can communicate with each other and with clients. Thus, a typical cluster of five servers can tolerate the failure of any two servers. Servers are assumed to fail by stopping; they may later recover from state on stable storage and rejoin the cluster.

  • They do not depend on timing to ensure the consistency of the logs: faulty clocks and extreme message delays can, at worst, cause availability problems.

  • In the common case, a command can complete as soon as a majority of the cluster has responded to a single round of remote procedure calls; a minority of slow servers need not impact overall system performance.

Sub-problems:

  • Leader election: a new leader must be chosen when an existing leader fails.

  • Log replication: the leader must accept log entries from clients and replicate them across the cluster, forcing the other logs to agree with its own.

  • Safety: the key safety property for Raft is the State Machine Safety Property: if any server has applied a particular log entry to its state machine, then no other server may apply a different command for the same log index.

  • Any two majorities always overlap.

  • The leader can commit after hearing back from the half of other servers.

  • Then the leader tells the replicas to commit too. It's piggy-backed inside the next AppendEntries.

  • Why is the system focused on logs in the first place? Because it needs to apply operations in the same order. And it can reply the log after the crash.

Leader election:

  • Possible to build a system like this without a leader. But it's more efficient (while servers don't fail).

  • Each term has at most one leader.

  • There's an election timer on each server. If it expires, it assumes the current leader is dead, becomes a candidate, tries to become a leader.

    • Increment term.

    • Vote for itself.

    • Request votes.

  • Election timer is randomized, to reduce the probability of split votes. Choose random timeout every time, not just once!

  • New leader sorts out possibly divergent replicas.

Compaction:

  • Must avoid indefinitely growing logs.

  • Take state machine snapshots.

    • With "last included index" and "last included term". These are needed for checks in AppendEntry.

  • Generally servers take snapshots independently. Server sends snapshots to peers falling behind.

  • Use copy-on-write.

Fast Backup (when leader needs to find out where to append to the follower): on AppendEntries rejection, the follower also sends the term and the index of the conflicting entry, and the length of the log. Then the leader will know how to update the nextIndex.

  • They say I have to do this to pass the lab, but I didn't and it still passed :-|

    • Maybe I'll need it after I add persistence.

Persistent properties vs volatile. It only matters if it restarts. It is likely to be the performance bottleneck. Persist:

  • Log.

  • CurrentTerm.

  • VotedFor. This and CurrentTerm are needed to ensure there's only one leader.

http://nil.csail.mit.edu/6.824/2020/papers/raft-extended.pdf