Skip to content

Latest commit

 

History

History
79 lines (61 loc) · 3.41 KB

Consistent_Hashing.md

File metadata and controls

79 lines (61 loc) · 3.41 KB

Consistent Hashing

Concepts

  • A special hashing technique that when a hash table is resized, only k/n keys need to be remapped on average.
    • k is the total number of keys.
    • n is the total number of servers.

Algorithm

  • Basic
    • Create a new ring based on the range of hashing [0, h]

      • The starting point is 0 (12 o'clock's direction).
      • The ending point is h (right next to the starting point).
    • Evenly place the servers to the ring.

      ch2

Operations

Map a key to a server

  • Process
    • Get the hash value of the key: hash(key).

    • hash(key) will be represented a position on the ring.

    • The key will assigned to the next server that appears on the circle in clockwise order from the position.

    • If the assigned server is failed:

      • The key will be be reassigned to the next server in clockwise order.

      ch3
      ch4

Add a new server to the ring

  • Example
    • Add server D (see diagram).
  • Process
    • The keys which hash(key) is in the range of [B,D] will be moved from C to D.

      ch5

Remove a server from the ring

  • Example
    • Remove server A (see diagram).
  • Process
    • All keys that were originally resided to A will move into B.

      ch6

Virtual nodes

Issues for the basic approach

  • It is impossible to keep the same size of partitions on the ring for all servers when a server is add or removed.

    ring drawio

  • The keys cannot be distributed evenly on the ring.

    ring drawio (1)

Concepts of virtual nodes

  • Use multiple virtual nodes to represent a server on the ring (Each server is responsible for multiple partitions).

  • Example

    • The virtual nodes VA1, VA2 and VA3 represent the server A.
    • The virtual nodes VC1, VC2 and VC3 represent the server C.

    ring drawio (2)

Benefits of virtual nodes

  • As the number of virtual nodes increases, the distribution of keys becomes more balanced.
  • The server with higher capacity can be assigned with more virtual nodes.

Benefits

  • Minimize reorganization/rebalance when servers are added or removed.