Hashing
- Turning anything into an integer value:
<!-- -->
Clients Load Balancer Servers
[ ] 11 % 4 = 3 -->[ ]-----------> []
[ ] 12 % 4 = 0 -->[ ]-----------> []
[ ] 13 % 4 = 1 -->[ ]-----------> []
[ ] 14 % 4 = 2 -->[ ]-----------> []
If you don’t send requests to the same server, we get cache misses.
-
We want a hash function which uniformly/evenly distributes traffic to our servers.
-
This maximizes cache hits.
-
What happens if we need to add a server?
<!-- -->
[ ] 11 % 5 = 1 -->[ ]-----------> []
[ ] 12 % 5 = 2 -->[ ]-----------> []
[ ] 13 % 5 = 3 -->[ ]-----------> []
[ ] 14 % 5 = 4 -->[ ]-----------> []
[ ] 15 % 5 = 0 -->[ ]-----------> []
-
Now we get a bunch of cache misses, which will reduce performance.
-
We can fix this with consistent caching:
-
This is best imagined as a circle:
-
the Clients have numbers, and the Servers are just uppercase letters.
-
Each client is assigned to the next server it can find clockwise.
<!-- -->
D C1
, - ~ ~ ~ - ,
, ' ' , C4
, , A
, ,
, ,
, , C3
, ,
, ,
, ,
, , '
C ' - , _ _ _ , ' B
C2
- Let’s add a new server (E):
- If you ever add or remove a server, we only lose (log n) of our cache misses.
<!-- -->
D C1 E
, - ~ ~ ~ - ,
, ' ' , C4
, , A
, ,
, ,
, , C3
, ,
, ,
, ,
, , '
C ' - , _ _ _ , ' B
C2
- You can also hash multiple times, which helps with uneven hashing functions.
Rendezvous Hashing
-
We can use Rendezvous hashing in order to fix this problem as well.
-
In Rendezvous hashing, each client lists a priority of servers it would like to get assigned to.
-
On a Server insertion or Deletion, then the server picks its second priority pick.
This is a generalized for of Consistent Hashing, which can deal with the range of N = 1 to Infinity.
Prev: [load-balancers](load-balancers.md) Next: [relational-databases](relational-databases.md)