[Algorithm] ๋ถ„์‚ฐ ํ•ฉ์˜: raft, paxos, zab

๋ถ„์‚ฐ ํ•ฉ์˜(distributed consensus)๋Š” ๋ถ„์‚ฐ์‹œ์Šคํ…œ์˜ ์„ฑ์งˆ๊ณผ ํ•œ๊ณ„๋ฅผ ๊ฒฐ์ •์ง“๋Š” ์ฃผ์š” ์š”์†Œ ์ค‘ ํ•˜๋‚˜๋‹ค.
๋ถ„์‚ฐ ํ•ฉ์˜๋ž€, ๋…ธ๋“œ๊ฐ€ ์—ฌ๋Ÿฌ๊ฐœ์ธ ํด๋Ÿฌ์Šคํ„ฐ ํ™˜๊ฒฝ์—์„œ ์—ฌ๋Ÿฌ๊ฐœ์˜ ๋…ธ๋“œ๊ฐ€ ๊ฐ™์€ ๋ฐ์ดํ„ฐ๋ฅผ ๋ณด์œ ํ• ๋•Œ ๋…ธ๋“œ๋ผ๋ฆฌ ๊ฐ™์€ ๋ฐ์ดํ„ฐ๋ฅผ ๋™๊ธฐํ™”ํ•˜๋Š” ๊ฒƒ์„ ๋งํ•œ๋‹ค.

๋ณ„๊ฑฐ ์•„๋‹ ๊ฒƒ ๊ฐ™์ง€๋งŒ, ์‚ฌ์‹ค ๊ธฐ๊ธฐ ์ˆ˜์ค€์˜ ์žฅ์• ๋‚˜ ๋„คํŠธ์›Œํฌ ๋”œ๋ ˆ์ด ๊ฐ™์€ ์—ฌ๋Ÿฌ ๋ณ€์ˆ˜๋“ค ๋•Œ๋ฌธ์— ์ด์— ๋Œ€ํ•œ ๊ณ ๋ ค ์—†์ด๋Š” ์‹ ๋ขฐ์„ฑ ์žˆ๋Š” ์‹œ์Šคํ…œ์„ ๋งŒ๋“ค ์ˆ˜ ์—†๋‹ค.

์—ฌ๊ธฐ์„œ๋Š” ๋ถ„์‚ฐ ํ•ฉ์˜์— ์‚ฌ์šฉ๋˜๋Š” ์ฃผ์š” ์•Œ๊ณ ๋ฆฌ์ฆ˜ ๋ช‡๊ฐ€์ง€๋ฅผ ์ •๋ฆฌํ•ด๋ณธ๋‹ค.




raft

๋—๋ชฉ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด๋ผ๊ณ ๋„ ๋ถˆ๋ฆฌ๋Š” ์ฃผ์š” ์•Œ๊ณ ๋ฆฌ์ฆ˜ ์ค‘ ํ•˜๋‚˜๋‹ค.
2013๋…„์— Diego Ongaro์™€ John Ousterhout๊ฐ€ paxos์˜ ๋‹จ์ˆœํ•œ ๋Œ€์•ˆ์œผ๋กœ์„œ ์ œ์‹œํ–ˆ๋‹ค.

์–ด์›์€ ์‹ ๋ขฐ์„ฑ(Reliable), ๋ณต์ œ(Replicated), ์ค‘๋ณต(Redundant), ๋‚ด๊ฒฐํ•จ์„ฑ(Fault-Tolerant)์˜ ๊ธ€์ž๋ฅผ ์ ๋‹นํžˆ ๋”ฐ์„œ "Re - And Fault-Tolerant" => RAFT๊ฐ€ ๋œ ๊ฒƒ์ด๋‹ค.

๋ฌธ์ œ๋ฅผ ์ตœ๋Œ€ํ•œ ๋‹จ์ˆœํ™”ํ•ด์„œ ํ•ด๊ฒฐํ•˜๋Š” ๊ธฐ๋ฒ•์ด๋‹ค.

  1. ๋…ธ๋“œ๋ผ๋ฆฌ ํˆฌํ‘œ๋ฅผ ํ†ตํ•ด leader๋ฅผ ์„ ์ถœํ•œ๋‹ค.
  2. ๊ทธ๋Ÿฌ๋ฉด ๋™๊ธฐํ™”์— ๋Œ€ํ•œ ๋ชจ๋“  ์ฑ…์ž„์„ leader๊ฐ€ ๊ฐ–๊ณ ์„œ ๋ชจ๋“  ์ค‘๋Œ€์‚ฌ๋ฅผ ์ฒ˜๋ฆฌํ•œ๋‹ค.

leader๋Š” ๋ฐ์ดํ„ฐ ๋™๊ธฐํ™”๋‚˜ ๋กœ๊ทธ ๋ณต์ œ๋ฅผ ์ด๊ด„ํ•œ๋‹ค.
๋งŒ์•ฝ leader๊ฐ€ ์‹คํŒจํ•˜๊ฑฐ๋‚˜ ์žฅ์• ๊ฐ€ ๋ฐœ์ƒํ•œ๋‹ค๋ฉด ๋‚˜๋จธ์ง€ ๋…ธ๋“œ๋“ค๋ผ๋ฆฌ ์ƒˆ leader๋ฅผ ์„ ์ถœํ•ด์„œ ์ฑ…์ž„์„ ์œ„์ž„ํ•œ๋‹ค.

์ด ๋งค์ปค๋‹ˆ์ฆ˜์˜ ๊ฐ€์žฅ ํฐ ์žฅ์ ์€ ๋‹จ์ˆœ์„ฑ๊ณผ ๊ตฌํ˜„์˜ ์šฉ์ด์„ฑ์ด๋‹ค.

cockroachDB, elasticsearch, redis, etcd, rabbitMQ, clickhouse, nats ๋“ฑ ํด๋Ÿฌ์Šคํ„ฐ๋ง์„ ์ง€์›ํ•˜๋Š” ๋งŽ์€ ์ƒ์šฉ DB ์‹œ์Šคํ…œ๋“ค์—์„œ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ๋‹ค.

mongoDB ๊ฐ™์€ ๊ฒฝ์šฐ๋„ raft์˜ ๋ณ€ํ˜•์„ ์‚ฌ์šฉํ•œ๋‹ค.
paxos๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ์ƒ๋‹น์ˆ˜์˜ ์‹œ์Šคํ…œ๋“ค๋„ raft๋กœ ์˜ฎ๊ฒจ๊ฐ€๋ ค๋Š” ์›€์ง์ž„์„ ๋ณด์ด๊ณค ํ•œ๋‹ค.

raft๊ฐ€ ์ถ”๊ตฌํ•˜๋Š” ๋ฐฉํ–ฅ์„ฑ์€ ๊ฐ€์šฉ์„ฑ์ด๋‹ค.
๋น ๋ฅธ ํšŒ๋ณต๊ณผ ๋ณต๊ตฌ์— ์ดˆ์ ์„ ๋งž์ท„๊ธฐ ๋•Œ๋ฌธ์—, ์žฅ์•  ์‹œ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋Š” ์•ฝ๊ฐ„์˜ ๊ฒฐํ•จ, ๋ฐ์ดํ„ฐ ๋ถˆ์ผ์น˜ ๋“ฑ์ด ๋‹จ์ ์œผ๋กœ ๊ผฝํžˆ๊ณค ํ•œ๋‹ค.




paxos

paxos๋Š” raft๊ฐ€ ๋“ฑ์žฅํ•˜๊ธฐ ์ด์ „์— ์ฃผ๋กœ ์‚ฌ์šฉ๋˜์—ˆ๋˜ ํ•ฉ์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด๋‹ค. 1989๋…„ ์ฆˆ์Œ์— ์ œ์•ˆ๋˜์—ˆ๋‹ค.

raft๊ฐ€ ๊ฐ€์šฉ์„ฑ์„ ์–ป๊ณ  ์•ฝ๊ฐ„์˜ ์ •ํ™•์„ฑ์„ ํฌ๊ธฐํ•œ ๊ฒƒ๊ณผ ๋ฐ˜๋Œ€๋กœ, paxos๋Š” ๊ฐ€์šฉ์„ฑ์„ ์ข€ ํฌ๊ธฐํ•œ ๋Œ€์‹  ์ •ํ™•์„ฑ์„ ๊ฐ€์ ธ๊ฐ€๋Š” ๊ฒƒ์„ ์ง€ํ–ฅํ•œ๋‹ค.

๊ตฌํ˜„์ด ๋ณต์žกํ•œ ํŽธ์ด๊ณ , ๋…ธ๋“œ์˜ ์‹ค์‹œ๊ฐ„ ์ถ”๊ฐ€/์ˆ˜์ • ๋“ฑ์— ์–ด๋ ค์›€๋„ ์žˆ๊ณ  ํ•ด์„œ ์ƒˆ๋กœ ๋งŒ๋“ค์–ด์ง€๋Š” ์‹œ์Šคํ…œ๋“ค์—๋Š” ์ž˜ ์‚ฌ์šฉ๋˜์ง€ ์•Š๋Š”๋‹ค.

์ •์˜ ์ž์ฒด๊ฐ€ ์กฐ๊ธˆ ์ถ”์ƒ์ ์ธ ํŽธ์ด๋‹ค.
Raft์™€ ๋น„์Šทํ•˜๊ฒŒ leader์˜ ๊ฐœ๋…์„ ์‚ฌ์šฉํ•˜๊ธด ํ•œ๋‹ค.

paxos๋Š” 3๊ฐ€์ง€ ์ฃผ์š” ์—ญํ• ์„ ํ†ตํ•ด ๋™๊ธฐํ™”์—์„œ์˜ ํ•ฉ์˜๋ฅผ ๊ตฌํ˜„ํ•œ๋‹ค.

  1. Proposer์€ ํ•ฉ์˜๋ฅผ ์ œ์•ˆํ•œ๋‹ค.
    ํ•ฉ์˜๋ฅผ ์ œ์•ˆํ• ๋•Œ ์ž„์˜์˜ ํŠน์ • ์ˆซ์ž๊ฐ’ N, ์ •์กฑ์ˆ˜์— ์ฐธ์—ฌํ•  Proposer ๋ชฉ๋ก์„ ํ•จ๊ป˜ ๋„˜๊ฒจ์ค€๋‹ค. ์ˆ˜์‹  ๋Œ€์ƒ์€ Acceptor๊ฐ€ ๋œ๋‹ค. ์ด๋ฅผ prepare ๋‹จ๊ณ„๋ผ๊ณ  ํ•œ๋‹ค.
    N์€ ์ด์ „ ์ˆซ์ž๊ฐ’๋ณด๋‹ค ํ•ญ์ƒ ํฐ ๊ฐ’์„ ๋ณด๋‚ด๋„๋ก ํ•œ๋‹ค.

  2. Acceptor๋Š” Proposer์—๊ฒŒ์„œ ํ•ฉ์˜ ์ œ์•ˆ์„ ๋ฐ›๊ณ  ์ฒ˜๋ฆฌํ•œ๋‹ค.

  1. Proposer๊ฐ€ Acceptor๋กœ๋ถ€ํ„ฐ promise๋ฅผ ๋ฐ›์œผ๋ฉด, v๋ผ๋Š” ์ž„์˜์˜ ๊ฐ’์„ ์„ค์ •ํ•ด์„œ ๋‹ค์‹œ Acceptor์—๊ฒŒ Accept ์š”์ฒญ์„ ๋ณด๋‚ธ๋‹ค. ๊ทธ๋ฆฌ๊ณ  Acceptor๋กœ๋ถ€ํ„ฐ Accept๊ฐ€ ๋–จ์–ด์ง€๋ฉด ํ•ฉ์˜์— ์„ฑ๊ณตํ•œ ๊ฒƒ์ด๋‹ค. ์ •์กฑ์ˆ˜๊ฐ€ ๋ณดํ†ต์€ ๊ทธ ์„ฑ๊ณต์˜ ๊ธฐ์ค€์ด ๋  ๊ฒƒ์ด๋‹ค.

  2. Learner๋Š” ํ•ฉ์˜ ๊ฒฐ๊ณผ๋ฅผ ๋ชจ๋‘์—๊ฒŒ ๋ฐœํ‘œํ•˜๋Š” ์—ญํ• ์„ ๋งก๋Š”๋‹ค.

paxos ์ž์ฒด์—๋Š” leader์˜ ๊ฐœ๋…์ด ๋ช…์‹œ๋˜์–ด์žˆ์ง€๋Š” ์•Š์ง€๋งŒ, proposer๊ฐ€ ์ž์‹ ์ด leader๊ฐ€ ๋˜๊ณ  ์‹ถ๋‹ค๊ณ  ์ฃผ์žฅํ•˜๊ณ  acceptor๋“ค์ด ์Šน์ธํ•œ๋‹ค๋ฉด leader๊ฐ€ ๋  ์ˆ˜ ์žˆ๋‹ค.

GCP Spanner, DynamoDB, ScyllaDB, CassandraDB ๋“ฑ์˜ DB ์‹œ์Šคํ…œ๋“ค์ด ์ด ๋งค์ปค๋‹ˆ์ฆ˜์„ ์‚ฌ์šฉํ•œ๋‹ค.
์—ฌ๊ธฐ์„œ ScyllaDB๋Š” ๊ฒฝ๋Ÿ‰ ํŠธ๋žœ์žญ์…˜์—๋งŒ ์‚ฌ์šฉํ–ˆ๋Š”๋ฐ, ์ด๋งˆ์ €๋„ ๊ฑท์–ด๋‚ด๊ณ  ์žˆ๋Š” ์ค‘์ด๋ผ๊ณ  ํ•œ๋‹ค. (RAFT๋กœ ์ด์ „)

AWS ECS ๊ฐ™์€ ํด๋Ÿฌ์Šคํ„ฐ ์„œ๋น„์Šค๋„ paxos ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์‚ฌ์šฉํ•˜๊ณค ํ•œ๋‹ค.




zab

zab์€ Zookeeper Atomic Broadcast์˜ ์ค€๋ง๋กœ, ์ฃผํ‚คํผ์—์„œ ์‚ฌ์šฉํ•˜๋Š” ๋ถ„์‚ฐ ํ•ฉ์˜ ๋งค์ปค๋‹ˆ์ฆ˜์ด๋‹ค.
์ด ๋งค์ปค๋‹ˆ์ฆ˜์€ ์ˆœ์„œ์˜ ๋ณด์žฅ์— ์ตœ์ ํ™”๋˜์–ด์žˆ๋‹ค.

"๋ชจ๋“  ๋…ธ๋“œ์— ๋Œ€ํ•ด์„œ" ์ˆœ์„œ๊ฐ€ ๋ช…ํ™•ํ•˜๊ฒŒ ๋ณด์žฅํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค.

zab๋„ ์—ฌํƒ€ ๋ถ„์‚ฐ ๊ตฌ์กฐ์™€ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ๋‹จ์ผ leader์— ์—ฌ๋Ÿฌ๊ฐœ์˜ follower๊ฐ€ ๋”ฐ๋ผ์˜ค๋Š” ์ค‘์•™์ง‘๊ถŒํ˜• ๊ตฌ์กฐ๋ฅผ ์ง€ํ–ฅํ•œ๋‹ค.
raft์™€ ๋น„์Šทํ•˜๊ฒŒ ํˆฌํ‘œ๋ฅผ ํ†ตํ•ด์„œ leader๋ฅผ ์„ ์ถœํ•œ๋‹ค.

write๋Š” ๋ฌด์กฐ๊ฑด leader๊ฐ€ ์ฑ…์ž„์ง€๊ณ  ๋ฐ›์•„์„œ follow๋“ค์—๊ฒŒ ์ง์ ‘ ๋™๊ธฐํ™”ํ•œ๋‹ค. ์—ฌ๊ธฐ์„œ๋Š” ์•ˆ์ „ํ•œ ๋ณต์ œ๋ฅผ ์œ„ํ•ด 2๋‹จ๊ณ„ ์ปค๋ฐ‹์„ ์‚ฌ์šฉํ•œ๋‹ค.
read์˜ ๊ฒฝ์šฐ์—๋Š” ๋ชจ๋“  ๋…ธ๋“œ๊ฐ€ ๊ณตํ‰ํ•˜๊ฒŒ ๋กœ๋“œ๋ฐธ๋Ÿฐ์‹ฑ์„ ํ•œ๋‹ค.

๊ทธ๋ฆฌ๊ณ  zab์€ ํ•ญ์ƒ ์˜ˆ๋น„ leader๋ฅผ ๋ฏธ๋ฆฌ ์„ ์ถœํ•ด๋’€๋‹ค๊ฐ€, ๊ธฐ์กด leader์—๊ฒŒ ๋ฌธ์ œ๊ฐ€ ์ƒ๊ธฐ๋ฉด ๋ฐ”๋กœ leader๋ฅผ ๊ต์ฒดํ•˜๊ณ  ๊ธฐ์กด leader๋ฅผ ๋ฒ„๋ฆฌ๋Š” ๋ฐฉ์‹์„ ์ทจํ•œ๋‹ค. ์˜ˆ๋น„ leader์—๊ฒŒ๋Š” ์ตœ์†Œํ•œ์˜ leader ์ž‘์—… ํžˆ์Šคํ† ๋ฆฌ๊ฐ€ ๋ณด๊ด€๋˜์–ด์žˆ๊ธฐ ๋•Œ๋ฌธ์— ์†์‹ค์„ ์ค„์ผ ์ˆ˜ ์žˆ๋‹ค.




์ฐธ์กฐ
https://medium.com/@adityashete009/the-zab-algorithm-502781c54498
https://medium.com/@remisharoon/paxos-vs-raft-vs-zab-a-comprehensive-dive-into-distributed-consensus-protocols-6243a3f6539b
https://en.wikipedia.org/wiki/Paxos_(computer_science)
https://www.scylladb.com/glossary/paxos-consensus-algorithm/
https://medium.com/@mani.saksham12/raft-and-paxos-consensus-algorithms-for-distributed-systems-138cd7c2d35a
https://gruuuuu.github.io/integration/paxos-raft/
https://stackoverflow.com/questions/45871855/paxos-vs-raft-for-leader-election
https://www.linkedin.com/pulse/zab-consensus-algorithm-yeshwanth-n