01 Oct 2017
Zookeeper
ZooKeeper: Wait-free coordination for Internet-scale systems (published at USENIX ATC 2010)
Random Notes
- Prof. Marco Serafini co-authored the ZooKeeper Atomic Broadcast (Zab) protocol, which is used to maintain consensus in ZooKeeper.
Main Contribution
- Provides an abstraction on which distributed primitives such as leader election and a distributed lock service can be built.
- Uses active sessions for co-ordination.
- Example of where this abstraction is used: Katta (group membership, leader election, configuration management).
- Provides recipes that use the stated abstraction to build more complex co-ordination mechanisms.
- Fuzzy snapshots with idempotent transactions.
Key Ideas of the paper
Abstraction
- ZooKeeper stores data in the form of a tree of data nodes (znodes). Each znode is accessed via its path.
- Clients use zookeeper through a clientside library.
- Clients establish a session through which requests are sent.
- Sessions are maintained by sending heartbeats.
- Znodes can be created with a sequential flag, which appends a monotonically increasing counter to the name. The counter is higher than that of any child previously created under the same parent.
- ZooKeeper implements watches, through which clients receive update notifications for the znodes being watched. Watches are one-time triggers and are deregistered once triggered or when the session is closed.
- Guarantees provided: Linearizable writes.
- Provides a sync operation, which brings the client's local server up to date before a read.
- API: create(path, data, flags), delete(path, version), exists(path, watch), getChildren(path, watch)
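To make the abstraction above concrete for myself, here is a minimal in-memory sketch of the znode tree in Python. It is not the real ZooKeeper client: the class ToyZooKeeper, its exceptions and the event names are all made up for illustration, everything runs in one process, and the flags argument is reduced to a single sequential switch. The 10-digit counter suffix and the "version -1 matches anything" rule are meant to mimic ZooKeeper's behaviour.

```python
class NoNodeError(Exception):
    pass


class NodeExistsError(Exception):
    pass


class BadVersionError(Exception):
    pass


class ToyZooKeeper:
    """Hypothetical single-process stand-in for the znode tree."""

    def __init__(self):
        self.nodes = {"/": [b"", 0]}   # path -> [data, version]
        self.next_seq = {}             # parent path -> next sequential counter
        self.exist_watches = {}        # path -> one-shot callbacks
        self.child_watches = {}        # path -> one-shot callbacks

    @staticmethod
    def _parent(path):
        return path.rsplit("/", 1)[0] or "/"

    @staticmethod
    def _fire(table, path, event):
        # Watches are one-shot: remove them before calling back.
        for callback in table.pop(path, []):
            callback(event, path)

    def create(self, path, data, sequential=False):
        parent = self._parent(path)
        if parent not in self.nodes:
            raise NoNodeError(parent)
        if sequential:
            # Counter only grows, so it exceeds every earlier sibling's.
            seq = self.next_seq.get(parent, 0)
            self.next_seq[parent] = seq + 1
            path = f"{path}{seq:010d}"
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = [data, 0]
        self._fire(self.exist_watches, path, "created")
        self._fire(self.child_watches, parent, "children_changed")
        return path

    def delete(self, path, version):
        # Simplification: no check that the znode has no children.
        if path not in self.nodes:
            raise NoNodeError(path)
        if version != -1 and self.nodes[path][1] != version:
            raise BadVersionError(path)
        del self.nodes[path]
        self._fire(self.exist_watches, path, "deleted")
        self._fire(self.child_watches, self._parent(path), "children_changed")

    def exists(self, path, watch=None):
        if watch is not None:
            self.exist_watches.setdefault(path, []).append(watch)
        return path in self.nodes

    def get_children(self, path, watch=None):
        if path not in self.nodes:
            raise NoNodeError(path)
        if watch is not None:
            self.child_watches.setdefault(path, []).append(watch)
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):] for p in self.nodes
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


zk = ToyZooKeeper()
events = []
zk.create("/app", b"")
zk.get_children("/app", watch=lambda ev, p: events.append((ev, p)))
first = zk.create("/app/task-", b"a", sequential=True)
second = zk.create("/app/task-", b"b", sequential=True)
print(first, second)   # /app/task-0000000000 /app/task-0000000001
print(events)          # only one event: the watch fired once and was removed
```

The second create does not produce a notification because the watch was consumed by the first one. A client that wants to keep watching has to register the watch again.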
Examples
- Configuration Management: Processes watch a znode for configuration changes. They are notified when a leader makes these changes.
- Locks: A lock is acquired by creating an ephemeral znode and released when the znode is deleted or the holder's session is terminated. Other blocked clients are unblocked by a notification from a watch on this znode.
- Other examples are barriers and rendezvous.
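The lock example above can be sketched as a small single-process simulation. This is only my reading of the simple lock recipe, not ZooKeeper code: ToyLockService, acquire and the session ids "A", "B", "C" are hypothetical names, and callbacks stand in for blocked clients.

```python
class ToyLockService:
    """Hypothetical in-memory stand-in: ephemeral lock znodes plus one-shot watches."""

    def __init__(self):
        self.owner = {}      # lock path -> session id that created the znode
        self.watchers = {}   # lock path -> callbacks waiting for its deletion

    def try_create_ephemeral(self, path, session):
        if path in self.owner:
            return False     # znode already exists: someone holds the lock
        self.owner[path] = session
        return True

    def watch_delete(self, path, callback):
        self.watchers.setdefault(path, []).append(callback)

    def delete(self, path):
        self.owner.pop(path, None)
        # One-shot watches: take the list out before notifying anyone.
        for callback in self.watchers.pop(path, []):
            callback()

    def close_session(self, session):
        # Ephemeral znodes disappear together with the session that made them.
        for path in [p for p, s in self.owner.items() if s == session]:
            self.delete(path)


def acquire(service, path, session, on_acquired):
    """Try to create the lock znode; on failure, wait for deletion and retry."""
    if service.try_create_ephemeral(path, session):
        on_acquired(session)
    else:
        service.watch_delete(
            path, lambda: acquire(service, path, session, on_acquired)
        )


svc = ToyLockService()
holders = []
acquire(svc, "/lock", "A", holders.append)   # A gets the lock
acquire(svc, "/lock", "B", holders.append)   # B waits on a watch
acquire(svc, "/lock", "C", holders.append)   # C waits on a watch
svc.close_session("A")                       # A's session ends, B takes over
svc.delete("/lock")                          # B releases, C takes over
print(holders)                               # ['A', 'B', 'C']
```

Note that every waiter is woken up on each release even though only one of them can win. The paper calls this the herd effect and describes a variant based on sequential znodes that avoids it.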
Implementation
- Upon receiving a write request, an atomic broadcast protocol (something like Paxos) is used to replicate the change across servers.
- Updates are logged to a replay log.
- Each client is connected to a server. Read requests are handled locally, while write requests are forwarded to the leader.
- Transactions are idempotent: the leader computes the state that results from each write and sends out a transaction that records this new state.
- ZooKeeper takes periodic snapshots. State changes can take place while the snapshot is happening. This does not matter, because the transactions are idempotent and can be replayed on top of the snapshot.
- A client has to periodically send the server something (a request or a heartbeat) to maintain its session.
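Why fuzzy snapshots are safe is easier to see with a tiny worked example. The sketch below is only an illustration under my own assumptions: the state is a plain dict, a transaction is a (path, data, version) tuple that records the resulting value, and the interleaving of the snapshot with the writes is hard-coded. The helper names apply_txn and recover are hypothetical.

```python
def apply_txn(state, txn):
    """Idempotent by construction: the txn stores the new value, not a delta."""
    path, data, version = txn
    state[path] = (data, version)


def recover(snapshot, log):
    """Rebuild state by replaying the whole log on top of a (fuzzy) snapshot."""
    state = dict(snapshot)
    for txn in log:
        apply_txn(state, txn)
    return state


state = {"/foo": ("f1", 1), "/goo": ("g1", 1)}
log = [("/foo", "f2", 2), ("/goo", "g2", 2), ("/foo", "f3", 3)]

# Take a snapshot without locking: /goo is copied before the writes,
# /foo is copied after them.
snapshot = {}
snapshot["/goo"] = state["/goo"]
for txn in log:
    apply_txn(state, txn)
snapshot["/foo"] = state["/foo"]

# The snapshot mixes old and new values, a state that never existed.
print(snapshot)   # {'/goo': ('g1', 1), '/foo': ('f3', 3)}

# Replaying the log fixes it, even though /foo's last write is applied twice.
recovered = recover(snapshot, log)
print(recovered == state)   # True
```

If the transactions were deltas (for example "increment the version"), replaying them on top of a snapshot that already contains some of them would apply those changes twice. Recording the resulting state is what makes the replay harmless.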
Things I did not understand
- Are watches implemented locally? Does the particular server a client is connected to have to see the commit before the connected client is notified?
- How many active sessions can be supported?
- When a ZooKeeper client realizes that the server it is connected to has failed and connects to another server, is this treated as a new session?
Til next time,
Sandeep Polisetty