Khazana: Flexible Wide-area Consistency Management
--------------------------------------------------
SOSP'01 Poster Submission: Speaker Notes

Sai Susarla    John Carter
University of Utah

We are developing Khazana, a wide-area data store that can provide
efficient, scalable, consistent shared data access to a variety of
wide-area applications. Khazana exports a simple file-like shared data
abstraction managed using a common set of mechanisms in a customizable
manner. An application writer simply specifies the service requirements
of their data (e.g., consistency guarantees and availability
requirements), and Khazana is responsible for replicating, caching, and
providing access to this data over the wide area.

The need for flexible consistency management:
---------------------------------------------
Caching improves data availability and access latency in many wide-area
services. However, implementing caching in a new service requires
significant effort, because existing caching mechanisms are difficult to
reuse when the consistency needs of the new service do not match those
of some existing service.

This lack of reusability is an artifact of the monolithic design of
consistency management in existing systems. Most services employ
different combinations of a common set of consistency mechanisms.
However, each implementation inseparably bundles the mechanisms that
provide the needed service requirements. Although an integrated solution
provides good performance for the service for which it was written, it
hinders reuse.

Our Approach:
------------
Consider the following issues that need to be addressed and the
mechanisms commonly employed to address them:

- When is access allowed to a replica?
    Optimistic vs. pessimistic access control
- Where can updates be issued?
    Single-master vs. multi-master
- In what direction are updates transferred?
    Pull- vs. push-based updates
- What conditions can trigger update transfer?
    Timer-driven, staleness-driven, eager or lazy transfer

These issues are largely orthogonal to each other, and many combinations
of choices are possible. By giving an application control over these
choices, we can better meet its consistency needs.

We are developing a wide-area data sharing framework that provides
coherent wide-area file access at page granularity. Khazana is unique in
that, in addition to supporting multiple consistency policies on a
per-file basis, it gives applications a choice of the mechanisms used to
implement these policies.

A typical distributed service can make use of Khazana by storing its
persistent shared data in the Khazana data store and setting the
consistency-related attributes for this data on a per-file basis.
Distributed components of the service then access shared data by
contacting a nearby data server. Data servers cooperate to locate data
and cache it close to where it is accessed.

Features of Khazana:
--------------------
- A file system-like interface
- Peer-to-peer data servers cooperating across a WAN
- Consistency options: optimistic last-writer-wins, append-only, and
  strict pessimistic consistency
- Per-file scalable dynamic cache hierarchy
- Ability to specify per-replica tolerance to staleness
- Asynchronous update notification

Target Wide-area Services:
--------------------------
- DataStations: distributed file store that supports multiple file
  sharing patterns
- Chat room: concurrent appends to a transcript file
- Scoreboard: broadcasts game status to registered listeners
- Directory service: hash table implemented in a shared file

Status:
------
A prototype WAN implementation of Khazana supports the last-writer-wins
policy and choices of push- vs. pull-based and single-master vs.
multi-master updates.

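As a rough illustration of the four design choices listed under Our
Approach, the Python sketch below models them as independent per-file
attributes. It is only a sketch under stated assumptions: every name in
it (ConsistencyAttributes, AttributeTable, the example file paths) is
hypothetical and is not Khazana's actual interface, and the attribute
values assigned to the example files are arbitrary picks, not choices
the services are known to make.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class AccessControl(Enum):
    """When is access allowed to a replica?"""
    OPTIMISTIC = "optimistic"
    PESSIMISTIC = "pessimistic"


class UpdateOrigin(Enum):
    """Where can updates be issued?"""
    SINGLE_MASTER = "single-master"
    MULTI_MASTER = "multi-master"


class TransferDirection(Enum):
    """In what direction are updates transferred?"""
    PULL = "pull"
    PUSH = "push"


class TransferTrigger(Enum):
    """What condition triggers an update transfer?"""
    TIMER = "timer-driven"
    STALENESS = "staleness-driven"
    EAGER = "eager"
    LAZY = "lazy"


@dataclass(frozen=True)
class ConsistencyAttributes:
    """One choice along each axis (hypothetical structure)."""
    access: AccessControl
    origin: UpdateOrigin
    direction: TransferDirection
    trigger: TransferTrigger


class AttributeTable:
    """Hypothetical per-file attribute store, not Khazana's interface."""

    def __init__(self, default):
        self._default = default
        self._by_file = {}

    def set_attributes(self, path, attrs):
        self._by_file[path] = attrs

    def attributes_for(self, path):
        # Files without an explicit setting fall back to the default.
        return self._by_file.get(path, self._default)


def all_combinations():
    """Enumerate every combination of the four independent choices."""
    axes = (AccessControl, UpdateOrigin, TransferDirection, TransferTrigger)
    return [ConsistencyAttributes(*choice) for choice in product(*axes)]


if __name__ == "__main__":
    default = ConsistencyAttributes(
        AccessControl.OPTIMISTIC, UpdateOrigin.MULTI_MASTER,
        TransferDirection.PULL, TransferTrigger.LAZY)
    table = AttributeTable(default)
    # Arbitrary illustrative setting for a made-up file path.
    table.set_attributes("/scoreboard/status", ConsistencyAttributes(
        AccessControl.OPTIMISTIC, UpdateOrigin.SINGLE_MASTER,
        TransferDirection.PUSH, TransferTrigger.EAGER))
    print(table.attributes_for("/scoreboard/status"))
    print(len(all_combinations()))  # 2 * 2 * 2 * 4 = 32 combinations
```

Because the sketch treats the axes as fully independent, it enumerates
32 combinations; the notes only describe the issues as largely
orthogonal, so a real system may rule some combinations out.
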

Related Work:
------------
- Support for multiple consistency policies: Munin, WebFS, Fluid
  Replication, TACT
- Wide-area issues in consistency management: Coda, Ficus, Bayou
- Applications with specific consistency policies: Active Directory,
  Thor, Globe, Legion
- Peer-to-peer data sharing systems: Napster, PAST, Farsite