=head1 Transaction Management in the R* Distributed Database Management System C.Mohan, B.Lindsay, R.Obermarck, IBM, 1986 B The authors analyze three commit protocols for transaction processing: I (2P), I (PA), and I (PC). Based upon the number of log writes, asynchronous writes and messages sent, PA and PC are better protocols. Whether PA or PC is a better choice can be determined by the transaction mix of a DDBMS or decided at the start of the commit protocol. The authors also discuss deadlock detection and resolution in R*. Wait-for graphs are calculated locally, local deadlocks are detected and resolved and the resulting graphs are exchanged. =head1 More Detail I is an extension of 2P that can be used in systems where the processes are arranged in a heirarchy such that a process only communicates with its parent and children and doesn't know of any other processes. For this case, 2P's two-level coordinator/subordinates doesn't work. In H2P, a root processes are coordinators, leaf-nodes are subordinates and non-leaf & non-root nodes are coordinators for their children and subordinates of their parents. When communicating with a coordinator, subordinates must respond on behalf of all of their children. For example, if one sub. process answers NO to a PREPARE message, the coordinator must answer NO. PA and PC are optimizations on Heirarchical 2P. B The observation is that a coordinator is safe to write an ABORT record, send ABORT messages to subordinates and then forget about it. On response to an inquiry about the xact, one can assume under PA that if there is B in the log about a transaction commiting, then the xact aborted (this is the case is 2P also, i believe). Actions that can be skipped: =over 4 =item * Abort record need not be forced; if we fail, abort will be assumed. =item * No ACKS need to be sent back up to coordinators; if subordinates fail, their I will send an inquiry to the coordinator to find out the status. =item * Coordinator need not record names of subordinates; if coordinator fails before commit, will assume abort on recovery and subordinates will inquire eventually. =item * No END record need be written by coordinator; if we send an ABORT and then some subordinate fails, that subordinate will inquire later as to the status of the xact. =back B A read-only xact is one in which no process actually performs updates over the course of the xact. In this case, no log records are ever written so it doesn't matter if the xact aborts or not. A partially read-only xact is one in which some processes were read-only. A read-only process will respond to a PREPARE with a READ VOTES, releases locks and forgets the xact. B log records are ever written. READ VOTES are propogated up the tree if all child nodes of a non-leaf node respond with READ VOTES. There will not be a second phase of the protocol if only READ VOTES are collected. B PA optimizes the abort case, but this is not the common case. PA optimizes for commits by introducing an extra log record type, COLLECTING, which logs all of the subordinates safely before sending a PREPARE message. On restart, if the recovery process finds a COLLECTING record, it force-writes an ABORT record, informs subordinates, receives ACKS, and writes end record. If B is found in the log when a subordinate inquires, the transaction is presumed to have committed. I I I The following actions are eliminated: =over 4 =item * Commit records need not be forced (only ABORTS); if a xact had aborted, the log record would have been force-written. =item * ACK only aborts, not commits; if a subordinate fails, it can inquire upon recovery what the status is. =item * writes an END record only for aborts, not commits; upon coordinator recovery, can always send to the COLLECTING list of subordinates the COMMIT message; subordinates can inquire if they never got first message. =item * in read-only xact, will only write the COMMIT record in the log (in PA, no record is written ??). =back B There is a Deadlock Detector (DD) at each site. Wait-for graphs are built locally and analyzed. The outcomes could be some deadlock detection and some Potential Global Deadlock Cycles (PGDC). Local deadlocks are resolved by shooting some xact. PGDCs are lists of xacts in which each xact has a dependency on the one below it. The PGDCs are sent around in such a way that they travel only in the direction of the potential deadlock cycle. A victim is chosen locally; ties are broken by using some measure such as the youngest xact.