Parallel Database Systems: The Future of High Performance Database Systems
David DeWitt, Jim Gray, 1992
Summary.
Parallel databases have taken hold for several reasons:
- the relational model, widely adopted over the last decade, yields
naturally parallel queries on existing SQL applications (no rewrite necessary);
- the dataflow approach to parallel query execution requires an architecture
that, though once ``exotic'', is commonplace now: a message-based
client/server OS running on parallel processors;
- finally, the hardware is cheap: inexpensive processors grouped together
deliver better $/MB and $/MIPS.
A general consensus on architecture has formed: a shared-nothing design in
which each processor has its own disk and memory. Data can be partitioned
from one input stream, processed in parallel, and then merged back into one
output stream.
Though there is still plenty of research to do, benchmark results suggest
that parallel databases are the wave of the future.
Parallelism can be achieved in two ways: pipelined, passing each tuple to
the next stage as it is produced, and partitioned, dividing the data into
partitions, processing them independently, and then combining the output
streams. Most query plans offer little opportunity for pipelined
parallelism: relational pipelines are rarely long, some operators (e.g.
sort, aggregate) emit no output until they have consumed all their input,
and the slowest stage limits the rest.
Ideal parallelism is measured in terms of linear speedup and linear
scaleup. Speedup is the reduction in elapsed time when a fixed-size job
runs on a larger system. Scaleup is the increase in the amount of work
that can be executed in a fixed time T as the system grows. Scaleup can be
batch (the size of one large job that can be done in time T increases) or
transactional (the number of small transactions that can be done in time T
increases).
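A minimal sketch of the two metrics (the helper names and the example
numbers below are illustrative, not from the paper):

```python
def speedup(small_sys_time, big_sys_time):
    """Elapsed time on the small system divided by elapsed time on the
    big system, for the SAME job. Linear speedup: N processors -> N."""
    return small_sys_time / big_sys_time

def scaleup(small_sys_small_job_time, big_sys_big_job_time):
    """Ratio of elapsed times when an N-times bigger job runs on an
    N-times bigger system. Linear scaleup: the ratio stays at 1.0."""
    return small_sys_small_job_time / big_sys_big_job_time

# e.g. a query taking 100 s on 1 processor and 26 s on 4 processors
# gives speedup(100, 26) ~ 3.85 -- near-linear, but short of 4 because
# of the barriers listed below (startup, interference, skew).
```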
Barriers to achieving linear speedup:
- startup costs: the time to start a parallel operation can dominate a
short computation;
- interference: as the degree of parallelism grows, the communication
overhead can overcome the gains;
- skew: data skew (all the hot data somehow ends up being processed by one
processor) and execution skew (service times vary, so the job finishes
only as fast as its slowest step).
Architecture. A taxonomy for categorizing parallel architectures:
- shared-nothing: each processor has its own disk and memory; minimizes
communication/interference, so it is SCALABLE; can be built from
commodity parts, so no fancy hardware is needed; SN systems have shown
near-linear speedups.
- shared-memory: all processors share all the memory; easy to implement
but does not scale well due to interference; the interconnection network
must have high bandwidth to reduce traffic and delays; each processor
must have a large cache, and loading and flushing those caches eats into
the performance gains.
- shared-disk: all processors share all the disks; does not scale well;
fine for a read-only database, but to write, a processor must coordinate
with the other processors for write access to the disk, which makes
writes expensive.
Data Partitioning. Three methods for data partitioning (I think these
strategies can be used to partition intermediate results as well as the
``permanent'' relations).
- Partition by range: could develop hot spots, but allows range queries to
be directed only to those processors holding the relevant partitions.
- Partition round-robin: allows an even distribution, but every query must
touch every partition.
- Partition by hashing: even distribution (absent key skew); exact-match
queries go to a single processor.
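The three strategies can be sketched as follows (a minimal illustration
assuming integer-keyed tuples; the function names are mine, not the
paper's):

```python
def range_partition(tuples, boundaries, key=lambda t: t):
    """boundaries like [100, 200] -> 3 partitions: <100, 100-199, >=200.
    Supports directed range queries but can develop hot spots."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        # count how many boundaries the key has passed
        parts[sum(key(t) >= b for b in boundaries)].append(t)
    return parts

def round_robin_partition(tuples, n):
    """Even distribution, but any query must scan all n partitions."""
    parts = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        parts[i % n].append(t)
    return parts

def hash_partition(tuples, n, key=lambda t: t):
    """Even for skew-free keys; an exact-match query hashes its key to
    find the single relevant partition."""
    parts = [[] for _ in range(n)]
    for t in tuples:
        parts[hash(key(t)) % n].append(t)
    return parts
```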
Partitioning can lead to data and execution skew. One strategy to avoid
skew is to look at the temperature of a tuple and attempt to make the
aggregate heat of all partitions equal.
Partitioning leads to new database administration issues.
Parallelism within Relational Operators.
How do we allow existing sequential operators (select, project, join) to
execute in parallel? Since each operator expects one input stream and
produces one output stream, use split tables to route an operator's
output into multiple streams, and merge the final outputs back into one.
Each processor has input and output ports to and from which the
split/merged data is directed.
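The split/merge idea above can be sketched as follows (an illustration
with hypothetical names; a real system routes tuples between processes
over ports, not Python lists, and the sequential operator is reused
unmodified):

```python
def split(stream, n, route):
    """Split one stream into n streams via a routing function
    (e.g. hash on the join attribute), per the split-table idea."""
    outputs = [[] for _ in range(n)]
    for t in stream:
        outputs[route(t, n)].append(t)
    return outputs

def merge(streams):
    """Merge the output streams of the parallel instances into one."""
    return [t for s in streams for t in s]

def parallel_select(relation, predicate, n):
    """Run an unmodified sequential select on each of n partitions;
    here the n 'processors' are simulated sequentially."""
    streams = split(relation, n, route=lambda t, n: hash(t) % n)
    results = [[t for t in s if predicate(t)] for s in streams]
    return merge(results)
```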
Problems Still to be Solved.
- Mixing Batch (large, long) & OLTP (simple, short) queries.
- batch queries tend to acquire many locks which prevents progress
from being made on OLTP queries.
- priority scheduling: underlying system must ensure short response
times & low variance for short transactions.
- priority inversion: a low-priority client invokes a high-priority
server; temporarily raise the low-priority client's priority so it
finishes with the server quickly and stops blocking high-priority work.
- Parallel Query Optimization: No query optimizer considers all plans.
- Application programs do not currently take advantage of the fact that
they're running on a parallel machine.
- Database design (where/how many indices, how to partition) tools.
- On-line data reorganization: utilities for huge databases can
take weeks to complete. Should:
- allow online processing to continue.
- allow incremental operation such that only a portion of the data
is being worked on at a time.
- exploit parallelism.
- be recoverable such that the operation can be cancelled and old
state returned.