Parallel Database Systems: The Future of High Performance Database Systems
David DeWitt, Jim Gray, 1992
Summary.
Parallel databases have taken hold for several reasons:
- the relational model, widely adopted over the last decade, yields
naturally parallel queries on existing SQL applications (no rewrite necessary);
- the dataflow approach to parallel query execution requires an architecture
that, though once ``exotic'', is commonplace now: a message-based
client/server OS running on parallel processors;
- finally, the hardware is cheap: inexpensive processors grouped together
deliver better $/MB and $/MIPS.
A general consensus on architecture has formed: a shared-nothing design in
which each processor has its own disk and memory. Data can be partitioned
from one input stream, processed in parallel, and then merged back into one
output stream.
Though there is still plenty of research to do, benchmark results suggest
that parallel databases are the wave of the future.
Parallelism can be achieved in two ways: pipelined, passing each tuple to
the next stage as it is produced, and partitioned, dividing the data into
partitions, processing them independently, and then combining the output
streams. Most query plans offer little opportunity for pipelined
parallelism: relational pipelines are rarely long, some operators (e.g.
sort, aggregate) emit no output until they have consumed all their input,
and the slowest stage limits the rest.
Ideal parallelism is measured in terms of linear speedup and linear
scaleup. Speedup is the reduction in elapsed time when a fixed-size job
runs on a larger system. Scaleup is the increase in the amount of work
that can be executed in a fixed time T as the system grows. Scaleup can be
batch (the size of one large job that can be done in time T increases) or
transactional (the number of small transactions that can be done in time T
increases).
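A minimal sketch of the two metrics (the helper names and the example
numbers below are illustrative, not from the paper):

```python
def speedup(small_sys_time, big_sys_time):
    """Elapsed time on the small system divided by elapsed time on the
    big system, for the SAME job. Linear speedup: N processors -> N."""
    return small_sys_time / big_sys_time

def scaleup(small_sys_small_job_time, big_sys_big_job_time):
    """Ratio of elapsed times when an N-times bigger job runs on an
    N-times bigger system. Linear scaleup: the ratio stays at 1.0."""
    return small_sys_small_job_time / big_sys_big_job_time

# e.g. a query taking 100 s on 1 processor and 26 s on 4 processors
# gives speedup(100, 26) ~ 3.85 -- near-linear, but short of 4 because
# of the barriers listed below (startup, interference, skew).
```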
Barriers to achieving linear speedup:
- startup costs: the time to start a parallel operation can dominate a
short computation;
- interference: as the degree of parallelism grows, the communication
overhead can overcome the gains;
- skew: data skew (all the hot data somehow ends up being processed by one
processor) and execution skew (service times vary, so the job finishes
only as fast as its slowest step).
Architecture. A taxonomy for categorizing parallel architectures:
- shared-nothing: each processor has its own disk and memory; minimizes
communication/interference, so it is SCALABLE; can be built from
commodity parts, so no fancy hardware is needed; SN systems have shown
near-linear speedups.
- shared-memory: all processors share all the memory; easy to implement
but does not scale well due to interference; the interconnection network
must have high bandwidth to reduce traffic and delays; each processor
must have a large cache, and loading and flushing those caches eats into
the performance gains.
- shared-disk: all processors share all the disks; does not scale well;
fine for a read-only database, but to write, a processor must coordinate
with the other processors for write access to the disk, which makes
writes expensive.
Data Partitioning. Three methods for data partitioning (I think these
strategies can be used to partition intermediate results as well as the
``permanent'' relations).
- Partition by range: could develop hot spots, but allows range queries to
be directed only to those processors holding the relevant partitions.
- Partition round-robin: allows an even distribution, but every query must
touch every partition.
- Partition by hashing: even distribution (absent key skew); exact-match
queries go to a single processor.
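The three strategies can be sketched as follows (a minimal illustration
assuming integer-keyed tuples; the function names are mine, not the
paper's):

```python
def range_partition(tuples, boundaries, key=lambda t: t):
    """boundaries like [100, 200] -> 3 partitions: <100, 100-199, >=200.
    Supports directed range queries but can develop hot spots."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        # count how many boundaries the key has passed
        parts[sum(key(t) >= b for b in boundaries)].append(t)
    return parts

def round_robin_partition(tuples, n):
    """Even distribution, but any query must scan all n partitions."""
    parts = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        parts[i % n].append(t)
    return parts

def hash_partition(tuples, n, key=lambda t: t):
    """Even for skew-free keys; an exact-match query hashes its key to
    find the single relevant partition."""
    parts = [[] for _ in range(n)]
    for t in tuples:
        parts[hash(key(t)) % n].append(t)
    return parts
```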
Partitioning can lead to data and execution skew. One strategy to avoid
skew is to look at the temperature of a tuple and attempt to make the
aggregate heat of all partitions equal.
Partitioning leads to new database administration issues.
Parallelism within Relational Operators.
How do we allow existing sequential operators (select, project, join) to
execute in parallel? Since each operator expects one input stream and
produces one output stream, use split tables to route an operator's
output into multiple streams, and merge the final outputs back into one.
Each processor has input and output ports to and from which the
split/merged data is directed.
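The split/merge idea above can be sketched as follows (an illustration
with hypothetical names; a real system routes tuples between processes
over ports, not Python lists, and the sequential operator is reused
unmodified):

```python
def split(stream, n, route):
    """Split one stream into n streams via a routing function
    (e.g. hash on the join attribute), per the split-table idea."""
    outputs = [[] for _ in range(n)]
    for t in stream:
        outputs[route(t, n)].append(t)
    return outputs

def merge(streams):
    """Merge the output streams of the parallel instances into one."""
    return [t for s in streams for t in s]

def parallel_select(relation, predicate, n):
    """Run an unmodified sequential select on each of n partitions;
    here the n 'processors' are simulated sequentially."""
    streams = split(relation, n, route=lambda t, n: hash(t) % n)
    results = [[t for t in s if predicate(t)] for s in streams]
    return merge(results)
```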
Problems Still to be Solved.
- Mixing Batch (large, long) & OLTP (simple, short) queries.
- batch queries tend to acquire many locks which prevents progress
from being made on OLTP queries.
- priority scheduling: underlying system must ensure short response
times & low variance for short transactions.
- priority inversion: a low-priority client invokes a high-priority
server; temporarily raise the low-priority client's priority so it
finishes with the server quickly and stops blocking high-priority work.
- Parallel Query Optimization: No query optimizer considers all plans.
- Application programs do not currently take advantage of the fact that
they're running on a parallel machine.
- Database design (where/how many indices, how to partition) tools.
- On-line data reorganization: utilities for huge databases can
take weeks to complete. Should:
- allow online processing to continue.
- allow incremental operation such that only a portion of the data
is being worked on at a time.
- exploit parallelism.
- be recoverable such that the operation can be cancelled and old
state returned.