graefe.query_eval_techniques


Query Evaluation Techniques for Large Databases

Goetz Graefe, 1993

Summary. Graefe claims that there are many fields where database management systems could be highly beneficial yet are not in use because of the unwieldiness of database interfaces and the "real or conceived " poor performance for extremely large databases. Traditional file systems are being used instead.

To be able to take advantage of the many advantages of database extraction, Graefe points out that we need to fix this ``real or conceived'' performance problems. The paper is a survey of efficient algorithms and query execution techniques.


More detail

The paper begins by explaining that there is a logical separation of the physical data and the high-level user interface of the database. At the high level, we think of the data in terms of relations, tuples & lists and use ``logical operators'' to manipulate the data. At a lower level, we can think of the data in terms of records, tuple ids or even physical data blocks and use ``physical operators'' to manipulate the data. The job of the Query Optimizer is to map the logical operators onto the physical operators -- a mapping that is most likely complicated as the operators do not have one-to-gone relationships. The product of such a mapping is the Query Plan. The job of the Query Execution Engine is to execute this plan.

In building the plan, the cost of different execution alternatives must be considered. There are a number of different cost measures - I/Os being very popular. In the sections that we read for 2/6, different algorithms (and their costs) for sorting, disk access and aggregate & duplicate removal are presented.