DTIM | UPC

Description
Multi-table joins are an important topic in relational data warehouse schemas. These are usually star schemas, in which the central fact table has to be joined with the surrounding dimension tables. Because of its size, the fact table can be neither sorted nor partitioned at a reasonable cost. Therefore, in the absence of indices, successive nested loop joins have to be performed. In collaboration with Technische Universität Darmstadt, we studied algorithms that reduce the cost of these successive nested loops by pipelining the fact table through the dimension tables, which reduces the number of steps. In the best case, all the dimension tables fit in memory at the same time and only one step is needed, which avoids storing intermediate results between steps.
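
To illustrate the best case described above, the following is a minimal sketch in Python, not the algorithms studied in the project. It assumes that every dimension table fits in memory as a list of rows and that each dimension is joined to the fact table on a column of the same name. The names `nested_loop_stage`, `pipelined_star_join`, `FACT` and `DIMENSIONS`, as well as the sample data, are hypothetical and chosen only for this example. Each stage is a nested loop join written as a generator, so a fact row flows through all dimensions in a single pass and no intermediate result is stored between stages.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def nested_loop_stage(rows: Iterable[Row], dimension: List[Row], key: str) -> Iterator[Row]:
    """Nested loop join of a stream of rows with one in-memory dimension table."""
    for row in rows:                # outer loop: rows streamed from the previous stage
        for dim_row in dimension:   # inner loop: scan of the dimension held in memory
            if row[key] == dim_row[key]:
                # Hand the joined row to the next stage instead of storing it.
                yield {**row, **dim_row}


def pipelined_star_join(fact_rows: Iterable[Row], dimensions: Dict[str, List[Row]]) -> Iterator[Row]:
    """Chain one nested loop stage per dimension; the fact table is read only once."""
    stream: Iterator[Row] = iter(fact_rows)
    for key, dimension in dimensions.items():
        stream = nested_loop_stage(stream, dimension, key)
    return stream


# Hypothetical sample data: a tiny fact table and two dimension tables.
FACT: List[Row] = [
    {"product_id": 1, "store_id": 10, "amount": 5.0},
    {"product_id": 2, "store_id": 10, "amount": 7.5},
    {"product_id": 3, "store_id": 11, "amount": 1.0},  # no matching product row
]
DIMENSIONS: Dict[str, List[Row]] = {
    "product_id": [
        {"product_id": 1, "product": "pen"},
        {"product_id": 2, "product": "ink"},
    ],
    "store_id": [
        {"store_id": 10, "city": "Barcelona"},
        {"store_id": 11, "city": "Darmstadt"},
    ],
}

if __name__ == "__main__":
    for joined in pipelined_star_join(FACT, DIMENSIONS):
        print(joined)
```

Because the stages are generators, nothing is computed until the result is consumed, and at any moment only one fact row is in flight. When the dimension tables do not all fit in memory, the pipeline would have to be split into several steps with intermediate storage, which is the cost the pipelining approach aims to reduce.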

Related publications