Quarry

January, 2013

Héctor Candón, Alberto Abelló, Petar Jovanovic, Sergi Nadal, Oscar Romero, Vasileios Theodorou


  • Description

    The Quarry project is one of the pillar projects of the DTIM research group. Over the years, it has brought together many researchers and PhD, Master, and Bachelor students, all working towards the goal of providing an end-to-end system that assists users with varying levels of technical skill in managing the incremental design and deployment of analytical infrastructures (e.g., multidimensional (MD) schemata and ETL processes).

    The main idea behind Quarry is to automate the complex and time-consuming task of incrementally designing a data warehouse (DW) from high-level information requirements. Moreover, Quarry provides tools for efficiently adapting MD schema and ETL process designs to the new or changed information needs of its end users. Finally, Quarry facilitates the deployment of the generated DW design over an extensible list of execution engines.

    Nomenclature (source: http://dictionary.reference.com/)
    noun, plural quarries: an excavation or pit, usually open to the air, from which building stone, slate, or the like, is obtained by cutting, blasting, etc.
    verb (used with object), quarried, quarrying: to obtain (stone) from or as if from a quarry.

    Catalan: Pedrera

    In our context: starting from the raw conceptual knowledge of the sources available for analysis, in Quarry we plan to identify, cut, excavate, transform, and integrate pieces to create the infrastructure that suits the analytical needs of business users.

    Quarry comprises four core components: Requirements Elicitor, Requirements Interpreter, Design Integrator, and Design Deployer; as well as the Communication&Metadata layer.

     

    Figure: Quarry Architecture

     

    To support non-expert users in providing their information requirements as input, Quarry provides a graphical component, the Requirements Elicitor. The Requirements Elicitor connects to the Requirements Interpreter, which, for each input information requirement, semi-automatically generates validated partial MD schema and ETL process designs. Quarry further offers the Design Integrator, which comprises two modules for integrating the partial MD schema and ETL process designs processed so far and for generating unified design solutions that satisfy the complete set of requirements. At each step, after integrating the partial designs of a new requirement, Quarry guarantees the soundness of the unified design solutions and the satisfiability of all requirements processed so far. The resulting DW design solutions are then sent to the Design Deployer for the initial deployment of a DW schema and an ETL process that populates it. The deployed design solutions are then available for further user-preferred tuning and use.

    To support intra- and cross-platform communication, Quarry includes a generic Communication&Metadata layer that other components can plug into in order to communicate with the platform.
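
    The flow between the components described above can be illustrated with a small Python sketch. It is not Quarry's actual code or API: every class, function, and field name below (Requirement, PartialDesign, UnifiedDesign, interpret, deploy, run_pipeline) is hypothetical, a requirement is reduced to a set of measures and dimensions, and the soundness and satisfiability guarantees are approximated by a simple coverage check. It only shows the incremental shape of the process: interpret one requirement, integrate its partial design, re-check all requirements seen so far, then hand the result to deployment.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Requirement:
    """Hypothetical, simplified information requirement."""
    name: str
    measures: frozenset
    dimensions: frozenset


@dataclass
class PartialDesign:
    """Partial MD schema and ETL steps derived from a single requirement."""
    measures: set
    dimensions: set
    etl_steps: list


def interpret(req):
    """Stand-in for the Requirements Interpreter (assumed behavior)."""
    steps = [f"extract:{m}" for m in sorted(req.measures)]
    steps += [f"load_dim:{d}" for d in sorted(req.dimensions)]
    return PartialDesign(set(req.measures), set(req.dimensions), steps)


@dataclass
class UnifiedDesign:
    """Stand-in for the Design Integrator's unified MD schema and ETL design."""
    measures: set = field(default_factory=set)
    dimensions: set = field(default_factory=set)
    etl_steps: list = field(default_factory=list)
    requirements: list = field(default_factory=list)

    def satisfies(self, req):
        # Coverage check used here as a simplified proxy for satisfiability.
        return req.measures <= self.measures and req.dimensions <= self.dimensions

    def integrate(self, req, partial):
        """Merge one partial design, then re-check every requirement so far."""
        self.measures |= partial.measures
        self.dimensions |= partial.dimensions
        for step in partial.etl_steps:
            if step not in self.etl_steps:  # reuse ETL steps already present
                self.etl_steps.append(step)
        self.requirements.append(req)
        unsatisfied = [r.name for r in self.requirements if not self.satisfies(r)]
        if unsatisfied:
            raise ValueError(f"unsatisfied requirements: {unsatisfied}")


def deploy(design):
    """Stand-in for the Design Deployer: returns a plan, touches no engine."""
    return {
        "measures": sorted(design.measures),
        "dimensions": sorted(design.dimensions),
        "etl": list(design.etl_steps),
    }


def run_pipeline(requirements):
    """Process requirements one at a time, as in the incremental design flow."""
    design = UnifiedDesign()
    for req in requirements:
        design.integrate(req, interpret(req))
    return deploy(design)
```

    Calling run_pipeline with two overlapping requirements (for example, revenue by region, then revenue and cost by region and month) yields a single plan in which the shared measure and dimension appear once and their ETL steps are reused rather than duplicated.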

     

    The Quarry project has resulted in several conference and journal publications, and has involved many successful Bachelor and Master theses.

    Related publications and Master and Bachelor theses: