IT4BI MSc Thesis in 2015
Metadata Management for Knowledge Discovery
In recent years, knowledge discovery for Big Data has grown immensely, and automated user assistance techniques for knowledge discovery have attracted increasing attention. Automating this assistance requires machine-processable metadata, so that automated systems can access the information essential to knowledge discovery processes. Such systems must handle numerous mining algorithms and diverse data sources. Because metadata are typically stored in an ad-hoc manner, no existing approach proposes a flexible and extensible model of knowledge discovery metadata artifacts that can be reused across systems and for varied purposes. Therefore, in this work, we propose a comprehensive, generic and extensible metamodel to enable automated intelligent discovery assistance. Aiming for semantic-awareness and the incorporation of external resources, we present the metamodel using an RDF formalization. Moreover, we provide a metadata repository where users can access, manipulate and explore the metadata. Finally, we discuss the benefits of the approach and present directions for future work.