Ontologies
Alberto Abelló, Besim Bilalli, Petar Jovanovic, Sergi Nadal, Oscar Romero
Description
Ontology languages (such as OWL or RDF, the latter often considered the weakest ontology language) provide machine-readable, logic-based formalisms typically used for knowledge representation. Ontologies offer two main benefits: computers can automatically interpret the embedded semantics and, thanks to the logic-based nature of these languages, infer implicit knowledge from what is explicitly stated in the ontology.
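To make the idea of inferring implicit knowledge concrete, here is a minimal sketch in plain Python. It assumes a toy set of triples and applies only two RDFS entailment rules (subclass transitivity and type inheritance); the `ex:` names and the `infer` function are hypothetical and chosen purely for illustration, and a real system would rely on a dedicated reasoner.

```python
# Toy RDFS-style inference over (subject, predicate, object) triples.
# The ex: identifiers below are hypothetical examples, not a real vocabulary.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Return the closure of `triples` under two RDFS entailment rules:
    - rdfs11: rdfs:subClassOf is transitive
    - rdfs9:  an instance of a class is an instance of its superclasses
    """
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            if p not in (SUBCLASS, TYPE):
                continue
            # Follow every subClassOf edge that starts at the object `o`.
            for (s2, p2, o2) in closure:
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, p, o2))
        if new <= closure:  # fixpoint reached: nothing new was derived
            return closure
        closure |= new


explicit = {
    ("ex:City", SUBCLASS, "ex:GeographicArea"),
    ("ex:GeographicArea", SUBCLASS, "ex:Location"),
    ("ex:Barcelona", TYPE, "ex:City"),
}

# Triples that were never stated but follow from the explicit ones.
inferred = infer(explicit) - explicit
for triple in sorted(inferred):
    print(triple)
```

Although only three triples are stated, the sketch derives further facts such as `ex:Barcelona` being an `ex:Location`, which is the kind of implicit knowledge referred to above.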
In our research group, we primarily use ontologies to represent schemata (e.g., those of data sources or the integration schema) in a machine-readable format. We have also explored how to automatically exploit schema information to infer hidden patterns in the data. For example, we identify aggregation patterns and, from them, automatically propose dimension hierarchies to build multidimensional cubes from non-multidimensional sources.
We also use ontology languages to represent metadata artefacts.
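As a rough illustration of the aggregation patterns mentioned above, the sketch below looks for strict many-to-one relationships between attributes, which are natural candidates for roll-up edges in a dimension hierarchy. This is a simplified, instance-based heuristic written for this page, not the ontology-based method of the publications listed below; the sample rows and the function names are hypothetical.

```python
from itertools import permutations


def functionally_determines(rows, a, b):
    """True if every value of column `a` maps to exactly one value of `b`."""
    seen = {}
    for row in rows:
        # setdefault records the first b-value seen for this a-value.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


def candidate_rollups(rows, columns):
    """Return (finer, coarser) pairs where finer -> coarser holds in the
    data but the reverse does not, i.e. a strict many-to-one relationship."""
    return [
        (a, b)
        for a, b in permutations(columns, 2)
        if functionally_determines(rows, a, b)
        and not functionally_determines(rows, b, a)
    ]


# Hypothetical sample extracted from a non-multidimensional source.
rows = [
    {"city": "Barcelona", "region": "Catalonia", "country": "Spain"},
    {"city": "Girona", "region": "Catalonia", "country": "Spain"},
    {"city": "Madrid", "region": "Madrid", "country": "Spain"},
    {"city": "Lyon", "region": "Auvergne-Rhone-Alpes", "country": "France"},
]

rollups = candidate_rollups(rows, ["city", "region", "country"])
print(rollups)
```

On this sample, the candidates suggest a city, region, country hierarchy. Dependencies observed in a sample may hold only by accident, so in practice such candidates would need to be validated against the schema or by the designer.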
Automating the Design of Data Warehouses
There are two main approaches to designing data warehouses. Top-down approaches (also called demand-driven) rely on the end-user requirements. Bottom-up approaches (also called supply-driven) rely on the available data and identify multidimensional patterns in it. In previous work, we proposed a hybrid approach that starts from the available data but guides its exploration using the end-user preferences. We also focus on automating the process, to facilitate the designer's task as much as possible.
In our research, we have shown that thoroughly analysing the data sources, and shaping the output according to the end-user requirements, may be a good alternative when the sources are unknown to us (e.g., external or open data), when the analysis to be performed is not yet clear, or when the sources are too large to be explored manually. In this context, provided that high-quality data sources are available, we can compensate for the lack of expressive end-user requirements and largely automate the design process.
Related publications
2024 Besim Bilalli, Petar Jovanovic, Sergi Nadal, Anna Queralt, Oscar Romero: There is no Data Science without Data Governance: a Proposal Based on Knowledge Graphs. DOLAP 2024
2018 Sergi Nadal, Oscar Romero, Alberto Abelló, Panos Vassiliadis, Stijn Vansummeren: An Integration-Oriented Ontology to Govern Evolution in Big Data Ecosystems. CoRR 2018
2015 Rizkallah Touma, Oscar Romero, Petar Jovanovic: Supporting Data Integration Tasks with Semi-Automatic Ontology Construction. DOLAP 2015
2012 Alberto Abelló, Oscar Romero: Ontology driven search of compound IDs. Knowl. Inf. Syst. 2012
2007 Oscar Romero, Alberto Abelló: Automating multidimensional design from ontologies. DOLAP 2007