NextiaDI: Incremental and Agnostic Data Integration

NextiaDI is a library for incremental and agnostic Data Integration (DI) that facilitates generating the schemata of heterogeneous data sources and integrating them. This website is a companion to the research paper submitted to the Semantic Web Journal, where we present the method underlying our approach. NextiaDI's novelty lies in a) the extraction of schemata by leveraging the structure of schemaless data sources; b) the standardization of the extracted schemata into a canonical data model (i.e., the RDFS graph data model) using production rules; c) annotation-based schema integration for RDF graphs, which captures the relationships between the modeled data sources via unions and joins; d) the automated derivation of the DI constructs required by specific querying systems (i.e., source schemata, schema mappings, and target schema). All these features are agnostic of the target system and are performed incrementally. NextiaDI is implemented as a Java library. We showcase the effectiveness of NextiaDI by automatically generating all the DI constructs of the ODIN tool.
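To make steps a) and b) more concrete, the following is a minimal sketch of how production rules could turn one parsed JSON object into RDFS triples. It is not the NextiaDI API: the class name BootstrapSketch, the method bootstrap, the example.org namespace, and the three rules themselves are assumptions chosen for illustration, and the actual rules are the ones described in the paper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the NextiaDI library.
final class BootstrapSketch {

    // Assumed namespace for the generated schema elements.
    static final String NS = "http://example.org/schema/";

    /** Applies three illustrative production rules to one parsed JSON object. */
    static List<String> bootstrap(String className, Map<String, Object> object) {
        List<String> triples = new ArrayList<>();
        emitClass(className, object, triples);
        return triples;
    }

    @SuppressWarnings("unchecked")
    private static void emitClass(String className, Map<String, Object> object, List<String> out) {
        String cls = "<" + NS + className + ">";
        // Rule 1 (assumed): every JSON object becomes an rdfs:Class.
        out.add(cls + " rdf:type rdfs:Class .");
        for (Map.Entry<String, Object> entry : object.entrySet()) {
            // Rule 2 (assumed): every key becomes an rdf:Property whose domain is that class.
            String prop = "<" + NS + className + "." + entry.getKey() + ">";
            out.add(prop + " rdf:type rdf:Property .");
            out.add(prop + " rdfs:domain " + cls + " .");
            Object value = entry.getValue();
            if (value instanceof Map) {
                // Rule 3a (assumed): a nested object yields a new class used as the range.
                String nestedName = className + "_" + entry.getKey();
                out.add(prop + " rdfs:range <" + NS + nestedName + "> .");
                emitClass(nestedName, (Map<String, Object>) value, out);
            } else {
                // Rule 3b (assumed): a primitive value yields an XSD datatype as the range.
                out.add(prop + " rdfs:range " + datatypeOf(value) + " .");
            }
        }
    }

    private static String datatypeOf(Object value) {
        if (value instanceof Integer || value instanceof Long) return "xsd:integer";
        if (value instanceof Double || value instanceof Float) return "xsd:double";
        if (value instanceof Boolean) return "xsd:boolean";
        return "xsd:string"; // fallback for strings and anything unrecognized
    }

    public static void main(String[] args) {
        // Stand-in for a parsed JSON document; a real pipeline would use a JSON parser.
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "London");
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "Ada");
        person.put("age", 36);
        person.put("address", address);
        bootstrap("Person", person).forEach(System.out::println);
    }
}
```

Because each rule only looks at one syntactic element of the source (object, key, value), new sources can be processed one at a time, which is the property an incremental approach relies on.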

People


Publications


2022




Resources

Software repository

The source code of the system can be found in the following GitHub repository.

The easiest way to use NextiaDI is with Maven. For Gradle, add the following dependency to your build.gradle:

For bootstrapping, the following dependency is also required:

For more ways to add NextiaDI using Maven, please go here

You can check how to use NextiaDI here, or follow the step-by-step Zeppelin notebook described in the Demonstration section.


Reproducibility

We believe in transparent and shareable research [1], [2]. Hence, in the following you can find all the material (e.g., notebooks, code, answers) related to our experiments.

User study

This user study evaluates the efficiency and quality of NextiaDI in automatically supporting the task of schema integration, compared to a conventional schema integration pipeline. The study is therefore divided into three tasks: (i) generation of source schemata, (ii) generation of an integrated schema, and (iii) generation of mappings. In the following, you can find all the material related to this study:


Scalability experiments

We evaluate our two technical contributions (i.e., bootstrapping and schema integration) to assess their computational complexity and runtime performance. We provide you with detailed instructions on how to reproduce the experiments presented in our work and the data sources used in each scenario:


Demonstration


Notebook step by step

A live demo for learning how to use NextiaDI is available here. Bear in mind that, to access it, you must first log in with the following credentials (user: user2, password: nextiadi). The login button can be found at the top right of the page.

Showcase

We showcase the effectiveness of NextiaDI in the ODIN tool by automatically generating all of its DI constructs, which previously had to be created manually. The ODIN tool can be accessed here.


Last update: 2022/10/12 by Javier Flores