Research topic: Big Data Analytics in the Data Lake
Research interests: Passionate about data science and everything related! Currently researching the integration of multiple analytical techniques within big data platforms.
Data is becoming abundant: it arrives in large volumes, in many varieties, and flows into the enterprise at high velocity, which leads to the phenomenon of "Big Data" (BD). This includes new data structures that are created and loaded into Business Intelligence (BI) platforms in near real time and in large amounts. Repositories that capture, store and process such data are called Data Lakes (DL). The DL is the new generation of data repository, complementing the Data Warehouse (DW) for BI and data analytics purposes. Unlike a DW, however, the DL stores data in its original raw format and structure, commonly involving semi-structured formats (such as XML, JSON or CSV files) and unstructured files (free-text documents, e-mails, etc.). The DL offers more flexibility, storing more types of enterprise data with fewer schema restrictions. This enables the concept of schema-on-read, in which different data sources are "fused" on the fly for analytical purposes when an analysis is requested. In our research, we address the data management challenges inside the DL by using Data Mining (DM) techniques to effectively and efficiently extract hidden relationships between data sources composed of complex semi-structured files and unstructured texts.
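As a minimal illustration of schema-on-read (a sketch only, not the research method itself), the Python snippet below keeps a JSON payload and a CSV payload in their raw form and applies structure only when an analysis asks for it. The sample data, field names and function names are hypothetical and chosen purely for illustration.

```python
import csv
import io
import json

# Raw payloads kept exactly as they arrived (hypothetical sample data).
RAW_JSON = '[{"customer_id": "c1", "name": "Ada"}, {"customer_id": "c2", "name": "Lin"}]'
RAW_CSV = "customer_id,amount\nc1,10.5\nc2,4.0\nc1,2.5\n"


def read_customers(raw):
    """Apply a schema to the JSON payload only at read time."""
    return {rec["customer_id"]: rec["name"] for rec in json.loads(raw)}


def read_orders(raw):
    """Parse the CSV payload and cast the amount column at read time."""
    reader = csv.DictReader(io.StringIO(raw))
    return [(row["customer_id"], float(row["amount"])) for row in reader]


def total_by_customer(raw_json, raw_csv):
    """Fuse the two raw sources on customer_id for one analysis request."""
    names = read_customers(raw_json)
    totals = {}
    for customer_id, amount in read_orders(raw_csv):
        # Fall back to the raw id when no matching customer record exists.
        name = names.get(customer_id, customer_id)
        totals[name] = totals.get(name, 0.0) + amount
    return totals


print(total_by_customer(RAW_JSON, RAW_CSV))
```

Nothing is transformed or validated when the payloads are stored; the join key and the numeric type of the amount are decided only inside the functions that serve the analysis request, which is the essential difference from the schema-on-write approach of a DW.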
About: Enjoy travelling, reading about nearly anything and everything, and geeky talks!