Carlos Ordonez: Algorithms and Optimizations for Big Data Analytics

When?

18/06/2012 from 12:00 to 13:00 (Europe/Madrid / UTC+02:00)

Where?

Sala de Juntes FIB

Data mining remains an important research area in database systems and a major challenge in computer science. Interest in this problem has been renewed by so-called big data analytics. We present a review of the processing alternatives, programming languages, storage, algorithms, data structures and optimizations that enable data mining on large data sets. We focus on the computation of well-known multidimensional statistical learning models. We carefully compare SQL (together with UDFs) and MapReduce as two competing technologies for big data analytics that exploit parallel computing. We outline major solved problems and open research issues.
Bio:
Carlos Ordonez received a degree in applied mathematics and an M.S. degree in computer science from UNAM, Mexico, in 1992 and 1996, respectively. He received a Ph.D. degree in Computer Science from the Georgia Institute of Technology in 2000. Dr. Ordonez worked for six years extending the Teradata parallel DBMS with data mining algorithms. He has collaborated on more than 20 data mining projects with many companies that have large data warehouses. His research centers on the integration of statistical and data mining techniques into relational database systems and their application to scientific problems.
Home page: http://people.cs.uh.edu/~ordonez/home.html

Attachment: talk-cube.pdf (slides)