Seminar: Python for Data Science

  • Oct 28

    In recent years, Python has become an increasingly important part of the data science, engineering and analytics tool landscape. The ecosystem keeps growing, and a large developer community adds new capabilities continually. This seminar will provide in-depth coverage of the tools and techniques gaining traction with the data audience, including Bokeh for interactive data visualization, scikit-learn for machine learning and Jupyter Notebooks for sharing analysis, among others. We will also learn how to scale Python performance, including how to handle large and distributed data sets.


    Christine Doig is a Data Scientist at Continuum Analytics, a company built on the open source Python data science stack. Continuum provides data science consulting and training services and sells the Anaconda Analytic Platform. Christine started her Master’s program at FIB, UPC and has spoken and taught at many Python conferences around the world: PyCon, EuroPython, PyData, SciPy and PyGotham NYC.

    Topics

    • Introduction
    • Data Science
    • Why Python?
    • State of the art of Python for Data Science
    • Setup and workflows
    • Analytics and BI (pandas, blaze, ibis, SQLAlchemy)
    • Scientific Computing (numpy, scipy, matplotlib/seaborn, numba, xray, PyTables)
    • Machine Learning, Statistics and NLP (scikit-learn, statsmodels, theano, PyMC, NLTK)
    • Distributed Systems and Big Data (Hadoop, Spark, dask, Impala/Ibis)
    • Web (Jupyter/Binder, Bokeh, Flask/Django, Scrapy, RDFLib)
    • Conclusions and Resources

    Schedule

    • Wednesday, October 28th, 2015: 15h to 19h
    • Wednesday, November 25th, 2015: 15h to 18h