Video description
JupyterCon New York 2017 brought together the data science and AI community that has formed around Project Jupyter over the past 15 years. Its purpose: to share how the world's most data-driven organizations use Project Jupyter to analyze their data, share their insights, and create dynamic, reproducible data science. Featured speakers at this inaugural conference included Fernando Perez (Lawrence Berkeley National Laboratory); Lorena Barba (George Washington University); Demba Ba (Harvard University); Safia Abdalla (nteract); Brett Cannon (Microsoft); Jeremy Freeman (Chan Zuckerberg Initiative); Rachel Thomas (fast.ai); and Nadia Eghbal (GitHub). If you want to understand why Jupyter has become the new frontend for data science and AI, this video compilation of JupyterCon's live presentations will provide the insights you need.
Highlights include:
- A front-row view of each of JupyterCon's 55 sessions, 15 keynote addresses, and eight tutorials, including complete access to all of the conference's SOLD OUT talks, such as "Jupyter Widgets: Interactive controls for Jupyter" and "Deploying interactive Jupyter dashboards for visualizing hundreds of millions of data points in 30 lines of Python."
- Illuminating talks by 101 of the world's top Jupyter experts working at Harvard University, Continuum Analytics, UC Berkeley, DataScience.com, Bloomberg LP, University of Pittsburgh, IBM, CUNY, Domino Data Lab, Cal Poly San Luis Obispo, Two Sigma, University of British Columbia, Civis Analytics, Columbia University, R-Brain, Lawrence Berkeley National Laboratory, Microsoft, and more.
- A thought-provoking set of keynote addresses, including predictions for Project Jupyter's future from Fernando Perez, the progenitor of Jupyter; Wes McKinney's (Two Sigma Investments) vision for seamless computation and data sharing across languages; William Merchan's (DataScience.com) outline of the three movements driving enterprise adoption of Jupyter; and Peter Wang's (Continuum Analytics) account of the coevolution of Jupyter and Anaconda, two major players in the new open data science ecosystem.
- Four beginner-level Jupyter tutorials, including primers on using JupyterLab and creating JupyterLab extensions; using Jupyter widgets to build user interfaces; using Jupyter with visualization and analysis tools such as pandas, seaborn, Matplotlib, and scikit-learn; and an explanation of how Jupyter technology empowers research, engineering, and data science teams.
- Four intermediate-level tutorials, including a walk-through of how UC Berkeley deployed a JupyterHub campus-wide for its students and researchers; a workflow for building interactive dashboards that visualize billions of data points within a Jupyter notebook using just a few dozen lines of code; a detailed study on how to do interactive natural language processing using SpaCy, Jupyter Notebooks, and tools like TensorFlow, NetworkX, and LIME; and a demonstration of high-level polyglot data analysis combining Jupyter notebooks with SQL, Python, and R.
- 21 sessions on breakthrough applications of Jupyter in research, education, and industry, including "Leveraging Jupyter to build an Excel-Python bridge," a talk about a Jupyter app that democratizes data science by giving those who understand Excel, but know nothing about Python, easy access to machine learning models and advanced interactive visualizations; "A billion stars in the Jupyter Notebook," a talk about astronomers using the vaex and ipyvolume libraries to visualize and explore large, high-dimensional datasets within a Jupyter notebook; "Enhancing data journalism with Jupyter," a presentation that describes how Jupyter notebooks enable data journalism powered by input from the general public; and a talk on the Anaconda Project, an open source library that delivers lightweight, efficient encapsulation and portability of data science projects.
- Nine "reproducible research" sessions that examine the problems of sharing research results in an open and reproducible manner; five sessions about large-scale JupyterHub deployments; five sessions on Jupyter extensions and customizations; four sessions on building the Jupyter community; four programmatic sessions; three core architecture sessions; three sessions on kernels that use the Jupyter architecture and clients for different programming languages; and three sessions on Jupyter subprojects and documentation.
- More than 65 hours of video presentations in total...and all of it on Safari.
Table of Contents
Keynotes
Jupyter and Anaconda: Shaking up the enterprise (sponsored by Anaconda Powered by Continuum Analytics) - Peter Wang (Anaconda Powered by Continuum Analytics)
How the Jupyter Notebook helped fast.ai teach deep learning to 50,000 students - Rachel Thomas (fast.ai)
Data science without borders - Wes McKinney (Two Sigma Investments)
Labz ‘N Da Wild 2.0: Teaching signal and data processing at scale using Jupyter notebooks in the cloud - Demba Ba (Harvard University)
Making science happen faster - Jeremy Freeman (Chan Zuckerberg Initiative)
Three movements driving enterprise adoption of Jupyter (sponsored by DataScience.com) - William Merchan (DataScience.com)
Design for reproducibility - Lorena Barba (George Washington University)
Jupyter at O’Reilly - Andrew Odewahn (O’Reilly Media)
The give and take of open source - Brett Cannon (Microsoft | Python Software Foundation)
Where money meets open source - Nadia Eghbal (GitHub)
Sponsored
Data science encapsulation and deployment with Anaconda Project and JupyterLab (sponsored by Anaconda Powered by Continuum Analytics) - Christine Doig (Anaconda Powered by Continuum Analytics)
Data science platforms: Your key to actionable analytics (sponsored by DataScience.com) - William Merchan (DataScience.com)
Building interactive applications and dashboards in the Jupyter Notebook (sponsored by Bloomberg) - Romain Menegaux (Bloomberg LP), Chakri Cherukuri (Bloomberg LP)
Mapping data in Jupyter notebooks with PixieDust (sponsored by IBM) - Raj Singh (IBM Cloud Data Services)
Reproducible dashboards and other great things to do with Jupyter (sponsored by Domino Data Lab) - Mac Rogers (Domino Data Lab)
Fueling open innovation in a data-centric world (sponsored by Anaconda Powered by Continuum Analytics) - Peter Wang (Anaconda Powered by Continuum Analytics)
Model interpretation guidelines for the enterprise: Using Jupyter’s interactiveness to build better predictive models (sponsored by DataScience.com) - Pramit Choudhary (DataScience.com)
Programmatic
Jupyter: Kernels, protocols, and the IPython reference implementation - Matthias Bussonnier (UC Berkeley BIDS), Paul Ivanov (Bloomberg LP)
JupyterLab: The next-generation Jupyter frontend - Brian Granger (Cal Poly San Luis Obispo), Chris Colbert (Project Jupyter), Ian Rose (UC Berkeley)
Jupyter frontends: From the classic Jupyter Notebook to JupyterLab, nteract, and beyond - Kyle Kelley (Netflix), Brian Granger (Cal Poly San Luis Obispo)
The Jupyter Notebook as document: From structure to application - M Pacer (Project Jupyter | Berkeley Institute for Data Science), Jess Hamrick (UC Berkeley), Damián Avila (Anaconda Powered by Continuum Analytics)
Jupyter subprojects
Teaching from Jupyter notebooks - Christian Moscardi (The Data Incubator)
Development and community
Empower scientists; save humanity: NumFOCUS—Five years in, five hundred thousand to go - Leah Silen (NumFOCUS), Andy Terrel (NumFOCUS)
Jupyter at Netflix - Kyle Kelley (Netflix)
Learning to code isn’t enough: Training as a pathway to improve diversity - Kari Jordan (Data Carpentry)
Data science made easy in Jupyter notebooks using PixieDust and InsightFactory - David Taieb (IBM), Prithwish Chakraborty (IBM Watson Health), Faisal Farooq (IBM Watson Health)
Usage and application
Industry and open source: Working together to drive advancements in Jupyter for quants and data scientists - Srinivas Sunkara (Bloomberg LP), Cheryl Quah (Bloomberg LP)
Hosting Jupyter at scale - Christopher Wilcox (Microsoft)
Data science at UC Berkeley: 2,000 undergraduates, 50 majors, no command line - Gunjan Baid (UC Berkeley), Vinitra Swamy (UC Berkeley)
Jupyter notebooks and production data science workflows - Andrew Therriault (City of Boston)
Leveraging Jupyter to build an Excel-Python bridge - Christine Doig (Anaconda Powered by Continuum Analytics), Fabio Pliger (Anaconda Powered by Continuum Analytics)
Notebook narratives from industry: Inspirational real-world examples and reusable industry notebooks - Patty Ryan (Microsoft), Lee Stott (Microsoft), Michael Lanzetta (Microsoft)
Using Jupyter at the intersection of robots and industrial biology - Danielle Chou (Zymergen)
Humans in the loop: Jupyter notebooks as a frontend for AI pipelines at scale - Paco Nathan (O’Reilly Media)
A billion stars in the Jupyter Notebook - Maarten Breddels (Kapteyn Astronomical Institute, University of Groningen)
Enhancing data journalism with Jupyter - Karlijn Willems (DataCamp)
Collaboration and automated operation as literate computing for reproducible infrastructure - yoshi NOBU Masatani (National Institute of Informatics)
Jupyter notebooks and the road to enabling data-driven teams - Skipper Seabold (Civis Analytics), Lori Eich (Civis Analytics)
Data science apps: Beyond notebooks - Natalino Busa (Teradata)
Accelerating data-driven culture at the largest media group in Latin America with Jupyter - Diogo Munaro Vieira (Globo.com), Felipe Ferreira (Globo.com)
Cloud Datalab: Jupyter with the power of BigQuery and TensorFlow - Kazunori Sato (Google)
Reproducible research and open science
How Jupyter makes experimental and computational collaborations easy - Zach Sailer (University of Oregon)
Postpublication peer review of Jupyter notebooks referenced in articles on PubMed Central - Daniel Mietchen (University of Virginia)
Deploying a reproducible course - Lindsey Heagy (University of British Columbia), Rowan Cockett (3point Science)
Lessons learned from tens of thousands of Kaggle notebooks - Megan Risdal (Kaggle), Wendy Chih-wen Kan (Kaggle)
Defactoring pace of change: Reviewing computational research in the digital humanities - Matt Burton (University of Pittsburgh)
Closing the gap between Jupyter and academic publishing - Mark Hahnel (figshare), Marius Tulbure (figshare)
Citing the Jupyter Notebook in the scientific publication process - Bernie Randles (UCLA), Hope Chen (Harvard University)
GenePattern Notebook: Jupyter for integrative genomics - Thorin Tabor (University of California, San Diego)
Opinionated analysis development - Hilary Parker (Stitch Fix)
JupyterHub deployments
Democratizing access to open data by providing open computational infrastructure - Yuvi Panda (Data Science Education Program (UC Berkeley))
Managing a 1,000+ student JupyterHub without losing your sanity - Ryan Lovett (Department of Statistics, UC Berkeley), Yuvi Panda (Data Science Education Program (UC Berkeley))
How JupyterHub tamed big science: Experiences deploying Jupyter at a supercomputing center - Shreyas Cholia (Lawrence Berkeley National Laboratory), Rollin Thomas (Lawrence Berkeley National Laboratory), Shane Canon (Lawrence Berkeley National Laboratory)
Building a notebook platform for 100,000 users - Scott Sanderson (Quantopian)
Kernels
Scala: Why hasn’t an official Scala kernel for Jupyter emerged yet? - Alexandre Archambault (Teads.tv)
Xeus: A framework for writing native Jupyter kernels - Sylvain Corlay (QuantStack), Johan Mabille (QuantStack)
Deep learning and Elastic GPUs using Jupyter - Tim Gasper (Bitfusion), Subbu Rama (Bitfusion)
Extensions and customization
Writing (and publishing) a book written in Jupyter notebooks - Andreas Mueller (Columbia University)
From Beaker to BeakerX - Matt Greenwood (Two Sigma Investments)
Beautiful networks and network analytics made simpler with Jupyter - Daina Bouquin (Harvard-Smithsonian Center for Astrophysics), John DeBlase (CUNY Building Performance Lab)
GeoNotebook: An extension to the Jupyter Notebook for exploratory geospatial analysis - Chris Kotfila (Kitware)
Building a powerful data science IDE for R, Python, and SQL using JupyterLab - Ali Marami (R-Brain Inc)
Core architecture
JupyterHub: A roadmap of recent developments and future directions - Min Ragan-Kelley (Simula Research Laboratory), Carol Willing (Cal Poly San Luis Obispo)
Documentation
Music and Jupyter: A combo for creating collaborative narratives for teaching - Carol Willing (Cal Poly San Luis Obispo)
Tutorials
Deploying interactive Jupyter dashboards for visualizing hundreds of millions of datapoints, in 30 lines of Python - James Bednar (Anaconda Powered by Continuum Analytics), Philipp Rudiger (Anaconda Powered by Continuum Analytics) - Part 1
Deploying interactive Jupyter dashboards for visualizing hundreds of millions of datapoints, in 30 lines of Python - James Bednar (Anaconda Powered by Continuum Analytics), Philipp Rudiger (Anaconda Powered by Continuum Analytics) - Part 2
Deploying interactive Jupyter dashboards for visualizing hundreds of millions of datapoints, in 30 lines of Python - James Bednar (Anaconda Powered by Continuum Analytics), Philipp Rudiger (Anaconda Powered by Continuum Analytics) - Part 3
Jupyter widgets: Interactive controls for Jupyter - Sylvain Corlay (QuantStack), Jason Grout (Bloomberg) - Part 1
Jupyter widgets: Interactive controls for Jupyter - Sylvain Corlay (QuantStack), Jason Grout (Bloomberg) - Part 2
Jupyter widgets: Interactive controls for Jupyter - Sylvain Corlay (QuantStack), Jason Grout (Bloomberg) - Part 3
How to cross the asteroid belt - Safia Abdalla (nteract) - Part 1
How to cross the asteroid belt - Safia Abdalla (nteract) - Part 2
How to cross the asteroid belt - Safia Abdalla (nteract) - Part 3
How to cross the asteroid belt - Safia Abdalla (nteract) - Part 4
Data analysis and machine learning in Jupyter - Andreas Mueller (Columbia University) - Part 1
Data analysis and machine learning in Jupyter - Andreas Mueller (Columbia University) - Part 2
Data analysis and machine learning in Jupyter - Andreas Mueller (Columbia University) - Part 3
JupyterLab tutorial - Steven Silvester (Anaconda Powered by Continuum Analytics), Jason Grout (Bloomberg) - Part 1
JupyterLab tutorial - Steven Silvester (Anaconda Powered by Continuum Analytics), Jason Grout (Bloomberg) - Part 2
JupyterLab tutorial - Steven Silvester (Anaconda Powered by Continuum Analytics), Jason Grout (Bloomberg) - Part 3
JupyterLab tutorial - Steven Silvester (Anaconda Powered by Continuum Analytics), Jason Grout (Bloomberg) - Part 4
Deploying JupyterHub for students and researchers - Min Ragan-Kelley, Carol Willing, Yuvi Panda, Ryan Lovett - Part 1
Deploying JupyterHub for students and researchers - Min Ragan-Kelley, Carol Willing, Yuvi Panda, Ryan Lovett - Part 2
Deploying JupyterHub for students and researchers - Min Ragan-Kelley, Carol Willing, Yuvi Panda, Ryan Lovett - Part 3
Interactive natural language processing with SpaCy and Jupyter - Aaron Kramer (DataScience.com) - Part 1
Interactive natural language processing with SpaCy and Jupyter - Aaron Kramer (DataScience.com) - Part 2
Interactive natural language processing with SpaCy and Jupyter - Aaron Kramer (DataScience.com) - Part 3
Interviews
Demba Ba (Harvard University) Interview
Wes McKinney (Two Sigma Investments) Interview
Jason Grout (Bloomberg) Interview
Rachel Thomas (fast.ai) Interview
Peter Wang (Anaconda Powered by Continuum Analytics) Interview
Jeremy Freeman (Chan Zuckerberg Initiative) Interview
Matt Burton (University of Pittsburgh) Interview
Andrew Odewahn (O’Reilly Media, Inc.) Interview
Pramit Choudhary (DataScience.com) and Aaron Kramer (DataScience.com) Interview
Lorena Barba (George Washington University) Interview
William Merchan (DataScience.com) Interview