Video description
Welcome to the leading edge of data science, where new architectures and techniques for harnessing, analyzing, and exploring vast troves of information are being built and tested. In this O’Reilly video collection, you’ll learn about recent advancements in data management, machine learning, natural language processing, crowdsourcing, and algorithm design from 11 data experts.
Each video segment captures a presentation from the Hardcore Data Science track at the 2014 Strata Conference in Santa Clara, California. Download this video collection or stream it through our HD player, and discover how today’s practitioners in both the private and public sectors are pushing the envelope of data science.
This video collection includes:
- Extreme Machine Learning
Alexander Gray (Skytree, Inc.)
How do global financial institutions and international physics projects achieve ultra-high detection rates when looking for needles in vast data haystacks? Find out what it takes to create high-performance machine-learning systems.
- What the #@)*$ is Big Data? A Holistic View of Data and Algorithms
Alice Zheng (GraphLab)
By surveying Big Data sources from computational biology, high energy physics, social networks, and cell phone call records, you’ll not only understand the characteristics of Big Data, but also the algorithmic pain points of Big Learning.
- Overcoming the Barriers to Production-Ready Machine-Learning Workflows
Henrik Brink (wise.io), Joshua Bloom (UC Berkeley)
Emerging algorithmic and framework technologies will enable you to leapfrog many automation issues in machine learning. With examples from real-world projects, you’ll learn what’s possible in this exciting space.
- Anomaly Detection
Ted Dunning (MapR)
What do you need to build anomaly detection systems? Ted Dunning shows you systems for rate shifts, topic spotting, and network flow anomalies, using techniques such as clustering, dimensionality reduction, and density estimation.
- Neural Networks for Machine Perception
Ilya Sutskever (Google Inc.)
Learn how these biologically inspired machine-learning models were used to achieve recent record-breaking performances on speech and visual object recognition.
- The Predictive Business
Kira Radinsky (SalesPredict)
How can businesses adapt and remain competitive in an ever-changing market? This talk shows you how businesses can use machine-learning and text-mining techniques to learn about their competition and boost their customer acquisition process.
- Can We Make Big Data Management Easier?
Magda Balazinska (University of Washington)
Learn new ways to manage and process Big Data, such as Personalized Service Level Agreements for cloud services; the SnipSuggest autocompletion tool for SQL queries; and PerfXplain for explaining MapReduce job performance.
- Design Challenges for Real Predictive Platforms
Max Gasner (Salesforce.com)
How can we make predictive systems as ubiquitous and easy to use as relational databases? Max discusses the crucial design criteria for future predictive platforms and the kinds of interfaces they need to support.
- Machine Learning Gremlins
Ben Hamner (Kaggle)
After working on hundreds of machine-learning projects, Kaggle has seen many common mistakes that can derail projects and endanger their success. Learn all about these gremlins, including how to see through their many disguises.
- Algebra for Scalable Analytics
Oscar Boykin (Twitter)
Find out how to pattern analytics platforms using simple abstractions. Oscar presents interesting cases from the Algebird project, such as Bloom-Filters, HyperLogLog, Count-Min Sketch, and Min-Hash.
Table of Contents
Hardcore Data Science
Hardcore Data Science Opening Remarks - Ben Lorica
Extreme Machine Learning - Alexander Gray
What the #@)*$ is Big Data? A Holistic View of Data and Algorithms - Alice Zheng
Overcoming the Barriers to Production-Ready Machine-Learning Workflows - Henrik Brink and Joshua Bloom
Anomaly Detection - Ted Dunning
Neural Networks for Machine Perception - Ilya Sutskever
The Predictive Business - Kira Radinsky
Can We Make Big Data Management Easier? - Magda Balazinska
Design Challenges for Real Predictive Platforms - Max Gasner
Machine Learning Gremlins - Ben Hamner
Algebra for Scalable Analytics - Oscar Boykin
Bonus Tracks
Thorn in the Side of Big Data: Too Few Artists - Chris Re
Movie Reconstruction from Brain Signals: Mind-Reading - Bin Yu