Video description
Quick, no nonsense. What more can you wish?
Jonathan Rioux, Senior Analyst
Spark in Motion teaches you how to use Spark for batch and streaming data analytics. In nearly 3 hours of hands-on video lessons, you'll get up and running with Spark, starting with the basic architecture of a Spark application. You'll explore data partitioning and accessing common application state, and then you'll dive deep into using Spark SQL and DataFrames for structured analytics. Finally, you'll use Spark Streaming to handle and process real-time data flowing into your application.
When you're doing analytics on big data systems, it can be a challenge to efficiently query, stream, filter, and consolidate data sharded across a cluster. Built especially for efficiently operating over large distributed datasets, the Spark data processing engine takes some of the weight off your shoulders. Spark features an easy-to-use interface, near-limitless upgrade potential, and performance that will knock your socks off. Spark simplifies your data infrastructure so you can focus on creating top-notch analytics.
Inside:
- Exploring the Spark Ecosystem
- Deploying Spark on a cluster
- Analytics with Spark SQL
- Real-time applications with Spark Streaming
Designed for software engineers, architects, data scientists, and data analysts interested in getting started with Spark. No prior experience is needed.
Jason Kolter is an instructor for the University of Washington certificate program in Big Data Technologies. He has also worked at a wide range of technology companies, gaining extensive experience leading teams that build large-scale distributed analytics systems for production.
Best course I have seen so far.
Peter J. Hampton, AI Researcher
Spark is a very valuable library, but it's very hard to use (the learning step is very steep). This video course makes the learning smoother, and takes the users to a place where they can experiment by themselves.
Alberto Boschetti, Data Scientist
Table of Contents
AN INTRODUCTION TO APACHE SPARK
What is Spark?
00:04:45
Exploring the Spark ecosystem
00:06:26
Functional programming using the Spark shell
00:08:48
Rich programming using notebooks
00:06:24
Using RDDs part 1: Features, creating, and loading
00:08:06
Using RDDs part 2: Transformations and actions
00:08:19
Spark application architecture
00:06:22
Summary
00:01:49
BUILDING REALISTIC SPARK APPLICATIONS
Deploying Spark on a cluster
00:07:11
Scaling Spark applications
00:08:58
Making iterative applications fly
00:06:43
Accessing common application state
00:04:42
Configuring the Spark runtime
00:06:05
Monitoring and metrics with the Spark Web UI
00:04:52
Summary
00:01:12
ADVANCED ANALYTICS WITH SPARK SQL AND DATASETS
Creating and using datasets
00:05:30
Structured processing using Spark SQL
00:05:27
Bringing SQL to Spark with the DataFrame API
00:05:26
Working with Spark SQL data sources
00:04:32
Interactive queries with the Spark SQL server
00:03:44
Summary
00:01:01
LOW LATENCY APPLICATIONS WITH SPARK STREAMING
What is a streaming application?
00:03:32
Understanding Spark Streaming
00:04:48
Programming Spark Streaming
00:05:24
Spark Streaming data sources
00:05:35
What is Structured Streaming?
00:07:22
Building continuous applications using Structured Streaming
00:07:20
Summary and course wrap-up
00:01:54
APPENDICES
Installing Spark
00:03:19
Installing Jupyter Notebook
00:05:04