Introduction to Apache Spark 2.0
Video description
This video series highlights what's new in Apache Spark 2.0 and reviews its core concepts. The course starts with a high-level overview of Spark's components and then dives into Spark 2.0's three main themes: simplicity, speed, and intelligence.
The simplicity section describes how Spark 2.0 unifies the Dataset and DataFrame APIs, introduces the Spark session, and simplifies machine learning via ML pipelines. The speed section illustrates how Spark 2.0 improves Spark performance with the push toward whole-stage code generation. The intelligence section provides a quick primer on Spark Streaming and an introduction to the concepts of Structured Streaming. The course is designed for data scientists and data engineers with some basic experience using machine learning tools such as Python's scikit-learn.
Understand the key features of Spark 2.0 that make building production pipelines easier than ever
Learn to solve data analytics problems by performing ad hoc analysis using Spark SQL
Gain experience in building machine learning solutions using Spark ML pipelines
Start prototyping with Structured Streaming to build continuous applications
Understand the paradigm shift and benefits of Datasets and DataFrames
Learn how Datasets and DataFrames are used for Spark SQL, machine learning, and streaming
Denny Lee is a Principal Program Manager on the team behind Azure Cosmos DB, Microsoft's globally distributed, multi-model database service. Previously, Denny worked as a technology evangelist for Databricks, the company founded by the creators of Apache Spark. He has been working with Spark since version 0.6 and is co-author of the book "Learning PySpark" (Packt Publishing). Denny has a Master's degree in Biomedical Informatics from Oregon Health & Science University.
Spark 2.0 Simplicity: Unifying Datasets And DataFrames
Unified API And Spark Session
Spark MLlib - A Primer On ML Pipelines
Spark 2.0 Speed: Tungsten Phase 2
Improving Spark Performance With The Push Toward Whole-Stage Code Generation
Spark 2.0 Intelligence: Structured Streaming
Quick Refresh Of Spark Streaming
Introducing Structured Streaming
Conclusion
Wrap Up And Thank You