Real-Time Stream Processing Using Apache Spark 3 for Python Developers
Video description
Build your own real-time stream processing applications using Apache Spark 3.x and PySpark
About This Video
Learn real-time stream processing concepts
Understand the Spark Structured Streaming APIs and architecture
Work with file streams and the Kafka source, and integrate Spark with Kafka
In Detail
Take your first steps towards discovering, learning, and using Apache Spark 3.0. We will be taking a live coding approach in this carefully structured course and explaining all the core concepts needed along the way.
In this course, we will cover real-time stream processing concepts and the Spark Structured Streaming APIs and architecture.
We will work with file streams and the Kafka source, and integrate Spark with Kafka. Next, we will learn about stateless and stateful streaming transformations, then cover windowing aggregates using Spark Structured Streaming. After that, we will cover watermarking and state cleanup, followed by streaming joins and aggregation, including how to handle memory problems with streaming joins. Finally, you will learn to create arbitrary streaming sinks.
By the end of this course, you will be able to create real-time stream processing applications using Apache Spark.
Audience
This course is designed for software engineers and architects who want to design and develop big data engineering projects using Apache Spark. It is also suited to programmers and developers who aspire to grow into data engineering with Apache Spark.
For this course, you need to know Spark fundamentals and have some exposure to the Spark DataFrame APIs. You should also know Kafka fundamentals and have a working knowledge of Apache Kafka. Programming knowledge of Python is also required.
Chapter 3: Getting Started with Spark Structured Streaming
Introduction to Stream Processing
Spark Streaming APIs - DStream Versus Structured Streaming
Creating your First Stream Processing Application
Stream Processing Model in Spark
Working with Files and Directories
Streaming Sources, Sinks and Output Mode
Fault Tolerance and Restarts
Chapter 4: Spark Streaming with Kafka
Streaming from Kafka Source
Working with Kafka Sinks
Multi-Query Streams Application
Kafka Serialization and Deserialization for Spark
Creating Kafka AVRO Sinks
Working with Kafka AVRO Source
Chapter 5: Windowing and Aggregates
Stateless Versus Stateful Transformations
Event Time and Windowing
Tumbling Window Aggregate
Watermarking your Windows
Watermark and Output Modes
Sliding Window
Chapter 6: Stream Processing and Joins
Joining Stream to Static Source
Joining Stream to Another Stream
Streaming Watermark
Streaming Outer Joins
Chapter 7: Keep Learning
Final Word