Introduction to Big Data with Spark and Hadoop
Bernard Marr defines Big Data as the digital trace that we are generating in this digital era. In this course, you will learn about the characteristics of Big Data and its application in Big Data Analytics. You will gain an understanding of the features, benefits, limitations, and applications of some of the Big Data processing tools. You’ll explore how Hadoop and Hive help leverage the benefits of Big Data while overcoming some of the challenges it poses. Hadoop is an open-source framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Hive, data warehouse software, provides an SQL-like interface to efficiently query and manipulate large data sets residing in various databases and file systems that integrate with Hadoop.
Apache Spark is an open-source processing engine built around speed, ease of use, and analytics that provides users with new ways to store and make use of big data. In this course, you will discover how to leverage Spark to deliver reliable insights. The course provides an overview of the platform and covers the different components that make up Apache Spark.
In this course, you will also learn about Resilient Distributed Datasets, or RDDs, that enable parallel processing across the nodes of a Spark cluster.
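To make the idea of parallel processing across partitions more concrete, here is a minimal conceptual sketch in plain Python. It is not Spark code and does not use the RDD API: the function names (`split_into_partitions`, `count_words`, `parallel_word_count`) are hypothetical, and a local thread pool stands in for the nodes of a cluster. It only illustrates the general pattern of splitting data into partitions, processing each partition independently, and combining the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Distribute records round-robin into num_partitions lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Count words in one partition; needs no data from other partitions."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, num_partitions=3):
    """Process partitions concurrently, then merge the partial counts."""
    partitions = split_into_partitions(lines, num_partitions)
    # Assumption: worker threads play the role of cluster nodes here.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = ["big data", "Big Spark", "spark data big"]
    print(parallel_word_count(sample, num_partitions=2))
```

In an actual Spark cluster, the engine handles partitioning, scheduling, and recovery from node failures; the sketch above leaves all of that out and only shows the split, process, and combine pattern.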
Explain the impact of Big Data including use cases, tools, and processing methods.
Explain Apache Hadoop architecture, ecosystem, and practices, and use related applications including HDFS, HBase, Spark, and MapReduce.
Apply Spark programming basics, including parallel programming basics for DataFrames, data sets, and Spark SQL.
Use Spark’s RDDs and data sets, optimize Spark SQL using Catalyst and Tungsten, and use Spark’s development and runtime environment options.
Syllabus - What you will learn from this course
Week 1
What is Big Data?
Week 2
Introduction to the Hadoop Ecosystem
Week 3
Apache Spark
Week 4
DataFrames and SparkSQL
Week 5
Development and Runtime Environment Options
Week 6
Monitoring & Tuning
FAQ
When will I have access to the lectures and assignments?
Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:
What will I get if I subscribe to this Certificate?
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.
Reviews
hands on lab and quizzes at the end of each session was very helpful
Fantastic blend of theory and practical (labs). The labs are short and have concise material.