Hands-on NLP with NLTK and Scikit-learn

Video description

A complete Python guide to Natural Language Processing to build spam filters, topic classifiers, and sentiment analyzers

About This Video

Build actual solutions backed by machine learning and Natural Language Processing models, instead of meandering in theory and mathematical symbols.
Single-handedly build three models: one for spam filtering, one for sentiment analysis, and one for text classification.
Get the right foundation for applied Natural Language Processing. We show you how to obtain open source data, wrangle text into Python data structures with NLTK, and predict different classes of natural language with scikit-learn.

In Detail

There is an overflow of text data online nowadays. As a Python developer, you need to create a new solution using Natural Language Processing for your next project. Your colleagues depend on you to monetize gigabytes of unstructured text data. What do you do?

Hands-on NLP with NLTK and scikit-learn is the answer. This course puts you right on the spot, starting off with building a spam classifier in our first video. At the end of the course, you are going to walk away with three NLP applications: a spam filter, a topic classifier, and a sentiment analyzer. There is no need for fancy mathematical theory, just plain English explanations of core NLP concepts and how to apply them using Python libraries.

Taking this course will help you create new applications with Python and NLP. You will be able to build working solutions backed by machine learning and NLP models with ease.

Audience

This course is for developers, data scientists, and programmers who want to learn practical Natural Language Processing with Python in a hands-on way. Developers who have an upcoming project that needs NLP, or who have a pile of unstructured text data on their hands and don't know what to do with it, will find this course useful. Prior programming experience with Python is assumed, along with familiarity with machine learning terms such as supervised learning, regression, and classification. No prior Natural Language Processing or text mining experience is needed.

Chapter 1 : Working with Natural Language Data

The Course Overview

Use Python, NLTK, spaCy, and Scikit-learn to Build Your NLP Toolset

Reading a Simple Natural Language File into Memory

Split the Text into Individual Words with Regular Expression

Converting Words into Lists of Lower Case Tokens

Removing Uncommon Words and Stop Words
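
As a rough illustration of the steps listed in this chapter (regular-expression splitting, lowercasing, and removing stop words and uncommon words), here is a minimal sketch that uses only the Python standard library. The sample text, the tiny stop word list, and the function names are hypothetical and not taken from the course, which works with NLTK and a file on disk.

```python
import re
from collections import Counter

# Hypothetical sample text; the course reads its text from a file instead.
TEXT = "The cat sat on the mat. The dog sat on the log, and the cat ran."

# Tiny illustrative stop word list (an assumption); NLTK ships a fuller one.
STOP_WORDS = {"the", "on", "and", "a", "an", "of"}


def tokenize(text):
    """Lowercase the text and split it into word tokens with a regex."""
    return re.findall(r"[a-z0-9']+", text.lower())


def filter_tokens(tokens, stop_words=STOP_WORDS, min_count=2):
    """Drop stop words and tokens that occur fewer than min_count times."""
    counts = Counter(tokens)
    return [t for t in tokens if t not in stop_words and counts[t] >= min_count]


tokens = tokenize(TEXT)
kept = filter_tokens(tokens)
print(kept)  # ['cat', 'sat', 'sat', 'cat']
```

The `min_count` threshold is an arbitrary choice for this toy text; on a real corpus it would normally be tuned to the vocabulary size.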

Chapter 2 : Spam Classification with an Email Dataset

Use an Open Source Dataset, and What Is the Enron Dataset

Loading the Enron Dataset into Memory

Tokenization, Lemmatization, and Stop Word Removal

Bag-of-Words Feature Extraction Process with Scikit-learn

Basic Spam Classification with NLTK’s Naive Bayes
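
The bag-of-words and Naive Bayes steps in this chapter might look roughly like the sketch below. It makes two assumptions that differ from the course: the four toy messages stand in for the Enron emails, and scikit-learn's `MultinomialNB` is swapped in for NLTK's Naive Bayes classifier so that the example needs only one library.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy messages standing in for the Enron emails.
messages = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the quarterly report",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag of words: each message becomes a vector of word counts.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(messages)

# Multinomial Naive Bayes over the word counts (swapped in for NLTK's version).
classifier = MultinomialNB()
classifier.fit(X, labels)

new_message = ["claim your free prize"]
prediction = classifier.predict(vectorizer.transform(new_message))[0]
print(prediction)  # spam
```

New text must go through `vectorizer.transform`, not `fit_transform`, so that it is mapped onto the vocabulary learned from the training messages.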

Chapter 3 : Sentiment Analysis with a Movie Review Dataset

Understanding the Origin and Features of the Movie Review Dataset

Loading and Cleaning the Review Data

Preprocessing the Dataset to Remove Unwanted Words and Characters

Creating TF-IDF Weighted Natural Language Features

Basic Sentiment Analysis with Logistic Regression Model
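
For orientation, TF-IDF features feeding a logistic regression model can be sketched in a few lines of scikit-learn. The four reviews and their labels below are invented for illustration; the course uses a real movie review dataset and its own cleaning steps.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy reviews; 1 = positive, 0 = negative.
reviews = [
    "a wonderful, moving film with great acting",
    "great story and wonderful cast",
    "a dull, boring film with terrible acting",
    "terrible story and boring cast",
]
sentiments = [1, 1, 0, 0]

# TF-IDF weights each word by its frequency in a review, discounted by
# how many reviews contain it.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(reviews)

model = LogisticRegression()
model.fit(X, sentiments)

prediction = model.predict(vectorizer.transform(["wonderful great film"]))[0]
print(prediction)  # 1
```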

Chapter 4 : Boosting the Performance of Your Models with N-grams

Deep Dive into Raw Tokens from the Movie Reviews

Advanced Cleaning of Tokens Using Python String Functions and Regex

Creating N-gram Features Using Scikit-learn

Experimenting with Advanced Scikit-learn Models Using the NLTK Wrapper

Building a Voting Model with Scikit-learn
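
One possible shape for the n-gram and voting steps in this chapter is sketched below, using scikit-learn only. The toy reviews, the choice of Naive Bayes plus logistic regression as voters, and soft voting are all assumptions made for illustration; the course itself also goes through NLTK's scikit-learn wrapper, which is not shown here.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy reviews; 1 = positive, 0 = negative.
reviews = [
    "not bad at all, really good",
    "really good and not boring",
    "not good at all, really bad",
    "really bad and not funny",
]
sentiments = [1, 1, 0, 0]

# ngram_range=(1, 2) keeps single words and adjacent word pairs, so
# "not good" and "not bad" become features of their own.
ngram_vectorizer = CountVectorizer(ngram_range=(1, 2))

# Soft voting averages the predicted probabilities of both classifiers.
voter = VotingClassifier(
    estimators=[("nb", MultinomialNB()), ("lr", LogisticRegression())],
    voting="soft",
)

model = make_pipeline(ngram_vectorizer, voter)
model.fit(reviews, sentiments)

vocabulary = model.named_steps["countvectorizer"].vocabulary_
prediction = model.predict(["really good"])[0]
print("not good" in vocabulary, prediction)  # True 1
```

Bigrams matter here because the unigrams "not", "good", and "bad" appear in both classes; only the pairs separate them.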

Chapter 5 : Document Classification with a Newsgroup Dataset

Understanding the Origin and Features of the 20 Newsgroups Dataset

Loading the Newsgroup Data and Extracting Features

Building a Document Classification Pipeline

Creating a Performance Report of the Model on the Test Set

Finding Optimal Hyper-parameters Using Grid Search
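
The pipeline, performance report, and grid search steps listed above can be sketched as follows. To keep the example self-contained it uses a handful of invented two-topic documents in place of the 20 Newsgroups data, and the parameter grid values are arbitrary assumptions, not the course's settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy documents standing in for two newsgroup topics.
train_docs = [
    "the rocket reached orbit after launch",
    "nasa plans a new moon launch",
    "the satellite entered orbit around mars",
    "astronauts boarded the rocket to the moon",
    "the goalie stopped every shot in the game",
    "the team won the hockey game in overtime",
    "the player scored a goal for the team",
    "hockey fans cheered the winning goal",
]
train_labels = ["space"] * 4 + ["hockey"] * 4
test_docs = ["the rocket launch to orbit", "the team scored a hockey goal"]
test_labels = ["space", "hockey"]

# Feature extraction and classifier chained into one estimator.
pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", MultinomialNB())])

# Grid keys use the "<step name>__<parameter>" convention.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__alpha": [0.1, 1.0],
}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(train_docs, train_labels)

predictions = search.predict(test_docs)
print(search.best_params_)
print(classification_report(test_labels, predictions))
```

Wrapping the vectorizer inside the pipeline means each cross-validation fold refits the vocabulary on its own training split, which avoids leaking held-out text into the features.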

Chapter 6 : Advanced Topic Modelling with TF-IDF, LSA, and SVMs

Building a Text Preprocessing Pipeline with NLTK

Creating Hashing Based Features from Natural Language

Classify Documents into 20 Topics with LSA

Document Classification with TF-IDF and SVMs
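
As a hedged sketch of how the pieces in this chapter could fit together, the example below hashes text into a fixed-size feature space, applies TF-IDF weighting, reduces the result with LSA (truncated SVD), and trains a linear SVM. The toy documents, the 1,024 hash buckets, and the two LSA components are illustrative assumptions; a real 20-topic problem would need far more of each.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy documents standing in for two newsgroup topics.
docs = [
    "the rocket reached orbit after launch",
    "nasa plans a new moon launch",
    "the satellite entered orbit around mars",
    "astronauts boarded the rocket to the moon",
    "the goalie stopped every shot in the game",
    "the team won the hockey game in overtime",
    "the player scored a goal for the team",
    "hockey fans cheered the winning goal",
]
labels = ["space"] * 4 + ["hockey"] * 4

# Hashing maps tokens straight to column indices, so no vocabulary is stored.
hashing = HashingVectorizer(n_features=2**10, alternate_sign=False)
hashed = hashing.transform(docs)
print(hashed.shape)  # (8, 1024)

# TF-IDF weighting, then LSA down to 2 dimensions, then a linear SVM.
model = make_pipeline(
    hashing,
    TfidfTransformer(),
    TruncatedSVD(n_components=2, random_state=0),
    LinearSVC(),
)
model.fit(docs, labels)

# Slicing the pipeline exposes the LSA representation that the SVM sees.
reduced = model[:-1].transform(docs)
predicted = model.predict(["a rocket launch into orbit"])
print(reduced.shape, predicted)
```

`alternate_sign=False` keeps the hashed counts non-negative, which makes them easier to inspect, at the cost of slightly more bias from hash collisions.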
