Get Started with Natural Language Processing Using Python, Spark, and Scala
Video description
Whether you’re a programmer with little to no knowledge of Python, or an experienced data scientist or engineer, this course will walk you through natural language processing, using both Python and Scala, and show you how to implement a range of popular tools including Spark, scikit-learn, spaCy, NLTK, and gensim for text mining.
You’ll learn the most common techniques for processing text, how to use machine learning to generate annotators and apply them within a data pipeline, and the differences between NLP pipelines and other approaches to semantic text mining. You’ll learn about standard UIMA annotators, custom annotators, and machine-learned annotators, and understand how architectures for text processing pipelines can incorporate some of the most popular big data tools such as Kafka, Spark, SparkSQL, Cassandra, and ElasticSearch.
By the end of the course, you will be able to build a natural language processing and entity extraction pipeline, and will have a complete understanding of the capabilities and limitations of natural language text processing.
Materials or downloads needed in advance: Example files
Getting Started: Basic String Processing In Python
String Operations
Working With Unicode
Converting Text To Symbols: Tokenization In NLTK and spaCy
Splitting Documents
Splitting Sentences
Filtering Stop Words
Going Subsymbolic: Vector Representations
tf-idf Gensim
Word Vectors
Google Word Vectors
Learn Word Vectors
Finding The Structure Of Text: Parsing In spaCy
Dependency Parsing
Sentence Head
Named Entities
Determining How The Writer Feels: Sentiment Analysis In VADER
Sentiment Analysis Intro
Sentiment In VADER
Making Decisions: Text Classification
Text Classification Intro
Classification With TextBlob
Classification With scikit-learn
Identifying Discussed Topics: LDA In Gensim
LDA Introduction
LDA Gensim
LDA pyLDAvis
Toward Machine Reading: Entity Extraction And Linking
Entity Linking
pyspotlight
FRED
Conclusion
Conclusion
Part 1: Introduction
Welcome to the Course
Natural Language Understanding in Examples
Part 2: NLP Pipelines
Building an NLP Pipeline
Part 3: Annotators
Commonly Used Annotators
Detecting Positive, Negative, and Speculative Polarity
Machine Learned Annotators
Part 4: Custom Annotators
NLP Pipelines are Domain Specific
Unified Medical Language System (UMLS)
Coding Custom Annotators
Part 5: Machine Learned Annotators
Training Using Machine Learned Annotators
Part 6: Ontology Enrichment
The Need for Learned and Updated Ontologies
Learning New Medical Concepts and Relationships
Part 7: Architecture
An End-to-End Reference Architecture
Spark, SparkSQL, Cassandra Workflow
ElasticSearch SparkSQL
Part 8: Parting Advice
Language is Source and Domain-Specific
Welcome to the Course
Part 1: Building a natural language processing and entity extraction pipeline on Scala Spark
Notebook 1: Introduction
Annotation Library
Basic Annotators
Vocabulary Analysis
Exercise: Building a stopword annotator
Part 2: Machine Learning Applications for Statistical Natural Language Understanding at Scale
Notebook 2: Introduction
Model-based Annotators
Creating a Binary Classifier
Exercise: Predicting score or popularity
Part 3: Topic Modeling on Natural Language with Scala, Spark and MLLib
Notebook 3: Introduction
K-Means clustering
LDA topic modeling
Exercise: Using topics for score or popularity prediction
Part 4: Deep Learning Applications for Natural Language Understanding with Scala, Spark and MLLib
Notebook 4: Introduction
Word2Vec
Expanding genre entity lists
Exercise: Using Word2Vec based features for score or popularity prediction