Natural Language Processing With Python (NLPP)

 

Course Overview

The Natural Language Processing with Python training course teaches the core concepts of Natural Language Processing (NLP) and provides hands-on experience working with text data. Participants learn to detect patterns in textual data using Python. The course begins with an overview of NLP and its key techniques. Next, students write their own spam detection and sentiment analysis code in Python. The course then covers Deep Learning, RNNs, Attention Models, Sequence Models, and working with BERT models, and concludes with a look at using BERT for Q&A systems.

Purpose:

Promote an in-depth understanding of how to use Natural Language Processing in your Python applications.

Who should attend

Data Scientists and Machine Learning Engineers looking to incorporate Natural Language Processing into their Python applications.

Prerequisites

Participants should preferably have a basic knowledge of Python and be familiar with common ML algorithms such as Logistic Regression, Random Forest, Support Vector Machines, and Bayesian Classification.

Course Objectives

Upon completion of this course, you should be able to:

  • Explain what Natural Language Processing is
  • Access Text Corpora and Lexical Resources
  • Process raw text
  • Write structured programs
  • Categorize and tag words
  • Classify text and extract information from it
  • Analyze sentence structure and meaning
  • Build your own Spam Detector and Sentiment Analyzer
  • Write your own Article Spinner
  • Describe Deep Learning
  • Understand and use BERT

Outline: Natural Language Processing With Python (NLPP)

Natural Language Processing

  • What is Natural Language Processing?
  • The NLTK package
  • Preparing text for analysis
  • Text summarization
  • Text classification
  • Topic Modelling
  • Hands-on Exercise(s)

Accessing Text Corpora and Lexical Resources

  • Accessing Text Corpora
  • Conditional Frequency Distributions
  • More Python: Reusing Code
  • Lexical Resources
  • WordNet
  • Hands-on Exercise(s)
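
As a rough illustration of the conditional frequency distribution topic in this module, the idea can be sketched with the Python standard library alone. This sketch is not taken from the course materials: the function name `conditional_freq_dist` and the tiny `samples` list are invented for the example. NLTK's `ConditionalFreqDist` class provides the same idea with more features.

```python
from collections import Counter, defaultdict

def conditional_freq_dist(pairs):
    """Count how often each word occurs under each condition.

    `pairs` is an iterable of (condition, word) tuples.
    """
    cfd = defaultdict(Counter)
    for condition, word in pairs:
        cfd[condition][word.lower()] += 1
    return cfd

# Hypothetical toy data: (genre, word) pairs standing in for a real corpus.
samples = [
    ("news", "The"), ("news", "market"), ("news", "the"),
    ("fiction", "The"), ("fiction", "dragon"), ("fiction", "dragon"),
]

cfd = conditional_freq_dist(samples)
print(cfd["news"].most_common(1))   # [('the', 2)]
print(cfd["fiction"]["dragon"])     # 2
```

Each condition (here, a genre) gets its own word counter, which is what makes it possible to compare vocabulary across categories of a corpus.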

Processing Raw Text

Writing Structured Programs

  • Back to the Basics
  • Sequences
  • Questions of Style
  • Functions: The Foundation of Structured Programming
  • Doing More with Functions
  • Program Development
  • Algorithm Design
  • A Sample of Python Libraries
  • Exercises

Categorizing and Tagging Words

  • Using a Tagger
  • Tagged Corpora
  • Mapping Words to Properties Using Python Dictionaries
  • Automatic Tagging
  • N-Gram Tagging
  • Transformation-Based Tagging
  • How to Determine the Category of a Word
  • Exercises
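
To give a feel for the automatic tagging topics in this module, here is a minimal sketch of a unigram tagger with a default-tag fallback, written in plain Python. It is an illustration under stated assumptions, not the course's own code: the function names, the hand-tagged `train` sentences, and the choice of `NN` as the default tag are all made up for the example.

```python
from collections import Counter, defaultdict

def train_unigram_tagger(tagged_sentences):
    """Learn the most frequent tag for each word seen in training."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(words, model, default_tag="NN"):
    """Tag each word from the model, backing off to a default tag."""
    return [(w, model.get(w.lower(), default_tag)) for w in words]

# Hypothetical hand-tagged training data.
train = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

model = train_unigram_tagger(train)
print(tag(["The", "dog", "sleeps", "quietly"], model))
# [('The', 'DT'), ('dog', 'NN'), ('sleeps', 'VBZ'), ('quietly', 'NN')]
```

The unseen word "quietly" receives the default tag, which is wrong here. N-gram taggers address this kind of weakness by also taking the surrounding context into account.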

Learning to Classify Text

  • Supervised Classification
  • Further Examples of Supervised Classification
  • Evaluation
  • Decision Trees
  • Naive Bayes Classifiers
  • Maximum Entropy Classifiers
  • Modeling Linguistic Patterns
  • Exercises

Extracting Information from Text

  • Information Extraction
  • Chunking
  • Developing and Evaluating Chunkers
  • Recursion in Linguistic Structure
  • Named Entity Recognition
  • Relation Extraction
  • Exercises

Analyzing Sentence Structure

  • Some Grammatical Dilemmas
  • What’s the Use of Syntax?
  • Context-Free Grammar
  • Parsing with Context-Free Grammar
  • Dependencies and Dependency Grammar
  • Grammar Development
  • Exercises

Building Feature-Based Grammars

  • Grammatical Features
  • Processing Feature Structures
  • Extending a Feature-Based Grammar
  • Exercises

Analyzing the Meaning of Sentences

  • Natural Language Understanding
  • Propositional Logic
  • First-Order Logic
  • The Semantics of English Sentences
  • Discourse Semantics
  • Exercises

Build your own Spam Detector

  • Build your own spam detector – description of data
  • Build your own spam detector using Naive Bayes and AdaBoost – the code
  • Key Takeaway from Spam Detection Exercise
  • Naive Bayes Concepts
  • AdaBoost Concepts
  • Other types of features
  • Spam Detection FAQ
  • What is a Vector?
  • SMS Spam Example
  • SMS Spam in Code
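
The Naive Bayes part of a spam detector can be sketched in a few lines of standard-library Python. This is only an illustration of the technique, not the code used in the course: the four training messages are invented, the helper names are hypothetical, and the AdaBoost variant listed above is not shown.

```python
import math
from collections import Counter

def train_naive_bayes(messages):
    """Fit a multinomial Naive Bayes model from (label, text) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of messages
    vocab = set()
    for label, text in messages:
        tokens = text.lower().split()
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab

def predict(text, model):
    """Return the label with the highest log posterior."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up toy messages; a real exercise would use a labelled SMS dataset.
training = [
    ("spam", "win a free prize now"),
    ("spam", "free cash claim now"),
    ("ham", "are we meeting for lunch"),
    ("ham", "see you at lunch tomorrow"),
]

model = train_naive_bayes(training)
print(predict("claim your free prize", model))  # spam
print(predict("lunch tomorrow", model))         # ham
```

Working in log space keeps the product of many small probabilities from underflowing, and the smoothing term keeps a single unseen word from zeroing out a class.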

Build your own Sentiment Analyzer

  • Description of Sentiment Analyzer
  • Logistic Regression Review
  • Preprocessing: Tokenization
  • Preprocessing: Tokens to Vectors
  • Sentiment Analysis in Python using Logistic Regression
  • Sentiment Analysis Extension
  • How to Improve Sentiment Analysis & FAQ

Latent Semantic Analysis

  • Latent Semantic Analysis – What does it do?
  • SVD – The underlying math behind LSA
  • Latent Semantic Analysis in Python
  • What is Latent Semantic Analysis Used For?
  • Extending LSA

Write your own Article Spinner

  • Article Spinning Introduction and Markov Models
  • More about Language Models
  • Trigram Model
  • Precode Exercises
  • Writing an article spinner in Python
  • Article Spinner Extension Exercises
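
One common way to build an article spinner from a trigram model is to replace a word with another word that has been seen between the same two neighbours. The sketch below assumes that approach; it is not taken from the course materials, and the function names, the `rate` and `seed` parameters, and the toy corpus are all hypothetical.

```python
import random
from collections import defaultdict

def build_trigram_model(tokens):
    """Map each (previous, next) word pair to the middle words seen between them."""
    model = defaultdict(list)
    for prev, middle, nxt in zip(tokens, tokens[1:], tokens[2:]):
        model[(prev, nxt)].append(middle)
    return model

def spin(tokens, model, rate=0.5, seed=0):
    """Replace some middle words with alternatives that fit the same context."""
    rng = random.Random(seed)
    output = list(tokens)
    for i in range(1, len(tokens) - 1):
        # Context always comes from the original tokens, not the spun output.
        candidates = model.get((tokens[i - 1], tokens[i + 1]), [])
        if len(set(candidates)) > 1 and rng.random() < rate:
            output[i] = rng.choice(candidates)
    return output

# Made-up toy corpus; a real spinner would be trained on many articles.
corpus = "the quick fox jumps . the lazy fox sleeps . the quick dog jumps .".split()
model = build_trigram_model(corpus)

print(" ".join(spin("the quick fox sleeps .".split(), model, rate=1.0)))
```

Because candidates are sampled in proportion to how often they occurred, the original word may be chosen again. Only positions whose context has more than one distinct candidate are eligible for replacement.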

Introduction to Deep Learning

  • What is Deep Learning?
  • Deep Learning Architecture
  • Deep Learning Frameworks
  • The relationship between Deep Learning and Machine Learning
  • Deep Learning Use cases
  • Concepts and Terms
  • How to implement Deep Learning?
  • Pre-Trained ML Models

Recurrent Neural Networks

  • What are Recurrent Neural Networks?
  • Different types of RNNs
  • Language model and sequence generation
  • Sampling novel sequences
  • Vanishing gradients with RNNs
  • Gated Recurrent Unit (GRU)
  • Long Short Term Memory (LSTM)
  • Bidirectional RNN
  • Deep RNNs
  • Seq to Seq Models
  • Transformers
  • Attention Models
  • Hands-on Exercise(s)

Getting started with BERT

  • What is BERT?
  • Embeddings
  • Architecture

BERT's tokenizer

  • Understanding CNN for NLP
  • How to import Files
  • Cleaning Data & Tokenization
  • Model Building
  • Evaluation

Tuning BERT for Q&A System

  • Overview of Q&A System
  • Data Preprocessing
  • Understanding Model Layers
  • Building and Compiling Model
  • Key Params
  • Training
  • Evaluation
  • Conclusion

Prices & Delivery methods

Online Training

  • Duration: 5 days
  • Price: on request

Classroom Training

  • Duration: 5 days
  • Price: on request

Schedule

Currently there are no training dates scheduled for this course.