Course

Natural Language Processing

CSE-41344

The main goal of Natural Language Processing (NLP) is to understand the meaning (semantics) of text.

Currently, there are two distinct approaches to NLP. In the first approach, students learn the fundamental mathematical analysis of NLP. Students will write Python code that uses the NLTK and TextBlob software packages to break text down into tokens. The tokenization process will be covered using Regular Expressions, NLTK, and TextBlob. Text tokens will then be converted into vectors, and the vectorization process will be explored, including the count vectorizer, TF-IDF (Term Frequency-Inverse Document Frequency), and cosine similarity computation. The semantics of text will be analyzed using Latent Semantic Analysis (LSA).
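As a rough illustration of the first approach, the sketch below tokenizes text with a regular expression, builds count and TF-IDF vectors, and compares documents by cosine similarity. It uses only the Python standard library rather than NLTK or TextBlob, and the sample sentences, function names, and the plain log(N/df) form of IDF are assumptions made for this example (libraries such as scikit-learn use smoothed variants), not course material.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_vector(tokens, vocabulary):
    """Return raw term counts, one entry per vocabulary term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def tfidf_vectors(token_lists, vocabulary):
    """Return TF-IDF vectors using tf = count / doc length and idf = log(N / df)."""
    n_docs = len(token_lists)
    # Document frequency: number of documents that contain each term.
    df = {t: sum(1 for tokens in token_lists if t in tokens) for t in vocabulary}
    idf = {t: math.log(n_docs / df[t]) for t in vocabulary}
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append([(counts[t] / total) * idf[t] for t in vocabulary])
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, chosen only for illustration.
documents = [
    "The cat sat on the mat.",
    "The cat lay on the rug.",
    "Stock prices fell sharply today.",
]
token_lists = [tokenize(doc) for doc in documents]
vocabulary = sorted({token for tokens in token_lists for token in tokens})
count_vectors = [count_vector(tokens, vocabulary) for tokens in token_lists]
tfidf = tfidf_vectors(token_lists, vocabulary)

print(cosine_similarity(count_vectors[0], count_vectors[1]))
print(cosine_similarity(tfidf[0], tfidf[2]))
```

The first two sentences share several words, so their similarity is positive, while the third shares none with the first and scores zero under both weightings.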
 
In the second approach, students will explore Machine/Deep Learning models for NLP. The Naïve Bayes machine learning model will be used for document classification. Deep learning tools (Keras/TensorFlow) will be used to generate word embeddings such as Word2Vec. Transformers, including GPT-1/2/3/4 and BERT, will be covered for semantic analysis of text, along with an exploration of the ChatGPT language model. The Hugging Face Transformers library will be used to explore various applications of NLP, such as translation and sentiment analysis.
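To give a feel for document classification with Naïve Bayes, here is a minimal standard-library sketch of a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing. The tiny training set, the labels, and the function names are hypothetical and exist only for this example; a real assignment would more likely rely on a library implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train_naive_bayes(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        class_doc_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log-posterior score for the text."""
    class_doc_counts, word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_doc_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # skip words never seen during training
            # Add-one smoothing keeps unseen word/class pairs from zeroing the score.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, chosen only for illustration.
training_data = [
    ("great movie with a wonderful cast", "positive"),
    ("wonderful story and great acting", "positive"),
    ("boring plot and terrible acting", "negative"),
    ("terrible movie with a boring story", "negative"),
]
model = train_naive_bayes(training_data)

print(classify("a wonderful and great story", model))
print(classify("boring and terrible", model))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.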
 
NLP has become an integral part of Artificial Intelligence (AI), which is expected to drive growth in the new economy. Since the release of ChatGPT in November 2022, AI has consistently dominated the news cycle. ChatGPT has resonated globally, garnering over 1 million users in its first week. Virtually all news organizations routinely report on how ChatGPT is revolutionizing many real-world tasks that had been done by human workers.
This course is comprehensive, covering state-of-the-art tools and techniques of NLP.

Course Highlights:

  • Fundamental mathematical analysis of NLP
  • Tokenization using Regular Expressions
  • Vectorization of words: Count Vectorizer + TF-IDF
  • Cosine similarity between words
  • Understand the meaning of text using Latent Semantic Analysis
  • Machine Learning: Naïve Bayes for text Classification
  • Deep Learning: Word Embeddings (Word2Vec)
  • Deep Learning: Generative Neural Networks: GPT-1/2/3 (Generative Pre-trained Transformer)
  • Deep Learning: NLP Analysis for Search Engines: BERT (Bidirectional Encoder Representations from Transformers)
  • ChatGPT

Course Learning Outcomes:

Upon successful completion of this course, students will be able to:
  • Understand Word2Vec and apply this concept in various real-world applications
  • Extract themes from documents using several different approaches
  • Use ChatGPT and understand how it was developed
  • Understand the Hugging Face Transformers library and use it for various NLP applications
  • Discuss the future of language models and their impact on society

Course Typically Offered: Online during the Winter and Summer academic quarters.

Software: Students will use Python and its NLP libraries to complete hands-on assignments. These tools are free and open-source.

Hardware: Students must have access to a web-enabled computer.

Prerequisites: CSE-40028 Introduction to Programming (Python) or equivalent knowledge and experience.

Next Step: After completing this course, consider taking other courses in the Machine Learning Methods or Python Programming certificates.

Contact: For more information about this course, please contact unex-techdata@ucsd.edu.

Course Information

Online
3.00 units
$775.00

Course sessions


Section ID:

185668

Class type:

Online Asynchronous.

This course is entirely web-based and to be completed asynchronously between the published course start and end dates. Synchronous attendance is NOT required.
You will have access to your online course on the published start date OR 1 business day after your enrollment is confirmed if you enroll on or after the published start date.

Textbooks:

All course materials are included unless otherwise stated.

Policies:

  • No refunds after: 1/20/2025

Schedule:

No information available at this time.

Instructor: Ashok Pahwa, Ph.D.

Founder, A+ Web Services

Ash Pahwa, Ph.D., is an educator, entrepreneur, and technology visionary with over 25 years of industry experience. He has founded several successful technology companies during his career. His most recent company is A+ Web Services which provides internet marketing and web analytics services. His expertise includes search engine optimization, web analytics, web programming, digital image processing, database management, digital video, and data storage technologies.

He developed cellAnalyst image analysis software for the Microsoft Windows/.NET platform. cellAnalyst is also available as a web service. He developed iVision, an image database management system for storage and retrieval of biomedical images based on metadata, annotation, and content. iVision was developed under a research grant from the National Institutes of Health. He also taught a Digital Image Processing course at the University of California, Irvine, in winter 2007. He earned his Ph.D. in Computer Science from the Illinois Institute of Technology in Chicago. He is listed in Who's Who in the Frontiers of Science and Technology.
