The main goal of Natural Language Processing (NLP) is to comprehend the meaning of text.
Currently, there are two distinct approaches to NLP. The first approach covers the fundamental mathematical analysis of NLP. Students will write Python code that uses the NLTK and TextBlob software packages to break text down into tokens. The tokenization process will be covered using regular expressions, NLTK, and TextBlob. Text tokens will then be converted into vectors, and the vectorization process will be explored, including the count vectorizer, TF-IDF (Term Frequency-Inverse Document Frequency), and cosine similarity computation. The semantics of text will be analyzed using Latent Semantic Analysis (LSA).
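As a rough illustration of the first approach, the pipeline described above (regular-expression tokenization, count vectors, TF-IDF weighting, cosine similarity) can be sketched with only the Python standard library. This is a minimal sketch, not the course's own material: the toy corpus, the function names, and the unsmoothed `tf * log(N / df)` weighting are assumptions chosen for brevity, and libraries such as NLTK or scikit-learn use their own tokenizers and TF-IDF variants.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
docs = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "Stock prices fell sharply today.",
]

def tokenize(text):
    """Lowercase the text and return runs of letters, digits, or apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def tfidf_vectors(token_lists):
    """Return one sparse {term: weight} dict per document.

    Assumed weighting: raw term count times log(N / document frequency).
    """
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)  # the "count vectorizer" step
        vectors.append(
            {term: count * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

tokens = [tokenize(d) for d in docs]
vectors = tfidf_vectors(tokens)
sim_cat_dog = cosine(vectors[0], vectors[1])    # documents sharing several words
sim_cat_stock = cosine(vectors[0], vectors[2])  # documents sharing no words
```

In this toy corpus the first two sentences share the words "the", "sat", and "on", so their similarity is positive, while the first and third share no words and score zero.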
In the second approach, students will explore machine learning and deep learning models for NLP. The Naïve Bayes machine learning model will be used for document classification. Deep learning tools (Keras/TensorFlow) will be used to generate word embeddings such as Word2Vec. Transformers, including GPT-1/2/3/4 and BERT, will be covered for semantic analysis of text, along with an exploration of the ChatGPT language model. The Hugging Face Transformers library will be used to explore NLP applications such as translation and sentiment analysis.
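To give a flavor of the document classification step, the following is a minimal sketch of a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing, written with only the Python standard library. The training sentences, the "spam"/"ham" labels, and the `fit` and `predict` helper names are hypothetical and invented for illustration; the course itself may use a library implementation instead.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labeled examples, invented for illustration only.
train = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project agenda", "ham"),
]

def tokenize(text):
    """Lowercase the text and return runs of letters, digits, or apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def fit(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return class_counts, word_counts, vocab

def predict(text, model):
    """Return the class with the highest log prior plus log likelihood."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen class/word pairs from zeroing the score.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = fit(train)
label_a = predict("claim your free prize", model)
label_b = predict("monday project meeting", model)
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.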
NLP has become an integral part of Artificial Intelligence (AI), which is expected to drive growth in the new economy. Since the launch of ChatGPT in November 2022, AI has consistently dominated the news cycle. ChatGPT has resonated globally, garnering over 1 million users in its first week. News organizations routinely report on how ChatGPT is changing many real-world tasks that were previously done by human workers.
This course is comprehensive, covering state-of-the-art tools and techniques of NLP.
Course Highlights:
- Fundamental mathematical analysis of NLP
- Tokenization using Regular Expressions
- Vectorization of words: Count Vectorizer + TF-IDF
- Cosine similarity between words
- Understand the meaning of text using Latent Semantic Analysis
- Machine Learning: Naïve Bayes for text Classification
- Deep Learning: Word Embeddings (Word2Vec)
- Deep Learning: Generative Neural Networks: GPT-1/2/3 (Generative Pre-Training)
- Deep Learning: NLP Analysis for Search Engines: BERT (Bidirectional Encoder Representations from Transformers)
- ChatGPT
Course Learning Outcomes:
Upon successful completion of this course, students will be able to:
- Understand Word2Vec and apply this concept in various real-world applications
- Extract themes from documents using several different approaches
- Use ChatGPT and understand how it was developed
- Understand the Hugging Face Transformers library and use it for various NLP applications
- Learn about the future of language models and their impact on society
Course Typically Offered: Online during the Winter and Summer academic quarters.
Software: Students will use Python and its NLP libraries to complete hands-on assignments. These tools are free and open-source.
Hardware: Students must have access to a web-enabled computer.
Prerequisites: CSE-40028 Introduction to Programming (Python) or equivalent knowledge and experience.
Next Step: After completing this course, consider taking other courses in the Machine Learning Methods or Python Programming certificates.
Contact: For more information about this course, please contact unex-techdata@ucsd.edu.
Course Number: CSE-41344
Credit: 3.00 unit(s)
Related Certificate Programs: Machine Learning Methods
7/9/2024 - 9/7/2024
$775
Online
CLASS TYPE:
Online Asynchronous.
This course is entirely web-based and to be completed asynchronously between the published course start and end dates. Synchronous attendance is NOT required.
You will have access to your online course on the published start date OR 1 business day after your enrollment is confirmed if you enroll on or after the published start date.
Pahwa, Ashok, Founder, A+ Web Services
Ash Pahwa, Ph.D., is an educator, author, entrepreneur, and technology visionary with three decades of industry and academic experience. He has founded several successful technology companies during his career, the latest of which is A+ Web Services. Dr. Pahwa earned his doctorate in Computer Science from the Illinois Institute of Technology in Chicago. He is listed in Who's Who in the Frontiers of Science and Technology. He is also a Google Certified Analytics Consultant. His expertise includes search engine optimization, web analytics, web programming, digital image processing, database management, digital video, and data storage technologies. In industry, Dr. Pahwa has worked for General Electric, AT&T Bell Laboratories, Xerox Corporation, and Oracle. He founded CD-Gen...
TEXTBOOKS:
No information available at this time.
POLICIES:
No refunds after: 7/15/2024.
7/9/2024 - 9/7/2024
extensioncanvas.ucsd.edu
You will have access to your course materials on the published start date OR 1 business day after your enrollment is confirmed if you enroll on or after the published start date.
There are no sections of this course currently scheduled. Please contact the Science & Technology department at 858-534-3229 or unex-sciencetech@ucsd.edu for information about when this course will be offered again.