Course

Futures: Audio, Video, Image, and Augmented Reality Processing

EE-90009

Wireless networks have empowered smart cities and homes as well as work, gaming, and entertainment in the cloud. The combination of media streaming and wireless networks further expands the reach of on-the-go entertainment. Media content such as audio and video is the most engaging source of information and entertainment. Many teenagers spend more than three times as much of their spare time watching TV and listening to music as they spend on social media. The internet has expanded the global footprint of live and on-demand media services, opening up new ways to discover, share, and consume media content anywhere, anytime, and on any device. At the same time, it takes nothing more than a smartphone and a YouTube channel to become a global broadcaster and producer of live events. While wireless and internet technologies have brought people closer together through instant connectivity, recent advances in artificial intelligence (AI) will broaden that convergence by unifying computers and content to provide smart, predictive analysis and even award-winning content far beyond the capabilities of the human brain.

This course provides a detailed description of audio, video, image, and augmented reality processing. In addition to key audio coding standards, it discusses the coding of video content in emerging fields, including high dynamic range (HDR), computer-generated screen content, and immersive applications such as omnidirectional (360-degree) video and augmented reality (AR). The course also examines the role of AI in image and natural language processing. The instructor will provide demos.

Learning Outcomes:

Learn the architecture of popular and emerging audio codecs (AAC, SBC, LC3, EVS).
Understand the core modules of video codecs (AVC, HEVC, AV1, VVC) and development platforms (FFmpeg).
Understand the tradeoffs between coding efficiency and scene complexity for typical scenarios.
Describe adaptive streaming platforms (Apple HTTP Live Streaming, MPEG-DASH).
Review the essentials of immersive communications using AR.
Learn the basics of JPEG AI, natural language processing, speech recognition, and ChatGPT.

Course Information

Online

3.00 units

$395.00

Course sessions

6/30/2025 - 7/20/2025 Online $395

Section ID:

189207

Class type:

Online Asynchronous.

This course is entirely web-based and is to be completed asynchronously between the published course start and end dates. Synchronous attendance is NOT required.
You will have access to your online course on the published start date OR 1 business day after your enrollment is confirmed if you enroll on or after the published start date.

Textbooks:

All course materials are included unless otherwise stated.

Policies:

No refunds after: 6/23/2025
Early enrollment advised
No UCSD parking permit required
No visitors permitted
Pre-enrollment required

Schedule:

No information available at this time.
