The need for significant reduction in data rate for wide-band digital audio signal transmission and storage has led to the development of psychoacoustics-based data compression techniques. In this approach, the limitations of human hearing are exploited to remove inaudible components of audio signals. The degree of bit rate reduction achievable without sacrificing perceived quality using these methods greatly exceeds that possible using lossless techniques alone. Perceptual audio coders are currently used in many applications including Digital Radio and Television, Digital Sound on Film, and Multimedia/Internet Audio.
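To give a rough sense of the data rates involved, the short sketch below computes the bit rate of uncompressed CD-quality PCM audio and compares it with a coded rate. The 128 kb/s target and the names `pcm_bit_rate` and `coded_rate` are illustrative assumptions, not figures or identifiers taken from the course materials.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# CD-quality stereo: 44.1 kHz sampling, 16 bits per sample, 2 channels.
cd_rate = pcm_bit_rate(44100, 16, 2)

# Hypothetical coded bit rate, chosen only for illustration.
coded_rate = 128000

ratio = cd_rate / float(coded_rate)

print("Uncompressed PCM: {:.1f} kb/s".format(cd_rate / 1000.0))
print("Compression ratio at 128 kb/s: {:.1f}:1".format(ratio))
```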
In this course, the basic principles of perceptual audio coding will be reviewed. Current and future applications (e.g., AC-3, MPEG) will be presented. In-class demonstrations will allow students to hear the quality of state-of-the-art implementations at varying data rates, and students will be required to program their own simple perceptual audio coder during the course.
Below is a tentative schedule, subject to update. Required readings from the course textbook are listed for each week. As a general rule, the reading for each topic should be completed before the corresponding class. Unless otherwise specified, the class will meet in the CCRMA Classroom on the 2nd floor of the Knoll (map) on Friday afternoons from 2:15 p.m. until 4:05 p.m. The two exceptions are highlighted below.
Date | Topic | Reading | Due |
---|---|---|---|
1/9 | Course Overview and Audio Signal Representation | Chapters 1 and 3 | |
1/16 | Quantization | Chapter 2 | HW1 |
1/23 | Time to Frequency Mapping | Chapters 4 and 5 | HW2 |
1/30 | Introduction to Psychoacoustics | Chapters 6 and 7 | HW3 |
2/6 | Bit Allocation and Basic Building Blocks of an Audio Codec | Chapters 8 and 9 | HW4 |
2/12 | Audio Codecs Evaluation NOTE: This class will be held 2:30 pm–5:30 pm at Dolby Labs, San Francisco | Chapter 10 | HW5 |
2/20 | Overview of MPEG and MPEG-1 Audio Coding | Chapter 11 | HW6 (Project Proposal) and HW7 |
2/27 | Overview of MPEG-2 and MPEG-4 Audio Coding | Chapters 12, 13, 15 | |
3/6 | Overview of Other Coding Standards (AC-3, etc.) | Chapter 14 | Project Due 5pm |
3/9 | Project Presentations NOTE: This class is held on Monday | | |
There is one required textbook for the class:
M. Bosi & R.E. Goldberg, "Introduction to Digital Audio Coding and Standards", Springer, 2003, ISBN: 978-1-4020-7357-1. (Publisher website: http://www.springer.com/engineering/signals/book/978-1-4020-7357-1)
The main course website is accessed through CourseWork, Stanford University's learning management system. All homework assignments, grades, and supplementary materials are accessible via the CourseWork site, and all homework submissions must take place via your CourseWork dropbox. If you have any trouble accessing the site, please contact Marina Bosi.
Marina Bosi's weekly office hours will be announced in class.
Homework submissions should be named hwX_suid.zip (or hwX_suid.tar or hwX_suid.tar.gz), where X is the homework number and suid is your Stanford ID. For example, hw1_tsobrien.zip.
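The naming convention above can be sketched in a few lines of Python. This is only an illustration: the helper names `homework_archive_name` and `is_valid_archive_name` are hypothetical, and the check assumes that X is a positive integer and that the ID contains only letters and digits, which the syllabus does not state.

```python
import re

# Archive extensions mentioned in the naming convention.
ARCHIVE_EXTENSIONS = (".zip", ".tar", ".tar.gz")


def homework_archive_name(hw_number, suid, extension=".zip"):
    """Build a submission name of the form hwX_suid.zip (or .tar / .tar.gz)."""
    if extension not in ARCHIVE_EXTENSIONS:
        raise ValueError("extension must be one of {}".format(ARCHIVE_EXTENSIONS))
    return "hw{}_{}{}".format(hw_number, suid, extension)


def is_valid_archive_name(name):
    """Check a filename against the hwX_suid pattern.

    Assumption: X is a positive integer and suid is alphanumeric.
    """
    pattern = r"^hw[1-9][0-9]*_[A-Za-z0-9]+\.(zip|tar|tar\.gz)$"
    return re.match(pattern, name) is not None


print(homework_archive_name(1, "tsobrien"))
print(is_valid_archive_name("hw1_tsobrien.zip"))
```

Running a check like this before uploading can catch a misnamed archive early, but the CourseWork dropbox remains the place where submissions are actually made.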
This course includes a final project. The final project consists of the design and implementation of a simple perceptual audio coder. Groups of up to three students typically work together on the final project.
Requirements for the final project include a one-page written proposal by the seventh week of the quarter (2/20), a written report by the ninth week of the quarter (3/6), and a presentation of the report at the end of the quarter (3/9). The aim of the report should be to fully document the project methodology and results.
Students may use the computer of their choice for the project, but Python 2.7 (http://www.python.org/) is the preferred programming language for implementation of the project coder. (Previous Python programming experience is neither required nor expected for this course.)
Please do not hesitate to contact Marina Bosi or Jorge Herrera with any questions or concerns.