1 |
Tue |
Aug 27 |
Introduction |
Zhiyao Duan (Zoom) |
Lyon: Machine Hearing: An Emerging Field |
|
Course Project;
HW0 |
|
1 |
Thu |
Aug 29 |
Auditory Scene Analysis |
Zhiyao Duan (Zoom) |
Bregman: ASA Book Chapter 1 |
Wang & Brown: CASA Book Chapter 1 |
HW1 |
|
2 |
Tue |
Sep 3 |
Signal Processing Review |
Huiran Yu |
Mueller: Fundamentals of Music Processing Book Chapter 2 |
|
|
|
2 |
Thu |
Sep 5 |
Python Programming for Audio |
Huiran Yu |
librosa;
Audio Input Representations |
MIR with Python;
Python for Scientific Audio |
|
|
|
3 |
Tue |
Sep 10 |
Single Pitch Detection |
Zhiyao Duan |
Cheveigne: CASA Book Chapter 2.1-2.3 |
Cheveigne & Kawahara: YIN;
Kim et al: CREPE |
HW2 |
HW1 |
3 |
Thu |
Sep 12 |
Human Auditory Sensation |
Zhiyao Duan |
Yost: Hearing Book Chapter 11 |
Patterson: Auditory Images;
Lyon et al: Sparse Auditory Representations |
|
|
4 |
Tue |
Sep 17 |
Human Auditory Sensation |
Zhiyao Duan |
Yost: Hearing Book Chapter 13 |
Shamma: Encoding Sound Timbre in the Auditory System;
Wang & Shamma: Spectral Shape Analysis |
|
|
4 |
Thu |
Sep 19 |
Rhythm Analysis |
Zhiyao Duan |
Mueller: Fundamentals of Music Processing Book Chapter 6;
Ellis: Beat Tracking by Dynamic Programming |
Klapuri et al: Meter Analysis;
Heydari et al: BeatNet |
HW3 |
HW2 |
5 |
Tue |
Sep 24 |
Timbre Representation |
Zhiyao Duan |
Herrera-Boyer et al: Signal Processing Methods for Music Transcription Book Chapter 6 |
Childers et al.: The Cepstrum;
Davis & Mermelstein: MFCC |
|
|
5 |
Thu |
Sep 26 |
Timbre Representation |
Zhiyao Duan |
Tzanetakis: Music Data Mining Book Chapter 2 |
Hermansky: PLP;
Hermansky & Morgan: RASTA |
|
|
6 |
Tue |
Oct 1 |
NMF Audio Modeling |
Zhiyao Duan |
Smaragdis & Brown: NMF Polyphonic Music Transcription |
Lee & Seung: NMF |
HW4 |
HW3 |
6 |
Thu |
Oct 3 |
More on NMF |
Zhiyao Duan |
Smaragdis et al.: PLCA |
Virtanen: Monaural Sound Source Separation |
|
|
7 |
Tue |
Oct 8 |
HMM Audio Modeling |
Zhiyao Duan |
Rabiner: HMM |
Mysore: PhD Thesis Chapter 2 |
|
|
7 |
Thu |
Oct 10 |
Deep Learning for Audio
CIRC Intro; Bluehive Cheat Sheet |
Zhiyao Duan |
Goodfellow et al.: Deep Learning Book Chapter 6 |
Hinton et al.: DNN for Speech Recognition |
HW5 |
HW4 |
8 |
Tue |
Oct 15 |
NO CLASS: Fall Break |
|
How to write a paper?
How to give a talk?
How to make a poster? |
|
|
|
8 |
Thu |
Oct 17 |
Deep Learning Implementation
PyTorch 101
|
Xingjian Du |
Goodfellow et al.: Deep Learning Book Chapter 9 |
DNN for Speech Separation;
Huang et al: Singing Voice Separation by RNN |
|
Project Proposal |
9 |
Tue |
Oct 22 |
Backpropagation (BP) Derivation |
Zhiyao Duan |
Goodfellow et al.: Deep Learning Book Chapter 14 |
Schluter & Bock: Onset Detection by CNN;
Hamel & Eck: Music Feature Learning with DBN
|
|
|
9 |
Thu |
Oct 24 |
Speech Technology |
Zhiyao Duan |
Ravanelli et al: SpeechBrain; Park et al: Review of Speaker Diarization;
ASVSpoof2019 |
Extended Reading
Kassis & Hengartner: Breaking Voice Authentication |
|
|
10 |
Tue |
Oct 29 |
Multi-pitch Analysis |
Frank Cwitkowitz |
Cheveigne: CASA Book Chapter 2 |
Klapuri: Harmonicity and Spectral Smoothness;
Duan et al: Peak and Non-peak Region |
|
HW5 |
10 |
Thu |
Oct 31 |
Multi-pitch Analysis |
Zhiyao Duan |
Duan et al: Multi-pitch Streaming |
Poliner & Ellis: Discriminative Model;
Sigtia et al.: Neural Network for Piano Transcription |
HW6 |
|
11 |
Tue |
Nov 5 |
Score-Informed Source Separation |
Zhiyao Duan |
Dannenberg & Raphael: Alignment and Accompaniment;
Ewert et al: SISS Overview |
Ewert & Muller: Score-informed NMF;
Duan et al: Soundprism |
|
|
11 |
Thu |
Nov 7 |
Interactive Music Systems |
Zhiyao Duan |
Gifford et al.: Computational Systems for Music Improvisation |
Tatar & Pasquier: Music Agents |
|
|
12 |
Tue |
Nov 12 |
Voice Conversion |
Melissa Chen |
Sisman et al.: Overview |
Qian et al.: AutoVC; Sun et al: PPG; Li et al.: StarGANv2-VC |
|
HW6 |
12 |
Thu |
Nov 14 |
Room Acoustics and Spatial Audio |
Neil Zhang |
Machine Learning in Acoustics
Neural IIR Filter Field
|
Novel-View Acoustic Synthesis
HRTF Estimation in the Wild
|
|
|
13 |
Tue |
Nov 19 |
Multi-channel Source Localization and Separation |
Zhiyao Duan |
Stern et al: CASA Book Chapter 5;
Yilmaz & Rickard: DUET |
Woodruff & Wang: Binaural Localization Reverberant Noisy |
|
|
13 |
Thu |
Nov 21 |
Audio-Visual Scene Understanding |
Zhiyao Duan |
Arandjelovic & Zisserman: Objects that Sound; Owens & Efros: AV Scene Analysis |
Arandjelovic & Zisserman: Look Listen and Learn; Zhao et al.: Sounds of Pixels |
|
|
14 |
Tue |
Nov 26 |
Project Status Update |
Students |
|
|
|
|
14 |
Thu |
Nov 28 |
NO CLASS: Happy Thanksgiving! |
|
|
|
|
|
15 |
Tue |
Dec 3 |
Self Supervised Learning for Music Understanding |
Frank Cwitkowitz |
|
|
|
|
15 |
Thu |
Dec 5 |
Music Generation |
Moji Heydari |
Benetatos et al.: BachDuet; Dhariwal et al.: Jukebox |
Hadjeres et al.: DeepBach; Roberts et al.: Hierarchical Latent Vector Model; Jaques et al.: Generating Music with Reinforcement Learning |
|
|
16 |
Sun |
Dec 15 |
Project Oral Presentations |
Students |
7:15-10:15 PM at CSB 601 |
|
|
Project Report Final;
Slides Final |