1 |
Thu |
Aug 31 |
Introduction |
Zhiyao Duan |
Lyon: Machine Hearing: An Emerging Field |
|
Paper Review;
Course Project;
HW0 |
|
2 |
Tue |
Sep 5 |
Auditory Scene Analysis |
Zhiyao Duan |
Bregman: ASA Book Chapter 1 |
Wang & Brown: CASA Book Chapter 1 |
HW1 |
|
2 |
Thu |
Sep 7 |
Signal Processing Review |
Zhiyao Duan |
Mueller: Fundamentals of Music Processing Book Chapter 2 |
|
|
|
3 |
Tue |
Sep 12 |
Human Auditory Sensation |
Zhiyao Duan |
Yost: Hearing Book Chapter 11 |
Patterson: Auditory Images;
Lyon et al.: Sparse Auditory Representations |
|
|
3 |
Thu |
Sep 14 |
Human Auditory Sensation |
Zhiyao Duan |
Yost: Hearing Book Chapter 13 |
Shamma: Encoding Sound Timbre in the Auditory System;
Wang & Shamma: Spectral Shape Analysis |
|
HW1 |
4 |
Tue |
Sep 19 |
Single Pitch Detection |
Zhiyao Duan |
Cheveigne: CASA Book Chapter 2.1-2.3 |
Cheveigne & Kawahara: YIN;
Boersma: Praat |
HW2 |
Paper Review Batch 1 |
4 |
Thu |
Sep 21 |
Python Programming for Audio |
Neil Zhang |
librosa;
Audio Input Representations |
MIR with Python;
Python for Scientific Audio |
|
|
|
5 |
Tue |
Sep 26 |
Rhythm Analysis |
Zhiyao Duan |
Mueller: Fundamentals of Music Processing Book Chapter 6 |
Ellis: Beat Tracking by Dynamic Programming;
Klapuri et al.: Meter Analysis |
HW3 |
|
5 |
Thu |
Sep 28 |
Timbre Representation |
Zhiyao Duan |
Herrera-Boyer et al: Signal Processing Methods for Music Transcription Book Chapter 6 |
Childers et al.: The Cepstrum;
Davis & Mermelstein: MFCC |
|
HW2 |
6 |
Tue |
Oct 3 |
Timbre Representation |
Zhiyao Duan |
Tzanetakis: Music Data Mining Book Chapter 2 |
Hermansky: PLP;
Hermansky & Morgan: RASTA |
|
|
6 |
Thu |
Oct 5 |
NMF Audio Modeling |
Zhiyao Duan |
Smaragdis & Brown: NMF Polyphonic Music Transcription |
Lee & Seung: NMF |
HW4 |
HW3 |
7 |
Tue |
Oct 10 |
More on NMF |
Zhiyao Duan |
Smaragdis et al.: PLCA |
Virtanen: Monaural Sound Source Separation |
|
|
7 |
Thu |
Oct 12 |
HMM Audio Modeling |
Zhiyao Duan |
Rabiner: HMM |
Mysore: PhD Thesis Chapter 2 |
|
Paper Review Batch 2 |
8 |
Tue |
Oct 17 |
NO CLASS: Fall Break |
|
How to write a paper?
How to give a talk?
How to make a poster? |
|
|
|
8 |
Thu |
Oct 19 |
Deep Learning for Audio;
CIRC Intro; Bluehive Cheat Sheet |
Zhiyao Duan |
Goodfellow et al.: Deep Learning Book Chapter 6 |
Hinton et al.: DNN for Speech Recognition |
HW5 |
HW4 |
9 |
Tue |
Oct 24 |
Deep Learning Implementation
|
Moji Heydari |
Goodfellow et al.: Deep Learning Book Chapter 9 |
DNN for Speech Separation;
Huang et al.: Singing Voice Separation by RNN |
|
Project Proposal |
9 |
Thu |
Oct 26 |
BP derivation |
Zhiyao Duan |
Goodfellow et al.: Deep Learning Book Chapter 14 |
Schluter & Bock: Onset Detection by CNN;
Hamel & Eck: Music Feature Learning with DBN
|
|
|
10 |
Tue |
Oct 31 |
Multi-pitch Analysis |
Zhiyao Duan |
Cheveigne: CASA Book Chapter 2 |
Klapuri: Harmonicity and Spectral Smoothness;
Duan et al.: Peak and Non-peak Region |
|
|
10 |
Thu |
Nov 2 |
Multi-pitch Analysis |
Zhiyao Duan |
Duan et al.: Multi-pitch Streaming |
Poliner & Ellis: Discriminative Model;
Sigtia et al.: Neural Network for Piano Transcription |
|
HW5 |
11 |
Tue |
Nov 7 |
Speech Technology |
Neil Zhang |
Ravanelli et al.: SpeechBrain; Park et al.: Review of Speaker Diarization |
Extended Reading |
HW6 |
Paper Review Batch 3 |
11 |
Thu |
Nov 9 |
Speech Anti-spoofing |
Neil Zhang |
ASVspoof 2019 |
Ding et al.: SAMO; Kassis & Hengartner: Breaking Voice Authentication |
|
|
12 |
Tue |
Nov 14 |
Voice Conversion |
Melissa Chen |
Sisman et al.: Overview |
Qian et al.: AutoVC; Sun et al.: PPG; Li et al.: StarGANv2-VC |
|
|
12 |
Thu |
Nov 16 |
Audio Captioning |
Dimitra Emmanouilidou |
|
|
|
HW6 |
13 |
Tue |
Nov 21 |
Score-Informed Source Separation |
Zhiyao Duan |
Dannenberg & Raphael: Alignment and Accompaniment;
Ewert et al.: SISS Overview |
Ewert & Mueller: Score-informed NMF;
Duan et al.: Soundprism |
|
Paper Review Batch 4 |
13 |
Thu |
Nov 23 |
NO CLASS: Happy Thanksgiving! |
|
|
|
|
|
14 |
Tue |
Nov 28 |
Source Separation |
Zhiyao Duan |
Hershey et al.: Deep Clustering |
|
|
Project Status Update (Mon) |
14 |
Thu |
Nov 30 |
Interactive Music Systems |
Zhiyao Duan |
Gifford et al.: Computational Systems for Music Improvisation |
Tatar & Pasquier: Music Agents |
|
|
15 |
Tue |
Dec 5 |
Audio-Visual Scene Understanding |
Zhiyao Duan |
Arandjelovic & Zisserman: Objects that Sound; Owens & Efros: AV Scene Analysis |
Arandjelovic & Zisserman: Look Listen and Learn; Zhao et al.: Sounds of Pixels |
|
Project Report Draft |
15 |
Thu |
Dec 7 |
Multi-channel Source Localization and Separation |
Zhiyao Duan |
Stern et al.: CASA Book Chapter 5;
Yilmaz & Rickard: DUET |
Woodruff & Wang: Binaural Localization Reverberant Noisy |
|
Peer Review (due Fri) |
16 |
Tue |
Dec 12 |
Music Generation |
Zhiyao Duan |
Benetatos et al.: BachDuet; Dhariwal et al.: Jukebox |
Hadjeres et al.: DeepBach; Roberts et al.: Hierarchical Latent Vector Model; Jaques et al.: Generating Music with Reinforcement Learning |
|
|
16 |
Wed |
Dec 13 |
Project Oral Presentations |
Students |
CSB 523 from 11:45 to 1:45 |
|
|
Project Report Final;
Slides Final |