Course: ECE521 (Winter 2017)
Instructors: Jimmy Ba and Mark Ebden
Section 1: M 10 – 11 (BA1130), TH 9 – 11 (BA1130)
Section 2: M 11 – 12 (RS211), TH 12 – 14 (LM161)
Office hours: Jimmy: M 11 – 12 in BA4161; Mark: TH 15 – 16 in SS6026C
Tutorials: F 10 – 12 (BA2155), M 12 – 14 (BA2155), F 16 – 18 (BA2155), F 09 – 11 (GB248) (merged TUT0104 and TUT0105)
Instructors won’t stick strictly to teaching a given section. For example, on Thursday 19 January Mark Ebden will teach both sections (1 and 2), and in the future Jimmy Ba will sometimes teach both. This will occur regularly.
Course Piazza and Course Syllabus
Note that the time of the midterm has changed from what the syllabus initially said: it now begins in the evening but still lasts about 90 minutes.
Announcements:

Apr 16, 4:29pm: We will hold the following office hours to help you prepare for the final exam. The final exam is scheduled for Thursday, April 20th, 9:30 am – 12:00 noon.
 Monday, April 17th, BA4161
 12:00 – 1:00 pm: Kaustav, Tony
 3:00 – 4:00 pm: Jimmy
 Tuesday, April 18th, BA4161
 12:00 – 1:00 pm: Renjie, Eleni
 Wednesday, April 19th, BA4161
 12:00 – 1:00 pm: Shenlong, Renjie

Mar 27, 1:34pm: Edit: The assignment 4 handout is posted on the course website. The due date is midnight, April 9, 2017; you have two weeks to complete this assignment. A few practice problems included in the A4 handout will not be graded, and you do not need to include their solutions in your final report. These practice problems may be helpful in preparing for the final exam on April 20th.

Mar 8, 12:45pm: The assignment 3 pdf handout and the related dataset are out on the course website. The due date is midnight, March 24, 2017; you have more than two weeks to complete this assignment.

Feb 13, 3:27pm: Skule has implemented SpeakUp!, a new ongoing anonymous feedback system. Instead of filling out the monthly early course evaluation in class, you can submit feedback on a rolling basis through the SpeakUp! webpage, and your class rep will contact the course instructors to discuss any issues.

Feb 10, 1:18am: The midterm questions from 2016 are posted. You can also find the midterm cheatsheet template on this page; you may enter information on both sides of the aid sheet, without restriction. The cheatsheet should be printed on 8.5″ x 11″ paper.

Feb 9, 11:37pm: The assignment 2 pdf handout and the related dataset are out on the course website. The due date is midnight, Feb 27, 2017; you have more than two weeks to complete this assignment.

Feb 6, 3:00pm: Typo correction for bonus question 1.4.1 of assignment 1: λ should instead be set to 100 when you report the simulation results. You should then see a smooth prediction from your Gaussian process regression model.

Feb 6, 9:27am: We are trying to recruit a volunteer notetaker for ECE521H1 section 0101 for students registered with Accessibility Services. Email as.notetaking@utoronto.ca if you have questions or require any assistance. Thanks.

Jan 25, 2:29pm: I apologize for the multiple delays; the assignment 1 handout is finally out. We have spent quite some time designing the exercises, and we hope you find them rewarding to solve. The pdf handout and the related dataset can be downloaded from the calendar section of this course website. The due date is midnight, Feb 7, 2017; you have two weeks to complete this assignment. There is a bonus part worth roughly 30% of assignment 1; the bonus marks can be used towards your final grade in this course.

Jan 16, 6:18pm: The Friday morning 9 – 11 tutorial sections TUT0104 and TUT0105 are now merged and will be held in GB248 from now on.

Jan 13, 1:20am: The first tutorials start today. There is a new tutorial section, TUT0105, for students who could not find tutorial space before. This week, TUT0105 meets Fri 9 – 11 am in BA1240; the room may change in the following weeks, so stay tuned. The first week's lecture slides are also posted below.

Jan 09, 8:23am: Welcome to ECE521! Please take a moment to enroll in the course Piazza; it will be the main communication channel for contacting the instructors in this course. The first tutorials start on Friday.
Course Overview:
The twenty-first century has seen a series of breakthroughs in statistical machine learning and inference algorithms that allow us to solve many of the most challenging scientific and engineering problems in artificial intelligence, self-driving vehicles, robotics and DNA sequence analysis. In the past few years, machine learning applications in search engines, wearable devices and social networks have broadly impacted our daily lives. These algorithms adapt to the data at hand and are tolerant of noisy observations. The goal of this course is to provide principled mathematical tools to solve the statistical inference problems you may encounter later. The first half of the course covers the fundamentals of statistical machine learning and supervised learning models. The second half focuses on probabilistic inference and unsupervised learning. Examples in the course include object recognition; image search; document retrieval; sequence filtering and alignment; and data compression. This course reviews state-of-the-art algorithms and models for probabilistic inference and machine learning.
Teaching Assistants:
Min Bai
Sinisa Colic
Kaustav Kundu
Renjie Liao
Mengye Ren
Eleni Triantafillou
Shenlong Wang
Yuhuai Wu
Tianrui Xiao
Haiyan Xu
TA Contact: Please post questions about tutorials, assignments and exams on Piazza. Do NOT email the TAs directly at their personal addresses about the class; we will not answer.
Calendar:
Event | Optional reading | Topic

Week 1  
Lecture: Monday, Jan 09 | Topic: Introduction (pdf slides)
Lecture: Thursday, Jan 12 | Topic: Review of probability; fundamentals of machine learning (pdf slides)
Tutorial | Topic: Review of linear algebra; introduction to TensorFlow (pdf slides; tutorial examples as an IPython Notebook; see here for how to install Jupyter/IPython Notebook). A minimal TensorFlow sketch follows below.
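For those new to TensorFlow, here is a minimal sketch of the kind of program the tutorial introduces. It assumes the TensorFlow 1.x graph-and-session API that was current when this course ran; it is illustrative only, not tutorial code.

```python
import tensorflow as tf  # assumes TensorFlow 1.x, as used when this course ran

# Build a tiny computation graph, then evaluate it in a session.
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a * b + 1.0

with tf.Session() as sess:
    print(sess.run(c))  # prints 7.0
```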
Week 2  
Lecture: Monday, Jan 16 | Reading: The curse of dimensionality: Bishop 2006, Chap. 1.4; kNN: Bishop 2006, Chap. 2.5.2 (free); kNN and linear regression: Hastie et al. 2013, Chap. 2.3 (free); Convex functions and Jensen's inequality: MacKay 2003, Chap. 2.7 (free); Gradient descent: Goodfellow et al. 2016, Chap. 4.3 | Topic: Example: K Nearest Neighbours; optimization (pdf slides; see the kNN sketch after this week's entries)
Lecture: Thursday, Jan 19 | Reading: Stochastic gradient descent, Léon Bottou; The momentum method: Coursera video, Neural Networks for Machine Learning, Lecture 6.3; Maximum likelihood for a Gaussian: MacKay 2003, Chap. 22.1; Maximum likelihood estimation of a classifier: Hastie et al. 2013, Chap. 2.6.3; Regularization: Goodfellow et al. 2016, Chap. 7.1; Regularization through data augmentation: Goodfellow et al. 2016, Chap. 7.4 | Topic: Maximum likelihood estimation (MLE); optimization and regularization (pdf slides)
Tutorial | Topic: Tricks to improve SGD; "tuning/debugging" the optimizer; multivariate Gaussian; underfitting vs. overfitting (pdf slides)
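As a rough illustration of Monday's kNN example, here is a minimal NumPy sketch of majority-vote classification; the function and variable names are our own, and this is not the implementation assignment 1 asks for.

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest
    training points under Euclidean distance."""
    dists = np.linalg.norm(x_train - x_query, axis=1)  # distance to every training point
    nearest = np.argsort(dists)[:k]                    # indices of the k closest points
    return np.bincount(y_train[nearest]).argmax()      # majority label among them

# Toy usage with two well-separated classes:
x_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(x_train, y_train, np.array([0.95, 0.9])))  # prints 1
```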
Week 3  
Lecture: Monday, Jan 23 | Topic: Probabilistic interpretation of linear regression; MLE vs. MAP; optimal regressor (pdf slides; see the joint slide deck from Jan 26, and the ridge-regression sketch after this week's entries)
Assignment 1: Wednesday, Jan 25 | Due date: midnight, Feb 7, 2017 | Topics: kNN, Gaussian process (bonus), linear regression | Assignment handout; download the Tiny MNIST dataset here; histogram of results
Lecture: Thursday, Jan 26 | Reading: Regression and decision theory: Bishop 2006, Chap. 1.5; Bias-variance tradeoff: Bishop 2006, Chap. 3.2 | Topic: Optimal regressor; feature expansion; decision theory (joint pdf slides)
Tutorial | Topic: kNN; linear regression; Gaussian process regression; training, validation and test sets (pdf slides)
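As a one-function illustration of the MAP view of linear regression from this week: under the usual Gaussian-noise, Gaussian-prior assumptions, the MAP weights are the penalized least-squares solution w = (XᵀX + λI)⁻¹Xᵀy. A sketch under those assumptions (not assignment code):

```python
import numpy as np

def ridge_regression(X, y, lam=0.1):
    """MAP estimate for linear regression with a Gaussian prior:
    minimizes ||Xw - y||^2 + lam * ||w||^2 in closed form."""
    D = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)

# Setting lam = 0 recovers the maximum-likelihood (ordinary least-squares)
# solution, provided X^T X is invertible.
```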
Week 4  
Lecture: Monday, Jan 30 | Topic: Recap of decision theory; logistic regression; neural networks (pdf slides)
Lecture: Thursday, Feb 2 | Reading: Neural networks: Bishop 2006, Chap. 5; MacKay 2003, Chap. 39 – 40, 44; Hastie et al. 2013, Chap. 11 | Topic: Neural networks; backpropagation (pdf slides)
Tutorial | Topic: Logistic regression; backpropagation examples (pdf slides; see the backpropagation sketch below)
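To make the backpropagation mechanics concrete, here is a hedged NumPy sketch of a single gradient step for a tiny two-layer network with a sigmoid hidden layer and squared-error loss. The architecture and shapes are illustrative assumptions, not the course's reference code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, t, W1, W2, lr=0.1):
    """One gradient step for a 2-layer net: output y = W2 . sigmoid(W1 x),
    loss L = 0.5 * (y - t)^2. Shapes: x (D,), W1 (H, D), W2 (H,)."""
    # Forward pass
    h = sigmoid(W1 @ x)          # hidden activations, shape (H,)
    y = W2 @ h                   # scalar prediction
    # Backward pass (chain rule, layer by layer)
    dy = y - t                   # dL/dy
    dW2 = dy * h                 # dL/dW2
    dh = dy * W2                 # dL/dh
    dW1 = (dh * h * (1 - h))[:, None] * x[None, :]  # uses sigmoid'(z) = h * (1 - h)
    return W1 - lr * dW1, W2 - lr * dW2
```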
Week 5  
Lecture: Monday, Feb 6 | Topic: Multi-class classification; learning feedforward neural networks (pdf slides)
Lecture: Thursday, Feb 9 | Reading: Convolutional neural networks: cs231n course slides; Transfer learning and fine-tuning: cs231n course slides | Topic: Bag of tricks for deep neural networks; types of neural networks: convolutional neural networks, recurrent neural networks (pdf slides)
Assignment 2: Thursday, Feb 9 | Due date: midnight, Feb 27, 2017 | Topics: logistic regression, neural networks | Assignment handout (updated Feb 18th); download the notMNIST dataset here; histogram of results
Tutorial | Topic: Sample midterm review; assignment 1 postmortem
Week 6  
Lecture: Mon, Feb 13 | Reading: Bishop 9.1 and 12.1 | Topic: k-means clustering, dimensionality reduction (see the k-means sketch after this week's entries)
Study: Thu, Feb 16 | Study independently in the classroom, with an instructor on hand for questions. Unstructured.
Midterm: Thursday, Feb 16 | Time: 6:20 – 7:50 pm | Sample midterm from 2016; midterm cheatsheet template; histogram of results
Tutorial | Topic: Midterm exam postmortem
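For the k-means topic above, a minimal NumPy sketch of the alternating assign/update loop; the random initialization and empty-cluster handling are simplifying assumptions.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Alternate between assigning each point to its nearest centre and
    recomputing each centre as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]  # pick k points as initial centres
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)  # (N, k) distances
        assign = d.argmin(axis=1)                                        # nearest centre per point
        centres = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                            else centres[j]                              # keep empty clusters fixed
                            for j in range(k)])
    return centres, assign
```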
Week 7  
Lecture: Mon, Feb 27 | Reading: Bishop 3.3; Murphy 2012: parts of Chap. 5 and Sec. 7.6 | Topic: PCA continued, Bayesian methods
Lecture: Thu, Mar 2 | Reading: Bishop 1.2.6 (Bayesian prediction), 1.3 (model selection), 2.4.2 (conjugate prior) | Topic: Bayesian learning continued
Tutorial | Topic: Examples of PCA and k-means; Bayesian predictive distribution; Bayesian model comparison (pdf slides; see the PCA sketch below)
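For the PCA material, a short sketch of projecting data onto its top principal components via the SVD of the centred data matrix. Using the SVD is an illustrative choice; eigendecomposition of the covariance matrix gives the same subspace.

```python
import numpy as np

def pca(X, m):
    """Project X (N x D) onto its top-m principal components."""
    Xc = X - X.mean(axis=0)                            # centre the data first
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are principal directions
    W = Vt[:m].T                                       # (D, m) basis of the top-m directions
    return Xc @ W, W                                   # low-dimensional codes and the basis
```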
Week 8  
Lecture: Mon, Mar 6 | Reading: Mixture of Gaussians: Bishop 9.2; EM algorithm: Bishop 9.3 | Topic: Mixture models, EM algorithm
Assignment 3: Wed, Mar 8 | Due date: midnight, March 24, 2017 | Topics: unsupervised learning, probabilistic models | Assignment handout (updated Mar 13th); download the datasets: data2D, data100D, tinymnist; download the utility function here
Lecture: Thu, Mar 9 | Reading: Naive Bayes: Hastie et al. 2013, Chap. 6.6.3; Bayesian networks: Bishop 8.1, 8.2 | Topic: Mixture of Gaussians, naive Bayes and Bayesian networks
Tutorial | Topic: Introducing A3; examples of mixtures of Bernoullis; EM algorithm (pdf slides; see the EM sketch below)
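As a compact illustration of the EM algorithm for a mixture of Gaussians (1-D for brevity; the initialization and iteration count are arbitrary assumptions, and this is not the assignment 3 solution):

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50, seed=0):
    """EM for a 1-D Gaussian mixture: the E-step computes responsibilities,
    the M-step re-estimates weights, means and variances from them."""
    rng = np.random.default_rng(seed)
    pi = np.full(k, 1.0 / k)                    # mixing weights
    mu = rng.choice(x, size=k, replace=False)   # crude initialization from the data
    var = np.full(k, x.var())
    for _ in range(iters):
        # E-step: r[n, j] proportional to pi_j * N(x_n | mu_j, var_j)
        r = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted maximum-likelihood updates
        Nk = r.sum(axis=0)
        pi = Nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / Nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    return pi, mu, var
```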
Week 9  
Lecture: Mon, Mar 13 | Reading: Bishop 8.1 and 8.2; parts of Murphy Chap. 10; Russell and Norvig 2009 (AI: A Modern Approach), parts of Chap. 14 | Topic: Bayesian networks continued
Lecture: Thu, Mar 16 | Reading: Bishop 8.3, 8.4.3 | Topic: Markov random fields, factor graphs
Tutorial | Topic: Review of graphical models; conversion between BN, MRF and FG; inference in graphical models (pdf slides)
Week 10  
Lecture: Mon, Mar 20 | Reading: Russell & Norvig 15.1; parts of Bishop Chap. 13 | Topic: Sequence models
Lecture: Thu, Mar 23 | Reading: Parts of Russell & Norvig 15.3; parts of Bishop Chap. 13 | Topic: Hidden Markov models (HMMs)
Tutorial | Topic: Review of Markov models; examples of inference in graphical models (pdf slides; see the forward-algorithm sketch below)
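For the HMM lectures, a minimal sketch of the forward algorithm, which computes the likelihood of an observation sequence; the variable names and array conventions are our own assumptions.

```python
import numpy as np

def hmm_likelihood(pi, A, B, obs):
    """Forward algorithm: p(o_1, ..., o_T) for an HMM with initial
    distribution pi (K,), transitions A (K, K), emissions B (K, M)."""
    alpha = pi * B[:, obs[0]]            # alpha_1(k) = pi_k * p(o_1 | state k)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # sum over the previous state, then emit
    return alpha.sum()                   # marginalize out the final state
```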
Week 11  
Lecture: Mon, Mar 27 | Reading: Murphy 17.4; end of Russell & Norvig 15.2; Bishop 13.2.5 | Topic: HMM inference/learning
Assignment 4: Mon, Mar 27 | Due date: midnight, April 9, 2017 | Topics: graphical models, sum-product algorithm | Assignment handout (updated)
Lecture: Thu, Mar 30 | Reading: Parts of MacKay Chap. 16 and Sections 26.1 – 26.2; Bishop 8.4.4; Kschischang, Frey and Loeliger, "Factor Graphs and the Sum-Product Algorithm", Section 2; Frey, "Extending Factor Graphs so as to Unify Directed and Undirected Graphical Models", Section 2 | Topic: Message-passing algorithms (updated notation)
Tutorial | Topic: Forward-backward algorithm; the sum-product algorithm (pdf slides; see the forward-backward sketch below)
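To connect the forward-backward algorithm with sum-product: on the HMM chain, sum-product reduces to the forward and backward recursions below. A sketch using the same conventions as the forward-algorithm example above.

```python
import numpy as np

def hmm_smoothing(pi, A, B, obs):
    """Forward-backward: posterior marginals p(state_t | o_1..o_T),
    i.e., sum-product specialized to the HMM chain."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):                           # forward messages
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):                  # backward messages
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)   # normalize per time step
```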
Week 12  
Lecture: Mon, Apr 3 | Reading: Murphy 17.4.4 and 20.2; Bishop 8.4.5 and 13.2.5; MacKay 26.3 | Topic: Max-sum algorithm (see the Viterbi sketch after this week's entries)
Lecture: Thu, Apr 6 | Reading: MacKay Chapters 16 and 26; Bishop 8.4.7; LBP: MacKay 26.4, Bishop 8.4.7 | Topic: Junction-tree algorithm, loopy belief propagation
Tutorial | Topic: Review (pdf slides)
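And for the max-sum lecture: replacing sum with max on the same HMM chain gives Viterbi decoding. A log-domain sketch under the same HMM conventions as the earlier examples.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Max-sum (Viterbi) in the log domain: the most probable hidden
    state sequence given the observations."""
    logp = np.log(pi) + np.log(B[:, obs[0]])
    backptr = []
    for o in obs[1:]:
        scores = logp[:, None] + np.log(A)        # score of each (previous, next) pair
        backptr.append(scores.argmax(axis=0))     # best predecessor for each next state
        logp = scores.max(axis=0) + np.log(B[:, o])
    path = [int(logp.argmax())]
    for bp in reversed(backptr):                  # trace the best path backwards
        path.append(int(bp[path[-1]]))
    return path[::-1]
```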
Week 13  
Lecture: Mon, Apr 10 | Reading: The first section of Murphy 19.6 | Topic: Supervised learning using graphical models; the discriminative approach; conditional random fields (CRFs); combining deep learning with graphical models
Lecture: Thu, Apr 13 | Reading: All the above | Topic: Course concepts, the 2013 midterm, and finishing our junction-tree algorithm example
Exam
Thu, Apr 20 | For study practice: 2013 midterm