University of Toronto

ECE521: Inference Algorithms and Machine Learning


Instructors: Jimmy Ba and Mark Ebden

Section1: M 10 – 11 (BA1130) TH 9 – 11 (BA1130)

Section2: M 11 – 12 (RS211)   TH 12 – 14 (LM161)

Office hours: Jimmy: M 11-12 in BA4161   Mark: TH 15-16 in SS6026C

Tutorials: F 10 – 12 (BA2155)   M 12 – 14 (BA2155)   F 16 – 18 (BA2155)   F 09 – 11 (GB248) (merged TUT0104 and TUT0105)

Instructors won’t always teach only their own section. For example, on Thursday 19 January Mark Ebden will teach both sections (1 and 2), and sometimes in future Jimmy Ba will teach both. This will occur regularly.

Course Piazza and Course Syllabus

Note that the time of the midterm has changed from what the syllabus initially said: it now begins in the evening but still lasts about 90 minutes.


  • Apr 16, 4:29pm: We will hold the following office hours to help you prepare for the final exam. The final exam is scheduled for Thursday, April 20th, 9:30 am – 12:00 noon.
    • Monday April 17th, BA4161
        • 12:00–1:00 pm: Kaustav, Tony
        • 3:00–4:00 pm: Jimmy
    • Tuesday April 18th, BA4161
        • 12:00–1:00 pm: Renjie, Eleni
    • Wednesday April 19th, BA4161
        • 12:00–1:00 pm: Shenlong, Renjie
  • Mar 27, 1:34pm: Edit: The assignment 4 handout is posted on the course website. The due date is midnight, April 9th, 2017; you have two weeks to complete this assignment. The A4 handout includes a few practice problems that will not be graded, so you do not need to include solutions to them in your final report. These practice problems may be helpful in preparing for the final exam on the 20th of April.

  • Mar 8, 12:45pm: The assignment 3 pdf handout and the related dataset are out on the course website. The due date is midnight, March 24th, 2017. You have more than two weeks to complete this assignment.

  • Feb 13, 3:27pm: Skule has implemented a new ongoing anonymous feedback system, SpeakUp!. Instead of filling out the monthly early course evaluation in class, you can also submit your feedback on a rolling basis through the SpeakUp! webpage, and your class rep will contact the course instructors to discuss the issues.

  • Feb 10, 1:18am: The midterm questions from 2016 are posted. You can also find the midterm cheat-sheet template on this page; you may enter information on both sides of the aid sheet, without restriction. The cheat sheet should be printed on 8.5″ × 11″ paper.

  • Feb 9, 11:37pm: The assignment 2 pdf handout and the related dataset are out on the course website. The due date is midnight, February 27th, 2017. You have more than two weeks to complete this assignment.

  • Feb 6, 3:00pm: Typo correction for bonus question 1.4.1 of assignment 1: \lambda should be set to 100 instead when you report the simulation results. You should then see a smooth prediction from your Gaussian process regression model.

  • Feb 6, 9:27am: We are trying to recruit a volunteer note-taker for ECE521H1 section 0101 for students registered with Accessibility Services. Email if you have questions or require any assistance. Thanks.

  • Jan 25, 2:29pm: I apologize for the multiple delays; the assignment 1 handout is finally out. We spent quite some time designing the exercises, and we hope you find them rewarding to solve. The pdf handout and the related dataset can be downloaded from the calendar section of this course website. The due date is midnight, February 7th, 2017; you have two weeks to complete this assignment. There is a bonus part worth roughly 30% of assignment 1, and the bonus marks can be used towards the final grade of this course.

  • Jan 16, 6:18pm: The Friday morning 9-11 tutorial sections TUT0104 and TUT0105 are now merged and will be held in GB248 from now on.

  • Jan 13, 1:20am: The first tutorials start today. There is a new tutorial section, TUT0105, for students who could not find tutorial space before. This week TUT0105 meets Fri 9–11 am in BA1240; the room may change in the following weeks, so stay tuned. The first week’s lecture slides are also posted below.

  • Jan 09, 8:23am: Welcome to ECE521! Please take a moment to enroll in the course Piazza, which will be the main channel for contacting the instructors in this course. The first tutorials start on Friday.

Course Overview:

The twenty-first century has seen a series of breakthroughs in statistical machine learning and inference algorithms that allow us to solve many of the most challenging scientific and engineering problems in artificial intelligence, self-driving vehicles, robotics and DNA sequence analysis. In the past few years, machine learning applications in search engines, wearable devices and social networks have broadly impacted our daily life. These algorithms adapt to the data at hand and are tolerant of noisy observations. The goal of this course is to provide principled mathematical tools for solving the statistical inference problems you may encounter later. The first half of the course covers the fundamentals of statistical machine learning and supervised learning models. The second half focuses on probabilistic inference and unsupervised learning. Examples in the course include object recognition, image search, document retrieval, sequence filtering and alignment, and data compression. This course reviews state-of-the-art algorithms and models for probabilistic inference and machine learning.

Teaching Assistants:

Min Bai

Sinisa Colic

Kaustav Kundu

Renjie Liao

Mengye Ren

Eleni Triantafillou

Shenlong Wang

Yuhuai Wu

Tianrui Xiao

Haiyan Xu

TA Contact: Please post your questions related to tutorials, assignments and exams on Piazza. Do NOT email the TAs directly at their personal addresses about the class; such emails will not be answered.


  Optional reading Topic
Week 1    
Lecture: Monday, Jan 09   Introduction
pdf slides
Lecture: Thursday, Jan 12   Review of probability
Fundamentals of machine learning
pdf slides
Tutorial   Review of linear algebra
Introduction to TensorFlow
pdf slides
Tutorial examples as an IPython Notebook.
See here for how to install Jupyter/IPython Notebook.
Week 2    
Lecture: Monday, Jan 16 The curse of dimensionality: Bishop 2006, Chap. 1.4
K-NN: Bishop 2006, Chap. 2.5.2
(free) K-NN and linear regression: Hastie et al 2013, Chap. 2.3
(free) Convex functions and Jensen’s inequality: MacKay 2003, Chap. 2.7
(free) Gradient descent: Goodfellow et al 2016, Chap. 4.3
Example: K Nearest Neighbours
pdf slides
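As a companion to the k-NN example above, here is a minimal NumPy sketch of a k-nearest-neighbours classifier; the toy data and the function name `knn_predict` are made up for illustration, not taken from the course materials.

```python
# Minimal k-NN classifier sketch (illustrative only).
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Predict labels for X_test by majority vote among the k nearest
    training points under Euclidean distance."""
    # Pairwise squared distances, shape (n_test, n_train)
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest training points for each test point
    nn = np.argsort(d2, axis=1)[:, :k]
    # Majority vote over the neighbours' labels
    return np.array([np.bincount(y_train[idx]).argmax() for idx in nn])

# Tiny toy example: two well-separated clusters
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.05, 0.1], [5.05, 5.0]])
print(knn_predict(X_train, y_train, X_test, k=3))  # -> [0 1]
```

Note that k-NN has no training phase; all the work happens at prediction time, which is why the curse of dimensionality (Bishop 1.4) matters so much for it.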
Lecture: Thursday, Jan 19 Stochastic gradient descent, Léon Bottou
The momentum method: Coursera video: Neural Networks for Machine Learning Lecture 6.3
Maximum likelihood for a Gaussian: MacKay 2003, Chap. 22.1
Maximum likelihood estimation of a classifier: Hastie et al 2013, Chap. 2.6.3
Regularization: Goodfellow et al 2016, Chap. 7.1
Regularization through data augmentation: Goodfellow et al 2016, Chap. 7.4
Maximum likelihood estimation (MLE)
Optimization and regularization
pdf slides
Tutorial   Tricks to improve SGD
“Tuning/debugging” optimizer
Multivariate Gaussian
Underfitting vs. overfitting
pdf slides
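The gradient-descent material this week can be illustrated with a bare-bones sketch; the step size and iteration count below are arbitrary choices for the toy problem, not course-recommended values.

```python
# Plain gradient descent sketch (illustrative only).
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a function given its gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # move against the gradient
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3); the minimum is x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # -> 3.0
```

Stochastic gradient descent replaces `grad(x)` with a noisy estimate computed on a mini-batch of data, and the momentum method (Coursera Lecture 6.3) adds a decaying running average of past gradients to the update.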
Week 3    
Lecture: Monday, Jan 23   Probabilistic interpretation of linear regression
Optimal regressor
pdf slides (see the joint slide deck from Jan 26)
Assignment 1: Wednesday, Jan 25 Due date: Feb 7 midnight, 2017
k-NN, Gaussian process (bonus), linear regression
Assignment handout
Download Tiny MNIST dataset here
Histogram of results
Lecture: Thursday, Jan 26 Regression and decision theory: Bishop 2006, Chap. 1.5
Bias-variance trade-off: Bishop 2006, Chap. 3.2
Optimal regressor
Feature expansion
Decision theory
joint pdf slides
Tutorial   k-NN, Linear regression
Gaussian process regression
Training, validation and test set
pdf slides
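The linear-regression material in this week's tutorial can be sketched with the regularized normal equations; this is an illustrative NumPy toy, and the function name `fit_linear_regression` and the data are assumptions, not course code.

```python
# Closed-form (ridge) linear regression sketch (illustrative only).
import numpy as np

def fit_linear_regression(X, y, lam=0.0):
    """Solve the regularized normal equations (X^T X + lam I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Noise-free toy data generated from y = 2*x1 + 1 (bias via a column of ones)
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # columns: [bias, x1]
y = np.array([1.0, 3.0, 5.0])
w = fit_linear_regression(X, y)
print(np.round(w, 4))  # -> [1. 2.]
```

With `lam > 0` this becomes ridge regression, trading a little bias for lower variance, which connects directly to the bias–variance discussion in Bishop 3.2.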
Week 4    
Lecture: Monday, Jan 30   Recap decision theory
Logistic regression
Neural networks
pdf slides
Lecture: Thursday, Feb 2 Neural networks: Bishop 2006, Chap. 5, MacKay 2003, Chap. 39-40, 44, Hastie et al 2013, Chap. 11
Neural networks
pdf slides
Tutorial   Logistic regression
Backpropagation examples
pdf slides
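The logistic-regression tutorial topic can be sketched with a short gradient-descent loop on the cross-entropy loss; the toy data, learning rate and iteration count below are made-up illustration values.

```python
# Logistic regression by gradient descent (illustrative sketch).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_grad(w, X, y):
    """Gradient of the average negative log-likelihood (cross-entropy)."""
    return X.T @ (sigmoid(X @ w) - y) / len(y)

# Linearly separable toy data with an explicit bias column
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w = np.zeros(2)
for _ in range(500):
    w -= 1.0 * logistic_grad(w, X, y)
preds = (sigmoid(X @ w) > 0.5).astype(int)
print(preds)  # -> [0 0 1 1]
```

The same gradient expression, applied layer by layer via the chain rule, is what the backpropagation examples in the tutorial generalize to neural networks.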
Week 5    
Lecture: Monday, Feb 6   Multi-class classification
Learning feedforward neural networks
pdf slides
Lecture: Thursday, Feb 9 Convolutional neural networks: cs231n course slides
Transfer learning and fine-tuning: cs231n course slides
Bag-of-tricks for deep neural networks
Types of neural networks: convolutional neural networks, recurrent neural networks
pdf slides
Assignment 2: Thursday, Feb 9 Due date: Feb 27 midnight, 2017
logistic regression, neural networks
Assignment handout (updated Feb 18th)
Download notMNIST dataset here
Histogram of results
Tutorial   Sample midterm review
Assignment 1 post-mortem
Week 6    
Lecture: Mon, Feb 13 Bishop 9.1 and 12.1          k-means clustering, dimensionality reduction
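The k-means topic from this lecture can be sketched with Lloyd's algorithm in a few lines of NumPy; the toy data and fixed iteration count are illustrative assumptions, not course code.

```python
# Minimal k-means (Lloyd's algorithm) sketch (illustrative only).
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centers at k distinct data points
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

X = np.array([[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [4.2, 3.9]])
centers, labels = kmeans(X, k=2)
# The two tight clusters end up in separate groups
print(labels[0] == labels[1] and labels[0] != labels[2])  # -> True
```

Viewed through the lens of Bishop 9, k-means is a hard-assignment limit of the Gaussian-mixture EM algorithm covered in Week 8.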
Study: Thu, Feb 16             Study independently in the classroom, with instructor on hand for questions. Unstructured.
Midterm: Thursday, Feb 16 Time: 6:20-7:50 pm. Sample midterm from 2016
Midterm cheatsheet template
Histogram of results
Tutorial   Midterm exam post-mortem
Week 7    
Lecture: Mon, Feb 27 Bishop 3.3
Murphy 2012: parts of chap. 5 & sec. 7.6
PCA continued, Bayesian methods
Lecture: Thu, Mar 2 Bishop 1.2.6 (Bayesian prediction), 1.3 (model selection), 2.4.2 (conjugate prior) Bayesian learning continued
Tutorial   Examples of PCA, k-Means
Bayesian predictive distribution
Bayesian model comparison
pdf slides
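The PCA examples in this tutorial can be condensed to a few lines using the SVD of the centered data matrix; this is an illustrative sketch, and the toy data below is made up.

```python
# PCA via SVD sketch (illustrative only).
import numpy as np

def pca(X, n_components):
    """Project centered data onto its top principal directions."""
    Xc = X - X.mean(axis=0)                      # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # scores in the top-k subspace

# Toy data lying almost on the line y = x: one component captures it
X = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.0]])
Z = pca(X, n_components=1)
print(Z.shape)  # -> (4, 1)
```

The rows of `Vt` are the principal directions (eigenvectors of the sample covariance), and the singular values `S` give the standard deviations captured along each of them, up to a factor of sqrt(n).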
Week 8    
Lecture: Mon, Mar 6 Mixture of Gaussians: Bishop 9.2  
EM algorithm: Bishop 9.3  
Mixture models, EM algorithm
Assignment 3: Wed, Mar 8 Due date: March 24 midnight, 2017
Unsupervised learning, probabilistic models
Assignment handout (updated Mar 13th)
Download the datasets: data2D, data100D, tinymnist
Download the utility function here
Lecture: Thu, Mar 9 Naive Bayes: Hastie et al 2013, Chap. 6.6.3
Bayesian network: Bishop 8.1, 8.2  
Mixture of Gaussians, Naive Bayes and Bayesian Networks
Tutorial   Introducing A3
Examples of Mixture of Bernoullis
EM algorithm
pdf slides
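The EM algorithm from this week can be sketched for a two-component 1-D Gaussian mixture; for brevity this toy version fixes equal unit variances, which is a simplification of the general algorithm in Bishop 9.2–9.3, and the data is made up.

```python
# Compact EM sketch for a two-component 1-D Gaussian mixture
# (illustrative; variances fixed at 1 for brevity).
import numpy as np

def em_gmm_1d(x, iters=50):
    mu = np.array([x.min(), x.max()])   # crude but effective initialization
    pi = np.array([0.5, 0.5])
    var = 1.0
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        lik = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
        r = lik / lik.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights and means
        pi = r.mean(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    return pi, mu

x = np.array([-5.1, -4.9, -5.0, 5.0, 4.8, 5.2])
pi, mu = em_gmm_1d(x)
print(np.round(np.sort(mu), 1))  # -> [-5.  5.]
```

The mixture-of-Bernoullis examples in the tutorial follow the same E-step/M-step pattern, only with Bernoulli likelihoods in place of Gaussians.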
Week 9    
Lecture: Mon, Mar 13 Bishop 8.1 & 8.2
Parts of Murphy Ch. 10
Russell and Norvig 2009 (AI: A Modern Approach) parts of Ch. 14  
Bayesian networks continued
Lecture: Thu, Mar 16  Bishop 8.3, 8.4.3   Markov Random Fields, factor graphs
Tutorial   Review of graphical models
Conversion between BN, MRF and FG
Inference in graphical models
pdf slides
Week 10    
Lecture: Mon, Mar 20 Russell & Norvig 15.1
Parts of Bishop Chap. 13  
Sequence models
Lecture: Thu, Mar 23 Parts of Russell & Norvig 15.3
Parts of Bishop Chap. 13 
Hidden Markov Models (HMMs)
Tutorial   Review of Markov models
Examples of inference in graphical models
pdf slides
Week 11    
Lecture: Mon, Mar 27 Murphy 17.4
End of Russell & Norvig 15.2
Bishop 13.2.5  
HMM inference/learning
Assignment 4: Mon, Mar 27 Due date: April 9th midnight, 2017
Graphical models, sum-product algorithm
Assignment handout (updated)
Lecture: Thu, Mar 30 Parts of MacKay Chapter 16 and Sections 26.1-26.2
Bishop 8.4.4
Kschischang, Frey and Loeliger: Factor Graphs and the Sum-Product Algorithm Section 2
Frey: Extending Factor Graphs so as to Unify Directed and Undirected Graphical Models Section 2 
Message-passing algorithms (updated notation)
Tutorial   Forward-backward algorithm
The sum-product algorithm
pdf slides
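The forward algorithm covered in this tutorial can be sketched in a few lines; all the probabilities below are made-up illustration values, not from the course materials.

```python
# HMM forward-algorithm sketch (illustrative only).
import numpy as np

def forward(pi, A, B, obs):
    """Return p(observations) by the forward recursion
    alpha_t = (alpha_{t-1} @ A) * B[:, obs_t], summing over hidden paths."""
    alpha = pi * B[:, obs[0]]           # initialize with the first emission
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # propagate, then absorb the emission
    return alpha.sum()

pi = np.array([0.6, 0.4])               # initial state distribution
A = np.array([[0.7, 0.3], [0.4, 0.6]])  # transition matrix
B = np.array([[0.9, 0.1], [0.2, 0.8]])  # emission matrix
p = forward(pi, A, B, obs=[0, 1])
print(round(p, 4))  # -> 0.209
```

This forward pass is exactly sum-product message passing on the HMM's chain-structured factor graph; adding the backward pass gives the posterior marginals over hidden states.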
Week 12    
Lecture: Mon, Apr 3 Murphy 17.4.4 & 20.2
Bishop 8.4.5 & 13.2.5
MacKay 26.3
Max-sum algorithm
Lecture: Thu, Apr 6 MacKay Chapters 16 and 26
Bishop 8.4.7
LBP: MacKay 26.4, Bishop 8.4.7
Junction-tree algorithm, Loopy belief propagation
Tutorial   Review
pdf slides
Week 13    
Lecture: Mon, Apr 10 The first section of Murphy 19.6   Supervised Learning using Graphical Models
Discriminative Approach
Conditional Random Fields (CRFs)
Combining Deep Learning with Graphical Models
Lecture: Thu, Apr 13 All the above Course concepts, the 2013 midterm, and finishing our junction-tree algorithm example
Thu, Apr 20   For study practice: 2013 midterm