### Overview

Statistical Machine Learning plays a key role in science and technology. Some of the basic questions raised are:

- What is a good model for the available data?
- How can we fit the parameters of the model to the available data?
- How will a model perform on data which has yet to be observed?

This course provides a broad but thorough intermediate-level study of the methods and practices of statistical machine learning, emphasising the mathematical, statistical, and computational aspects. Students will learn how to implement efficient machine learning algorithms on a computer, based on principled mathematical foundations. Topics covered will include Bayesian inference and maximum likelihood modelling; regression, classification, density estimation, clustering, principal and independent component analysis; parametric, semi-parametric, and non-parametric models; basis functions, neural networks, kernel methods, and graphical models; deterministic and stochastic optimisation; overfitting, regularisation, and validation.

The course will use Python 3 and Jupyter notebooks for all tutorials, and for any assignment or exam questions that involve programming.

### Course Schedule

### Course Staff

Lecturer: Lexing Xie

Tutors:

| | | |
|---|---|---|
| Alexander Soen | Chamin Hewa Koneputugodage | Shidi Li |
| Tianyu Wang | Josh Nguyen | Minchao Wu |
| Ekaterina (Katya) Nikonova | Ruiqi Li | Haiqing Zhu |
| Belona Sonna | Dillon Chen | Rong Wang |
| Barclay Zhang | Evan Markou | Zhiyuan Wu |

### Textbook

Required: **Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer, 2006** (selected parts), available here

We also recommend:

- Deisenroth, Faisal, and Ong, Mathematics for Machine Learning. Cambridge University Press.
- Moritz Hardt and Benjamin Recht, Patterns, Predictions and Actions: A story about machine learning
- MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press
- Murphy, Probabilistic Machine Learning: An Introduction, MIT Press, 2021

### Course sites

Piazza will be used for all course discussions.

Sign up at http://piazza.com/anu.edu.au/spring2022/comp4670comp8600 with access code “logistic_regression”.

Microsoft Teams (ANU edition) will be used to hold lectures and labs/tutorials each week. The link to the SML-2021 Team is here; use code “87v89zy” to join.

Gradescope will be used to manage assignment submissions and give feedback.

- Register for Gradescope at http://gradescope.com using your ANU e-mail and include your student ID number with sign-up. Use the entry code 3YNNKW to sign up for this course.
- A detailed guide on how to submit your assignment to Gradescope is here: https://help.gradescope.com/article/ccbpppziu9-student-submit-work

Wattle will be used to host the final exam.

### Assessments

- Quiz 1, 2.5% (due week 4)
- Quiz 2, 2.5% (due week 8)
- Assignment 1, 20% (due Mon 12:00 noon week 6, Canberra time)
- Assignment 2, 20% (due Mon 12:00 noon week 11, Canberra time)
- Video assignment, 20% (due Tue 23:59 week 12, Canberra time)
- Final exam, 35% (online via Wattle, during the examination period)

#### Online quiz expectations

- Each quiz will be conducted on Wattle and graded automatically.
- Students can attempt the quiz once, with no time limit.
- Open book – students are expected to complete the quiz by themselves, and are free to consult the textbook, notes, or relevant internet resources.
- Each quiz is redeemable with the final exam, i.e. the score for each quiz is calculated as *Qx’ = max(Qx, Final)*, where *Qx* is the raw quiz score for Quiz 1 (*Q1*) or Quiz 2 (*Q2*), out of 100, and *Final* is the score for the final exam, out of 100.
- There will be NO late period for either quiz. Special consideration requests will also NOT be accepted, due to the rapid feedback cycle and redeemable nature of the quizzes.
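
As an illustration of the redemption rule, here is a minimal Python sketch. It assumes every component is scored out of 100 and that the overall mark is a plain weighted sum using the percentages listed under Assessments; the names `WEIGHTS`, `redeemed_quiz_score`, and `course_mark` are hypothetical, and the official calculation may differ (for example through scaling or rounding).

```python
# Assessment weights in percent, copied from the Assessments list.
WEIGHTS = {
    "quiz1": 2.5,
    "quiz2": 2.5,
    "assignment1": 20.0,
    "assignment2": 20.0,
    "video": 20.0,
    "final": 35.0,
}


def redeemed_quiz_score(quiz: float, final: float) -> float:
    """Qx' = max(Qx, Final), with both scores out of 100."""
    return max(quiz, final)


def course_mark(scores: dict) -> float:
    """Weighted course mark out of 100.

    Assumption: a plain weighted sum with no scaling or rounding.
    """
    adjusted = dict(scores)
    for quiz in ("quiz1", "quiz2"):
        adjusted[quiz] = redeemed_quiz_score(scores[quiz], scores["final"])
    return sum(WEIGHTS[name] * adjusted[name] for name in WEIGHTS) / 100


example = {
    "quiz1": 40, "quiz2": 90,
    "assignment1": 80, "assignment2": 70,
    "video": 75, "final": 60,
}
print(course_mark(example))
```

In this example Quiz 1 (40) is replaced by the final exam score (60), while Quiz 2 (90) keeps its raw score because it is already higher.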

#### Paired assignment expectations

- Students can submit Assignment 1 and Assignment 2 individually or in a self-formed team of 2 people.
- Students submitting in a pair act as one unit:
  - Both students should fully understand all the answers in their submission.
  - Each student in the pair must understand the solution well enough to reconstruct it on their own.
- In the case of a team submission, the assignment should include a brief statement about who contributed what.
- In most cases, students in the same team can expect to get the same mark for the team assignment.

#### Video assignment

The video assignment is an individual assignment.

- Each student is expected to upload a video talking about one topic from the assignments or labs, and the thinking behind it.
- The video should be between 4 and 8 minutes long. Under- and over-length videos attract a penalty of 1 point per 10 seconds (or part thereof).
- Grading scheme for the video assignment will be made available in advance of the due date.
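
The length penalty can be sketched as follows. This is an illustration only: it assumes the 4 and 8 minute limits are inclusive and that the penalty is counted in points off the video mark, and the name `video_length_penalty` is hypothetical. The published grading scheme takes precedence.

```python
import math

MIN_SECONDS = 4 * 60  # lower limit, assumed inclusive
MAX_SECONDS = 8 * 60  # upper limit, assumed inclusive


def video_length_penalty(duration_seconds: float) -> int:
    """Points deducted: 1 per 10 seconds (or part thereof) outside the limits."""
    if duration_seconds < MIN_SECONDS:
        deviation = MIN_SECONDS - duration_seconds
    elif duration_seconds > MAX_SECONDS:
        deviation = duration_seconds - MAX_SECONDS
    else:
        return 0
    # "or part thereof": any started 10-second block counts in full.
    return math.ceil(deviation / 10)


print(video_length_penalty(8 * 60 + 11))  # 11 seconds over the limit
```

For instance, a video that runs 11 seconds over the limit falls into a second 10-second block and so loses 2 points under these assumptions.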

#### Late policy

This policy applies to Assignment 1, Assignment 2, and the video assignment.

Assignment submissions that are between 1 minute and 24 hours late attract a 5% penalty (of the possible marks available).

Submissions late by more than 24 hours will not be accepted.
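
A minimal sketch of how this policy could be applied to a single submission. It assumes lateness is measured in whole minutes, that the 5% is taken from the maximum available marks rather than from the mark earned, and that a mark cannot drop below zero; the last point is an assumption not stated in the policy, and `apply_late_policy` is a hypothetical name.

```python
from typing import Optional

LATE_WINDOW_MINUTES = 24 * 60  # submissions later than this are not accepted
LATE_PENALTY_PERCENT = 5       # percent of the possible marks available


def apply_late_policy(raw_mark: float, max_mark: float,
                      minutes_late: int) -> Optional[float]:
    """Return the mark after the late penalty, or None if not accepted."""
    if minutes_late <= 0:
        return raw_mark  # submitted on time
    if minutes_late > LATE_WINDOW_MINUTES:
        return None  # more than 24 hours late
    penalty = max_mark * LATE_PENALTY_PERCENT / 100
    # Assumption: the penalised mark is floored at zero.
    return max(raw_mark - penalty, 0.0)


print(apply_late_policy(16, 20, 30))  # 30 minutes late, marked out of 20
```

Under these assumptions, a submission marked 16 out of 20 and handed in 30 minutes late loses 1 mark (5% of 20).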

### Enrolment questions

*To enrol in this course you must have completed the pre-requisites as per the COMP4670 or COMP8600 course description.*

The topics covered in this course have some overlap with a number of courses in the major for Statistical Data Analytics. Please have a look at the first few tutorial sheets for an indication of the kinds of mathematics and statistics that we will build upon.

If ISIS does not let you enrol but you believe you should be able to (e.g. you have taken courses equivalent to the pre-requisites at a different university), then submit a permission code application here.