ST3SML-Statistical Data Science and Machine Learning

Module Provider: Mathematics and Statistics
Number of credits: 10 [5 ECTS credits]
Terms in which taught: Spring term module
Pre-requisites: (ST1PS Probability and Statistics or ST2PS Probability and Statistics) and (MA1MSP Mathematical and Statistical Programming or MA2MPR Mathematical Programming or MA2NA1 Numerical Analysis I)
Non-modular pre-requisites:
Modules excluded:
Current from: 2018/9

Module Convenor: Dr Richard Everitt


Type of module:

Summary module description:

The topics of Data Science, Machine Learning and Artificial Intelligence have recently become part of the public consciousness, in part due to their successful application in industry (most notably at large technology companies). Many of the most successful techniques used in these fields are underpinned by statistical techniques. This module begins by covering some of these underpinning techniques, and shows how they may be applied to problems in Data Science (for inference in implicit models) and Machine Learning (for classification).


This module aims to give students a solid understanding of the types of methods that are used in Machine Learning, and the ability to implement and use some of them. It also aims to connect students with research being conducted in this area.

Assessable learning outcomes:

By the end of the module it is expected that the student will be able to:

  • use and explain underpinning statistical methods for Data Science and Machine Learning;

  • produce software implementations of the methods taught in the module;

  • use approximate Bayesian computation and classification techniques to analyse data.

Additional outcomes:

The student will also gain experience of reading the scientific literature and learning about current research.

Outline content:

The module will begin with an introduction to Data Science, Machine Learning and Artificial Intelligence, then describe the ideas that underpin the statistical approach to these topics (maximum likelihood, Bayesian models and Bayesian inference using Monte Carlo). This leads to a topic in Data Science that has recently become an area of interest to the research community: the use of “approximate Bayesian computation” (ABC) for inference in “implicit” models (statistical models that are defined by a black box simulator). The module then switches attention to Machine Learning, covering regression and classification, including linear and logistic regression, simple classifiers and neural networks.
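
As an illustration of the ABC idea mentioned above, the following is a minimal sketch of ABC rejection sampling in Python. It is not taken from the module materials. The simulator (a Normal distribution with unknown mean), the Uniform(-5, 5) prior, the sample mean as summary statistic, the tolerance epsilon and all function names are assumptions chosen for illustration only.

```python
import random
import statistics


def simulator(theta, n, rng):
    """Stand-in for a black box simulator (assumed for illustration).

    We can draw data given theta but pretend the likelihood is unavailable.
    Here the data are simply Normal(theta, 1).
    """
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, n_draws, epsilon, rng):
    """Keep prior draws whose simulated summary lies within epsilon
    of the observed summary (the sample mean in this sketch)."""
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from the assumed prior
        simulated = simulator(theta, len(observed), rng)
        if abs(statistics.fmean(simulated) - obs_summary) < epsilon:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulator(2.0, 50, rng)  # synthetic "observed" data, true theta = 2
posterior_sample = abc_rejection(observed, n_draws=5000, epsilon=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior_sample)
print(len(posterior_sample), round(posterior_mean, 2))
```

The accepted values of theta form an approximate sample from the posterior. A smaller epsilon gives a better approximation but accepts fewer draws, so the tolerance trades accuracy against computational cost.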

Brief description of teaching and learning methods:

The core material will be delivered in 15 lectures. These will be supported by material from the book “Bayesian Reasoning and Machine Learning”, which is freely available online, along with some accessible sections of research articles and blog posts. This range of sources will expose students to the way a Data Scientist working in industry or academia would learn their subject. It will give interested students a route to explore the subject more widely, while still providing an easy-to-follow path through the material.

There will be 5 practical PC lab sessions interspersed between the lectures. Each will give students the chance to learn to code up concepts from the module. The concepts will initially be covered in the practical sessions, where they are treated simply as algorithms, before being covered in the lectures, where the underpinning ideas will be explained. The aim is to give students an understanding of the purpose of the methods before they encounter the mathematics.

There will be one assignment, handed out at the beginning of the module and due at the end. The assignment will consist of 5 different problems, each of which must be solved using software implementations of the algorithms in the module. Each of the 5 PC labs will cover a problem that is very close to one given in the assignment, in order to motivate students to attend the PC labs and engage with the module as it progresses.

Additional support with programming will be offered where required.

Contact hours:
                                   Autumn   Spring   Summer
Lectures                                    15
Practical classes and workshops             5
Guided independent study                    80
Total hours by term                         100.00

Total hours for module: 100.00

Summative Assessment Methods:
Method Percentage
Written exam 70
Set exercise 30

Summative assessment - Examinations:

One exam, 2 hours

Summative assessment - Coursework and in-class tests:

One assignment, with questions that are related to content covered in practicals.

Formative assessment methods:

Feedback given during practicals.

Penalties for late submission:
The Module Convenor will apply the following penalties for work submitted late:

  • where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of five working days;
  • where the piece of work is submitted more than five working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.

The University policy statement on penalties for late submission can be found at:

You are strongly advised to ensure that coursework is submitted by the relevant deadline. It is better to submit work in an unfinished state than to fail to submit any work.

Assessment requirements for a pass:

A mark of 40% overall.

Reassessment arrangements:

One examination paper of 2 hours' duration in August/September. The resit module mark will be the higher of the exam mark (100% exam) and the exam mark plus previous coursework marks (70% exam, 30% coursework).

Additional Costs (specified where applicable):

Required text books

Specialist equipment or materials

Specialist clothing, footwear or headgear

Printing and binding

Computers and devices with a particular specification

Travel, accommodation and subsistence

Last updated: 20 April 2018

