CS3DS19-Data Science Algorithms and Tools

Module Provider: School of Mathematical, Physical and Computational Sciences
Number of credits: 10 [5 ECTS credits]
Terms in which taught: Spring term module
Non-modular pre-requisites:
Modules excluded:
Current from: 2019/0

Module Convenor: Dr Giuseppe Di Fatta

Email: g.difatta@reading.ac.uk

Type of module:

Summary module description:

Automated data collection and mature database technology have led to tremendous amounts of data being stored in databases, data warehouses and other information repositories. In this context, automated data analysis and data modelling tools and algorithms (Data Mining) are becoming essential components of any information system. Application areas of these techniques include scientific computing, intelligent business, direct marketing, customer relationship management, market segmentation, store shelf management, data warehouse management, and fraud detection in e-commerce and credit card transactions.


The module covers fundamental techniques and tools for data manipulation and transformation, and data mining algorithms for classification, regression, clustering and association rule mining. In particular, KNIME, one of the leading platforms for Data Science and Machine Learning, will be introduced and adopted for practical activities. We will also collaborate with KNIME to embed Level-1 and Level-2 industrial certification.

This module also encourages students to develop a set of professional skills, such as problem solving, critical analysis of published literature, creativity, technical report writing for technical and non-technical audiences, professional communication (emails, letters, minutes, etc.), self-reflection and effective use of commercial software.

Assessable learning outcomes:

Students are expected to understand the general Data Mining principles and techniques, and to be able to apply them in different contexts. In a practical project a data workflow is designed and developed using advanced tools for data science to combine data mining algorithms and analyse real-world datasets.

Additional outcomes:

Students will become familiar with the potential applications of data mining techniques in different domains. They will also learn how to carry out experimental tests for algorithm performance evaluations.

During this module the students are also offered the opportunity to gain two levels of the KNIME Certification in Data Science.

Outline content:

  • Introduction to Data Mining;

  • Introduction to Data Science and Machine Learning platforms;

  • Data preprocessing;

  • Proximity measures;

  • Regression, Classification and model evaluation;

  • Clustering and cluster validity;

  • Decision Tree Induction;

  • Association Rule Mining.

Brief description of teaching and learning methods:

The module comprises 2 hours of lectures and 2 hours of practical activities per week. The lectures introduce the basic concepts, the tools, and the algorithms used to build Data Science applications. The assessment is based on multiple choice questionnaires and a data science project that allows the students to apply theoretical concepts to a practical case.

Contact hours:
                                   Autumn  Spring  Summer
Lectures                                       20
Practical classes and workshops                16
Guided independent study                       64
Total hours by term                           100
Total hours for module: 100

Summative Assessment Methods:
Method Percentage
Written exam 50
Set exercise 40
Class test administered by School 10

Summative assessment - Examinations:

One examination paper of 90 mins.

Summative assessment - Coursework and in-class tests:

  • In-class test: A test based on a multiple choice questionnaire (10% of credits). This test has been designed to be valid to achieve the KNIME Certification in Data Science Level 1.

  • Set exercise: A coursework assignment (40% of credits). Part of the coursework has been designed to be valid to achieve the KNIME Certification in Data Science Level 2.

Formative assessment methods:

In-class test: A test based on a multiple choice questionnaire: this test has been designed to be valid to achieve the KNIME Certification in Data Science Level 2.

Penalties for late submission:
The Module Convener will apply the following penalties for work submitted late:

  • where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of five working days;
  • where the piece of work is submitted more than five working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.

The University policy statement on penalties for late submission can be found at: http://www.reading.ac.uk/web/FILES/qualitysupport/penaltiesforlatesubmission.pdf
You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.

Assessment requirements for a pass:

A mark of 40% overall.

Reassessment arrangements:

One examination paper of 90 mins duration in August/September. The resit module mark will be the higher of the exam mark (100% exam) and the exam mark plus previous coursework marks (50% exam, 50% coursework including in-class test).

Additional Costs (specified where applicable):

Last updated: 8 April 2019

