CSMDM16-Data Analytics and Mining

Module Provider: Computer Science
Number of credits: 10 [5 ECTS credits]
Terms in which taught: Autumn term module
Non-modular pre-requisites:
Modules excluded: SEMDM13 Data Analytics and Mining
Module version for: 2017/8

Module Convenor: Dr Giuseppe Di Fatta

Summary module description:

This module covers data analytics and data mining.

Automated data collection tools and mature database technology have led to tremendous amounts of data being stored in databases, data warehouses and other information repositories. Automated data analytics and mining techniques are becoming essential components of any information system. In the Knowledge Discovery process, large data sets have to be cleaned, pre-processed, selected, merged, etc., and finally processed for the automatic extraction of interesting knowledge, such as descriptive and predictive models. The techniques range from statistics to machine learning and information science.

This module focuses on concepts, methodologies, algorithms and tools for the design, management and deployment of the Knowledge Discovery process. In particular, tools for data analytics (R) and workflow management (KNIME) will be used for hands-on activities on several test cases. Students will learn general Data Mining principles and techniques and will apply them in different application domains.

Assessable learning outcomes:
Students are expected to understand the general Knowledge Discovery process and the various Data Mining algorithms and techniques, and to be able to apply them in different contexts. During practical activities, students will use state-of-the-art tools and languages to implement data analytics and mining solutions for different application domains.

Additional outcomes:
Students will become familiar with the potential applications of data mining techniques in different domains. They will also learn how to carry out experimental tests to evaluate algorithm performance.

Outline content:

• Introduction to the Knowledge Discovery process;

• Data selection, pre-processing and cleaning;

• Data mining algorithms (classification, clustering, etc.);

• Workflow management systems (KNIME);

• The R Project for Statistical Computing.

Brief description of teaching and learning methods:

The module comprises lectures (20 hours), practical sessions (10 hours) and a major piece of coursework. The lectures introduce the basic concepts, algorithms and tools for Data Analytics and Mining. During the practical sessions, tools for data analytics and workflow management will be used for hands-on activities on several test cases. A final project allows students to apply the concepts to a practical case.

Contact hours:
                                   Autumn    Spring    Summer
Lectures                           20
Practical classes and workshops    10
Guided independent study           70
Total hours by term                100.00
Total hours for module             100.00

Summative Assessment Methods:
Method Percentage
Written assignment including essay 100

Other information on summative assessment:
Final project (100%)

Formative assessment methods:

Penalties for late submission:
Penalties for late submission on this module are in accordance with the University policy. Please refer to page 5 of the Postgraduate Guide to Assessment for further information: http://www.reading.ac.uk/internal/exams/student/exa-guidePG.aspx

Length of examination:

Requirements for a pass:
50% overall module mark

Reassessment arrangements:
Resit by examination

