CSMDM16 - Data Analytics and Mining

Module Provider: Computer Science
Number of credits: 10 [5 ECTS credits]
Level: 7
Terms in which taught: Autumn term module
Pre-requisites:
Non-modular pre-requisites:
Co-requisites:
Modules excluded:
Current from: 2020/1

Module Convenor: Dr Giuseppe Di Fatta

Email: g.difatta@reading.ac.uk

Type of module:

Summary module description:

This module covers data analytics and data mining.


Aims:

Automated data collection tools and mature database technology have led to tremendous amounts of data being stored in databases, data warehouses and other information repositories. Automated data analytics and mining techniques are becoming essential components of any information system. In the Knowledge Discovery process, large data sets have to be cleaned, pre-processed, selected, merged, etc., and finally processed for the automatic extraction of interesting knowledge, such as descriptive and predictive models. The techniques range from statistics to machine learning and information science.
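As an illustration only, the stages named above (cleaning, merging and extraction of a descriptive model) can be sketched in a few lines of Python. The records, the field names and the helper functions `clean`, `merge` and `describe` are hypothetical and are not taken from the module materials, which use R and KNIME for practical work.

```python
from statistics import mean

# Hypothetical raw records: (customer_id, age, monthly_spend).
# None marks a missing value.
raw_customers = [
    (1, 34, 120.0),
    (2, None, 80.0),   # missing age: dropped during cleaning
    (3, 51, 300.0),
    (3, 51, 300.0),    # exact duplicate: dropped during cleaning
    (4, 29, 60.0),
]

# Hypothetical second data source, to be merged on customer_id.
segments = {1: "retail", 3: "business", 4: "retail"}


def clean(records):
    """Drop records with missing fields and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for record in records:
        if None in record or record in seen:
            continue
        seen.add(record)
        cleaned.append(record)
    return cleaned


def merge(records, segment_by_id):
    """Join each record with its segment; records without a segment are skipped."""
    return [
        (cid, age, spend, segment_by_id[cid])
        for cid, age, spend in records
        if cid in segment_by_id
    ]


def describe(records):
    """A very simple descriptive model: mean monthly spend per segment."""
    by_segment = {}
    for _cid, _age, spend, segment in records:
        by_segment.setdefault(segment, []).append(spend)
    return {segment: mean(values) for segment, values in by_segment.items()}


prepared = merge(clean(raw_customers), segments)
summary = describe(prepared)
print(summary)
```

Each stage is a separate function so that it can be inspected or replaced on its own, which loosely mirrors how a workflow tool chains processing nodes.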



This module focuses on concepts, methodologies, algorithms and tools for the design, management and deployment of the Knowledge Discovery process. In particular, tools for data analytics (R) and workflow management (KNIME) will be used in hands-on activities on several test cases. Students will learn general Data Mining principles and techniques and will apply them in different application domains.



This module also encourages students to develop a set of professional skills such as problem solving, critical analysis of published literature, creativity, technical report writing for technical and non-technical audiences, professional communication (emails, letters, minutes, etc.), self-reflection and effective use of commercial software.


Assessable learning outcomes:
Students are expected to understand the general Knowledge Discovery process and the various Data Mining algorithms and techniques, and to be able to apply them in different contexts. During practical activities, students will use state-of-the-art tools and languages to implement data analytics and mining solutions for different application domains.

Additional outcomes:
Students will become familiar with the potential applications of data mining techniques in different domains. They will also learn how to carry out experimental tests to evaluate algorithm performance.
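One common form of experimental test is k-fold cross-validation. The sketch below is a minimal illustration, assuming a 1-nearest-neighbour classifier and a hypothetical toy data set; the function names `nearest_neighbour_predict` and `cross_validate` are invented for this example and do not come from the module materials.

```python
import random


def nearest_neighbour_predict(train, point):
    """Return the label of the training example closest to `point` (1-NN)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _features, label = min(
        train, key=lambda example: squared_distance(example[0], point)
    )
    return label


def cross_validate(data, k, seed=0):
    """Estimate classification accuracy with k-fold cross-validation."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    # Split the shuffled data into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    correct = 0
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        correct += sum(
            nearest_neighbour_predict(train, x) == y for x, y in test_fold
        )
    return correct / len(data)


# Hypothetical toy data: two well-separated groups labelled "a" and "b".
data = [
    ((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"), ((0.3, 0.2), "a"),
    ((5.0, 5.1), "b"), ((5.2, 4.9), "b"), ((4.8, 5.3), "b"), ((5.1, 5.0), "b"),
]

accuracy = cross_validate(data, k=4)
print(f"Estimated accuracy: {accuracy:.2f}")
```

Every example is used for testing exactly once and is never part of the training set that classifies it, so the estimate is less optimistic than accuracy measured on the training data.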

Outline content:

• Introduction to the Knowledge Discovery process;
• Data selection, pre-processing and cleaning;
• Data mining algorithms (classification, clustering, etc.);
• Workflow management systems (KNIME);
• The R Project for Statistical Computing.


Brief description of teaching and learning methods:

The module comprises lectures (20 hours), practical sessions (10 hours) and a major piece of coursework. The lectures introduce the basic concepts, algorithms and tools for Data Analytics and Mining. During the practical sessions, tools for data analytics and workflow management will be used in hands-on activities on several test cases. A final project allows students to apply the concepts to a practical case.



Recommended Text:



Introduction to Data Mining. Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Addison-Wesley. ISBN-10: 0321420527, ISBN-13: 9780321420527.

Data Mining: Practical Machine Learning Tools and Techniques (Second Edition). Ian H. Witten, Eibe Frank. Morgan Kaufmann. ISBN 0-12-088407-0.

Data Mining: Concepts and Techniques (Second Edition). Jiawei Han, Micheline Kamber. Morgan Kaufmann Publishers, March 2006. ISBN-13: 978-1-55860-901-3, ISBN-10: 1-55860-901-6.


Contact hours:
                                   Autumn  Spring  Summer
Lectures                           20
Practical classes and workshops    10
Guided independent study           70
Total hours by term                100

Total hours for module: 100

Summative Assessment Methods:
Method Percentage
Written assignment including essay 100

Summative assessment - Examinations:
N/A

Summative assessment - Coursework and in-class tests:

One individual final project (100%)


Formative assessment methods:

Penalties for late submission:
Penalties for late submission on this module are in accordance with the University policy. Please refer to page 5 of the Postgraduate Guide to Assessment for further information: http://www.reading.ac.uk/internal/exams/student/exa-guidePG.aspx

Assessment requirements for a pass:

50% overall module mark.


Reassessment arrangements:

One 2-hour examination paper in August/September.


Additional Costs (specified where applicable):
1) Required text books:
2) Specialist equipment or materials:
3) Specialist clothing, footwear or headgear:
4) Printing and binding:
5) Computers and devices with a particular specification:
6) Travel, accommodation and subsistence:

Last updated: 16 April 2020

THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.
