CS3DS19-Data Science Algorithms and Tools

Module Provider: Computer Science
Number of credits: 10 [5 ECTS credits]
Level: 6
Terms in which taught: Spring term module
Pre-requisites:
Non-modular pre-requisites:
Co-requisites:
Modules excluded:
Current from: 2020/1

Module Convenor: Dr Giuseppe Di Fatta

Email: g.difatta@reading.ac.uk

Type of module:

Summary module description:

Automated data collection and mature database technology have led to tremendous amounts of data being stored in databases, data warehouses and other information repositories. In this context, automated data analysis and data modelling tools and algorithms (Data Mining) are becoming essential components of any information system. Application areas of these techniques include scientific computing, business intelligence, direct marketing, customer relationship management, market segmentation, store shelf management, data warehouse management, and fraud detection in e-commerce and in credit card transactions.


Aims:

The study of fundamental techniques and tools for data manipulation and transformation, and of data mining algorithms for classification, regression, clustering and association rule mining. In particular, KNIME, one of the leading platforms for Data Science and Machine Learning, will be introduced and adopted for practical activities.



This module also encourages students to develop a set of professional skills, such as problem solving, critical analysis of published literature, creativity, technical report writing for technical and non-technical audiences, professional communication (emails, letters, minutes, etc.), self-reflection and effective use of commercial software.


Assessable learning outcomes:

Students are expected to understand the general Data Mining principles and techniques, and to be able to apply them in different contexts. In a practical project, students design and develop a data workflow using advanced data science tools, combining data mining algorithms to analyse real-world datasets.


Additional outcomes:

Students will become familiar with the potential applications of data mining techniques in different domains. They will also learn how to carry out experimental tests for algorithm performance evaluations.


Outline content:


  • Introduction to Data Mining;

  • Introduction to Data Science and Machine Learning platforms;

  • KNIME;

  • Data preprocessing;

  • Proximity measures;

  • Regression, Classification and model evaluation;

  • Clustering and cluster validity;

  • Decision Tree Induction;

  • Association Rule Mining.


Brief description of teaching and learning methods:

The module comprises lectures and practical activities. The lectures introduce the basic concepts, tools and algorithms used to build Data Science applications. The assessment is based on multiple-choice questionnaires and a data science project that allows students to apply theoretical concepts to a practical case.


Contact hours:
  Autumn Spring Summer
Lectures 20
Guided independent study: 80
       
Total hours by term 0 0
       
Total hours for module 100

Summative Assessment Methods:
Method         Percentage
Written exam   50
Set exercise   50

Summative assessment- Examinations:

One 1.5-hour examination paper in May/June.


Summative assessment- Coursework and in-class tests:

The coursework consists of one element:

  • Set exercise: A coursework assignment (50%)


Formative assessment methods:

Penalties for late submission:

The Module Convenor will apply the following penalties for work submitted late:

  • where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of five working days;
  • where the piece of work is submitted more than five working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.
The University policy statement on penalties for late submission can be found at: http://www.reading.ac.uk/web/FILES/qualitysupport/penaltiesforlatesubmission.pdf
You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.
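
The penalty rule above can be read as simple arithmetic. The following is a minimal Python sketch for illustration only and is not part of the University policy: the function name `late_penalty_mark`, the default of 100 available marks, and the choice to floor the penalised mark at zero are all assumptions.

```python
import math


def late_penalty_mark(raw_mark, working_days_late, marks_available=100):
    """Illustrative reading of the late-submission rule (not official policy)."""
    if working_days_late <= 0:
        return raw_mark  # submitted on time: no deduction
    # "or part thereof": any fraction of a working day counts as a full day
    days = math.ceil(working_days_late)
    if days > 5:
        return 0  # more than five working days late: a mark of zero
    # 10% of the marks available is deducted for each working day late
    deduction = marks_available * days / 10
    # Assumption: the penalised mark cannot fall below zero
    return max(0, raw_mark - deduction)
```

Under these assumptions, a piece of work marked 65 out of 100 and submitted a day and a half late would lose 20 marks, giving 45.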

Assessment requirements for a pass:

A mark of 40% overall.


Reassessment arrangements:

One 2-hour examination paper in August/September.  Note that the resit module mark will be the higher of (a) the mark from this resit exam and (b) an average of this resit exam mark and previous coursework marks, weighted as per the first attempt (50% exam, 50% coursework). 
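
As a worked illustration of the resit rule above, the sketch below computes both candidate marks and takes the higher. It is an illustrative example rather than an official calculation: the function name `resit_module_mark` and the sample marks in the note that follows are hypothetical.

```python
def resit_module_mark(resit_exam, coursework, exam_weight=0.5):
    """Higher of (a) the resit exam mark and (b) the weighted average of
    the resit exam mark and the previous coursework mark."""
    # Option (b): same weighting as the first attempt (50% exam, 50% coursework)
    weighted = exam_weight * resit_exam + (1 - exam_weight) * coursework
    # The module mark is whichever of (a) and (b) is higher
    return max(resit_exam, weighted)
```

For example, with hypothetical marks of 50 for the resit exam and 30 for the coursework, option (b) gives 40, so the exam-only mark of 50 stands. With 36 and 60, the weighted average of 48 is used instead.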


Additional Costs (specified where applicable):

Last updated: 16 April 2020

THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.
