
DAT550_1

Data mining

This is the study programme for 2019/2020. It is subject to change.


The purpose of this course is for students to gain knowledge and practical experience of data mining techniques. The lectures give students in-depth knowledge of the relevant technologies and enable them to prepare large-scale data for data mining (pre-processing) and to use a number of data mining methods to extract actionable knowledge. The course gives students the opportunity to learn state-of-the-art data mining algorithms and tools, and to gain hands-on experience applying these tools to real data.

Learning outcome

Knowledge:
  • Theory and practice of data preparation, selection and mining.
  • Concepts, methods, and techniques to gain insights from large-scale data.

Skills:
  • Frequent itemset mining, association rule mining, clustering, classification, graph and stream mining
  • Process and prepare large-scale data for various data mining tasks
  • Implement data mining pipelines, evaluate, and tune parameters for various data mining models using state-of-the-art tools

General competencies:
  • Identify the theoretical and practical issues behind various data mining techniques. List and describe the strengths, limitations and trade-offs of various data mining techniques, and choose appropriate techniques for solving data science problems in various applications.

Contents

  • Data cleansing, transformation and preparation
  • Dimensionality reduction
  • Recommendation systems
  • Graph mining
  • Classification
  • Neural Networks and Deep learning
  • Clustering
  • Mining frequent patterns, associations and correlations
  • Mining data streams

Required prerequisite knowledge

None.

Recommended previous knowledge

DAT540 Introduction to data science, STA500 Probability and Statistics 2

Exam

Written exam and project report
                Weight  Duration  Marks  Aid
Written exam    3/5     4 hours   A - F  1)
Project report  2/5               A - F
1) Textbooks and lecture notes

Coursework requirements

Mandatory assignments
Three mandatory ungraded (Pass/Fail) exercises/programming assignments
All programming exercises must be passed in order to sit the written exam and to have the project approved. Mandatory lab assignments must be completed at the times and in the groups assigned. Absence due to illness or other reasons must be communicated to the laboratory personnel as soon as possible. Students cannot expect arrangements for completing lab assignments at other times unless this has been agreed with the laboratory personnel in advance. Failure to complete the assigned labs on time, or to have them approved, will result in being barred from taking the exam in the course.

Course teacher(s)

Course coordinator
Vinay Jayarama Setty
Head of Department
Tom Ryen

Method of work

4 hours of lectures/exercises and 2 hours of guided programming exercises and project work. Programming exercises require additional non-guided work.

Overlapping courses

Course Reduction (SP)
Web Search and Data Mining (DAT630_1) 5

Open to

Admission to Single Courses at the Faculty of Science and Technology
Computer Science - Master's Degree Programme

Course assessment

Form and/or discussion.

Literature

  1. Data Mining: Practical Machine Learning Tools and Techniques, Third Edition, by Ian H. Witten, Eibe Frank, Mark A. Hall
  2. Introduction to Data Mining, Second Edition, by Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Anuj Karpatne
  3. For labs: Python Data Science Handbook, by Jake VanderPlas, https://jakevdp.github.io/PythonDataScienceHandbook/ (free ebook available; no need to buy)



Last updated: 12.11.2019
