University of Information Technology

Data Analysis and Management

Course Description

Data mining algorithms and computational paradigms allow computers to find patterns and regularities in databases, perform prediction and forecasting, and generally improve their performance through interaction with data. Data mining is currently regarded as the key element of a more general process called knowledge discovery, which deals with extracting useful knowledge from raw data. The knowledge discovery process includes data selection, cleaning, coding, the application of statistical and machine learning techniques, and visualization of the generated structures. Special emphasis will be given to machine learning methods, as they provide the real knowledge discovery tools. Important related technologies such as data warehousing and on-line analytical processing (OLAP) will also be discussed.

The course also gives an overview of Big Data Analytics, including applications, market trends, fundamental platforms such as Hadoop and Spark, and other tools such as Linked Big Data. It introduces several data storage methods and shows how to upload, distribute, and process data with them, covering HDFS, HBase, key-value (KV) stores, document databases, and graph databases.
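As an illustration of the knowledge discovery steps named in the course description (selection, cleaning, coding, applying a simple statistical technique, and visualizing the result), the following is a minimal Python sketch. The dataset, the function names, and the midpoint-threshold classifier are all hypothetical choices made for this example; they are not taken from the course materials or from any particular data mining tool.

```python
from statistics import mean

# Hypothetical raw records: (student_id, hours_studied, result).
# None marks a missing value. This data is invented for illustration.
RAW = [
    ("a", 2.0, "fail"),
    ("b", 9.0, "pass"),
    ("c", None, "pass"),
    ("d", 3.0, "fail"),
    ("e", 8.0, "pass"),
]

def select(records):
    """Selection: keep only the attributes relevant to the task."""
    return [(hours, result) for _, hours, result in records]

def clean(records):
    """Cleaning: drop rows that contain a missing value."""
    return [(h, r) for h, r in records if h is not None]

def encode(records):
    """Coding: map the categorical label to a number (pass=1, fail=0)."""
    return [(h, 1 if r == "pass" else 0) for h, r in records]

def fit_threshold(records):
    """Mining: a toy classifier whose threshold is the midpoint of the class means."""
    pos = mean(h for h, y in records if y == 1)
    neg = mean(h for h, y in records if y == 0)
    return (pos + neg) / 2

def predict(threshold, hours):
    """Prediction: classify a new observation with the learned threshold."""
    return 1 if hours >= threshold else 0

data = encode(clean(select(RAW)))
threshold = fit_threshold(data)

# Visualization: a plain-text bar per record, one '#' per full hour studied.
for hours, label in data:
    print(f"{hours:4.1f} {'#' * int(hours)} label={label}")
print("threshold:", threshold)
```

In practice each stage would be handled by dedicated tools, and the mining step would use a proper learning algorithm; the sketch only shows how the stages feed into one another.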

This course aims to enable students to

  • Discuss the basic concepts and techniques of Data Mining.
  • Compare the various data mining tools and choose appropriate techniques by analyzing the problem at hand.
  • Demonstrate an understanding of database systems and their underlying theories.
  • Identify the Big Data Analytics knowledge needed to handle various real-world challenges.

Intended Learning Outcomes (ILO)

On completion of this course, students should be able to

  • Describe data management techniques to store data locally and in cloud infrastructures
  • Demonstrate statistical methods and visualization to quickly explore data
  • Compute statistics and perform computational analysis to make predictions based on data
  • Develop hands-on applications using example datasets
  • Develop skills in using recent data mining software to solve practical problems
  • Implement a system using Big Data (NoSQL) tools

Textbooks and Reference Books

Textbooks

  1. Data Mining: Concepts and Techniques by Jiawei Han, Micheline Kamber, and Jian Pei
  2. Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten and Eibe Frank
  3. Big Data Analytics: Emerging Business Intelligence and Analytics Trends for Today’s Businesses

References

  1. Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar
  2. Cloud Computing and Big Data Basics by Fujitsu ICT Lab

Assessment System

Evaluation                                              Marks       Percentage
Class Participation                                     10 Marks    10%
Tutorial/Project/Assignments/Discussion/Presentation    30 Marks    30%
Final Examination                                       60 Marks    60%