CIS 667: Advanced Data Mining


Course Code: CIS 667
Course Name: Advanced Data Mining
Level: Master
Credit Hours: 3

Course Prerequisites: Good knowledge of machine learning.

 


Course Overview:

The course covers the following material:

a) Fundamentals: data mining concepts and functions, data preprocessing, data reduction, mining association rules in large databases, classification and prediction techniques, cluster analysis algorithms, data mining languages, data mining applications, and new trends.

b) Advanced knowledge discovery in semi-structured and unstructured data repositories, with emphasis on emerging computational intelligence paradigms such as soft computing and artificial life. Applications will be visited in special themes: advanced transactional data mining, Web mining, text mining, bioinformatics, and other scientific and engineering applications.

Textbook:

Data Mining: Concepts and Techniques, 1st or 2nd Ed., Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2003 or 2006. ISBN 1-55860-901-6
Book website: http://www-faculty.cs.uiuc.edu/~hanj/bk2/index.html
 

Note: From this website, students can download the original book slides prepared by the authors of the book.

Course Outline

Introduction

1 What Motivated Data Mining? Why Is It Important?
2 So, What Is Data Mining?
3 Data Mining--On What Kind of Data?
4 Data Mining Functionalities—What Kinds of Patterns Can Be Mined?
5 Are All of the Patterns Interesting?
6 Classification of Data Mining Systems
7 Data Mining Task Primitives
8 Integration of a Data Mining System with a Database or Data Warehouse System
9 Major Issues in Data Mining
10 Data Mining Applications
11 Data Mining System Products and Research Prototypes
12 Social Impacts of Data Mining

Data Preprocessing

1 Why Preprocess the Data?
2 Descriptive Data Summarization
3 Data Cleaning
4 Data Integration and Transformation
5 Data Reduction
6 Data Discretization and Concept Hierarchy Generation 
7 Feature Selection Techniques

Mining Frequent Patterns and Associations

1 Basic Concepts and a Road Map
2 Efficient and Scalable Frequent Itemset Mining Methods
3 Mining Various Kinds of Association Rules
4 Using WEKA Software to Find Association Rules

Classification and Prediction

1 What Is Classification? What Is Prediction?
2 Issues Regarding Classification and Prediction
3 Classification by Decision Tree Induction
4 Bayesian Classification
5 Rule-Based Classification
6 Prediction
7 Accuracy and Error Measures
8 Evaluating the Accuracy of a Classifier or Predictor
9 Using WEKA Software for Data Classification
10 Using Oracle Data Mining

Classification Using Lazy Learning Techniques

1 Tasks of concept learning and classification 
2 Features of lazy learning 
3 Similarity measures 
4 Calculating and explaining similarity values
5 Formulating lazy learning tasks
6 Lazy learning algorithms (instance-based learning and kNN learning)
7 Applying lazy learning algorithms to learning tasks (classification task)
8 Advantages and disadvantages of lazy learning algorithms

Classification Using Soft Computing

1 Introduction to Soft Computing
2 Introduction to Rough Set Theory
3 Reduct Computation Techniques
4 Classification Using Rough Set Theory
5 Using the Rosetta Tool for Reduct Computation and Data Classification
6 Major Issues in Rough Set Theory for Data Mining
7 Fuzzy Sets and Data Mining

Cluster Analysis

1 What Is Cluster Analysis?
2 Types of Data in Cluster Analysis
3 A Categorization of Major Clustering Methods

Mining Spatial, Multimedia, Text, and Web Data

1 Spatial Data Mining
2 Multimedia Data Mining
3 Text Mining
4 Mining the World Wide Web

Applications and Trends in Data Mining            

1 Data Mining Applications
2 Data Mining System Products and Research Prototypes
3 Additional Themes on Data Mining
4 Social Impacts of Data Mining
5 Data Mining Methodologies

Data Warehouse and OLAP Technology: An Overview

1 What Is a Data Warehouse?
2 A Multidimensional Data Model
3 Data Warehouse Architecture
4 Data Warehouse Implementation
5 From Data Warehousing to Data Mining


Required Software

WEKA is a software package for machine learning and data mining. It is open-source software issued under the GNU General Public License.
    Download the software from: http://www.cs.waikato.ac.nz/ml/weka/

Rosetta is a software tool for data reduction and classification based on the concepts of rough set theory.
    Download the software from: http://rosetta.lcb.uu.se/general/

       See the Software page for other resources (software and datasets).

Exams and Grading Strategy:

            First Exam:               20 Marks
            Second Exam:              20 Marks
            Final Exam:               40 Marks
            Assignments & Project:    20 Marks

  

GOOD LUCK!