Data Mining Techniques: Theory and Practice
 

Role

Statistical Analyst

Duration

3 Days

Description

The course will be presented by Michael J. A. Berry or Gordon S. Linoff, co-founders of Data Miners, Inc. and co-authors of Data Mining Techniques for Marketing, Sales and Customer Relationship Management and Mastering Data Mining.

The course is based on the newly revised and expanded book, Data Mining Techniques for Marketing, Sales and Customer Relationship Management. This course is for business analysts and their managers, statisticians, and anyone who has a professional interest in data mining.

The course introduces a data mining methodology that is a superset of the SAS SEMMA methodology around which SAS Enterprise Miner is organized. It also introduces a wide range of data mining algorithms, building both theoretical knowledge and practical skills.

Prerequisites

No prior knowledge of statistical or data mining tools is required.

SAS Modules Used

This course uses Version 5.2 of SAS Enterprise Miner.

Course Topics

Introduction to Data Mining

  • what is data mining?
  • directed and undirected data mining
  • models
  • profiling and prediction

Data Mining Methodology

  • why have a methodology?
  • how data miners can inadvertently learn things that are not true
  • translating business problems into data mining problems
  • the importance of model stability
  • finding the right input variables
  • sampling to create balanced model sets
  • partitioning to create training, validation, and test sets
  • data preparation
  • model assessment
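
As an illustration of the partitioning topic above, here is a minimal Python sketch of splitting a model set into training, validation, and test sets. The function name `partition`, the 60/20/20 split, and the fixed seed are assumptions chosen for the example; they are not taken from the course material or from SAS Enterprise Miner.

```python
import random


def partition(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle records and split them into training, validation, and test sets.

    The default 60/20/20 split is an illustrative assumption.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * fractions[0])
    n_valid = int(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    # Whatever remains after the first two slices becomes the test set.
    test = shuffled[n_train + n_valid:]
    return train, valid, test


train, valid, test = partition(range(100))
print(len(train), len(valid), len(test))
```

Every record lands in exactly one of the three sets, so the test set stays unseen while the model is built and tuned.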

Data Exploration

  • developing intuition about data
  • data structure
  • data types
  • data values
  • exploring distributions
  • summary statistics
  • histograms
  • using SAS Enterprise Miner for data exploration

Statistics and Regression

  • the null hypothesis
  • statistical significance
  • confidence bounds
  • variance and standard deviation
  • standardized values
  • correlation
  • linear regression
  • logistic regression
  • using SAS Enterprise Miner to build regression models
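
To make the standardized values and correlation topics above concrete, the following Python sketch computes z-scores and the Pearson correlation using only the standard library. The function names and sample data are hypothetical, and the population standard deviation is an assumed choice; the course itself works in SAS Enterprise Miner.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: each value's distance from the mean in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mu) / sigma for v in values]


def correlation(xs, ys):
    """Pearson correlation as the mean product of paired z-scores."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    zx, zy = standardize(xs), standardize(ys)
    return mean([a * b for a, b in zip(zx, zy)])


tenure_months = [1, 2, 3, 4]       # made-up sample data
monthly_spend = [2, 4, 6, 8]       # made-up sample data
print(standardize(tenure_months))
print(correlation(tenure_months, monthly_spend))
```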

Decision Trees

  • decision trees as data exploration and classification tools
  • decision trees for modeling and scoring
  • decision trees for variable selection
  • alternate representations of decision trees
  • algorithms used to build decision trees
  • splitting criteria
  • recognizing instability and overfitting in decision tree models
  • capturing interactions between variables
  • using SAS Enterprise Miner to build decision trees

Neural Networks

  • origins of neural networks
  • neural networks compared with regression
  • the algorithms used to train neural networks
  • data preparation requirements for neural networks
  • picking appropriate inputs for neural networks
  • creating neural network models using SAS Enterprise Miner

Memory-Based Reasoning

  • similarity and distance
  • distance metrics appropriate for different kinds of data
  • the role of the training set in MBR
  • combining the votes of several neighbors
  • other K-nearest neighbor techniques
  • collaborative filtering
  • using the SAS Enterprise Miner MBR node
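
The idea of combining the votes of several neighbors can be sketched in a few lines of Python. This is an illustrative K-nearest-neighbor classifier using Euclidean distance and a simple majority vote; the function name, the choice of distance metric, and the sample records are assumptions, and it does not describe how the SAS Enterprise Miner MBR node is implemented.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training records.

    `training` is a list of (feature_tuple, label) pairs.
    """
    if k < 1 or k > len(training):
        raise ValueError("k must be between 1 and the training set size")
    # Sort training records by Euclidean distance to the query point.
    neighbors = sorted(training, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Made-up training set with two numeric inputs and a class label.
training = [
    ((0.0, 0.0), "stay"),
    ((0.0, 1.0), "stay"),
    ((5.0, 5.0), "leave"),
    ((6.0, 5.0), "leave"),
    ((5.0, 6.0), "leave"),
]
print(knn_classify(training, (0.2, 0.3), k=3))
```

In practice the inputs would usually be standardized first, so that no single variable dominates the distance.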

Clustering

  • more on similarity and distance
  • the K-means algorithm
  • divisive clustering
  • agglomerative clustering
  • data preparation for clustering
  • interpreting clusters
  • finding clusters with SAS Enterprise Miner

Survival Analysis

  • origins of survival analysis
  • how business data is different from clinical data
  • hazards and hazard charts
  • retention curves and survival curves
  • calculating survival from retention
  • calculating hazards empirically
  • parametric hazard models
  • censoring
  • competing risks
  • survival-based forecasting
  • using SAS code in SAS Enterprise Miner to create survival curves
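
The relationship between empirical hazards and survival curves listed above can be illustrated with a short Python sketch. It assumes discrete time periods, treats the hazard in a period as the share of customers at risk who stop during it, and reports survival at the end of each period; the function names and counts are made up for the example and are not the SAS code used in the course.

```python
def empirical_hazards(at_risk, stopped):
    """Hazard per period: customers who stopped divided by customers at risk."""
    if len(at_risk) != len(stopped):
        raise ValueError("inputs must have the same length")
    return [s / n if n else 0.0 for n, s in zip(at_risk, stopped)]


def survival_from_hazards(hazards):
    """Survival at the end of each period: running product of (1 - hazard)."""
    survival = []
    remaining = 1.0
    for h in hazards:
        remaining *= 1.0 - h
        survival.append(remaining)
    return survival


# Made-up counts for three periods of customer tenure.
at_risk = [100, 80, 60]
stopped = [20, 20, 15]
hazards = empirical_hazards(at_risk, stopped)
print(hazards)
print(survival_from_hazards(hazards))
```

Censoring fits naturally into this scheme: a censored customer is counted in the at-risk population only for the periods in which they were actually observed.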

Miscellaneous Techniques

  • link analysis
  • genetic algorithms
  • association rules
  • using SAS Enterprise Miner to discover associations in retail data
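
As a small illustration of association rules, this Python sketch computes the three measures commonly used to evaluate a rule such as "bread implies milk": support, confidence, and lift. The function name and the basket data are hypothetical, and the sketch is not based on the Enterprise Miner association node.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= set(t))
    n_c = sum(1 for t in transactions if consequent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    if n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_both / n            # share of baskets containing both sides
    confidence = n_both / n_a       # share of antecedent baskets with the consequent
    lift = confidence / (n_c / n)   # confidence relative to the consequent's base rate
    return support, confidence, lift


# Made-up retail baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk", "butter"},
]
print(rule_metrics(baskets, {"bread"}, {"milk"}))
```

A lift below 1, as in this toy data, means the rule does worse than simply guessing the consequent at its base rate.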

Putting Data Mining Techniques to Work

  • formulating the business problem as a data mining problem
  • finding the tool that fits the problem

Book Your Place Today

0845 402 9902
