
Decision Trees with SAS® Enterprise Miner™

Predicting future outcomes and identifying factors that can produce a desired effect are often the main goals of data analysis and data mining. Decision trees are one of the most popular methods of predictive modeling for data mining purposes because they provide interpretable rules and logic statements that enable more intelligent decision making.

A decision tree partitions data into smaller segments, called terminal nodes or leaves, that are homogeneous with respect to a target variable. Partitions are defined in terms of other variables, called input variables, thereby defining a predictive relationship between the inputs and the target. Partitioning continues until user-defined stopping criteria prevent the subsets from being split any further. By creating homogeneous groups, analysts can predict with greater certainty how individuals in each group will behave. For example, in database marketing, decision trees can be used to segment groups of customers and develop customer profiles to help marketers produce targeted promotions that achieve higher response rates.
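The partitioning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Enterprise Miner algorithm: it uses binary splits on numeric inputs, Gini impurity as the homogeneity measure, and two hypothetical stopping criteria (`max_depth` and `min_leaf`). The customer records and all function names are invented for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, target, inputs, min_leaf):
    """Return (impurity_after, input_name, threshold) for the best binary split, or None."""
    best = None
    for name in inputs:
        for threshold in sorted({row[name] for row in rows}):
            left = [row[target] for row in rows if row[name] <= threshold]
            right = [row[target] for row in rows if row[name] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # stopping criterion: both branches need min_leaf cases
            after = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or after < best[0]:
                best = (after, name, threshold)
    return best

def grow(rows, target, inputs, max_depth=3, min_leaf=2, depth=0):
    """Recursively partition rows until a stopping criterion applies."""
    labels = [row[target] for row in rows]
    split = None
    if depth < max_depth and gini(labels) > 0.0:
        split = best_split(rows, target, inputs, min_leaf)
    if split is None or split[0] >= gini(labels):
        # No admissible split improves homogeneity: this node becomes a leaf.
        majority = Counter(labels).most_common(1)[0][0]
        return {"leaf": True, "n": len(rows), "prediction": majority}
    _, name, threshold = split
    left = [row for row in rows if row[name] <= threshold]
    right = [row for row in rows if row[name] > threshold]
    return {
        "leaf": False, "input": name, "threshold": threshold,
        "left": grow(left, target, inputs, max_depth, min_leaf, depth + 1),
        "right": grow(right, target, inputs, max_depth, min_leaf, depth + 1),
    }

# Invented marketing data: did the customer respond to a promotion?
customers = [
    {"age": 22, "income": 20, "responded": "no"},
    {"age": 25, "income": 45, "responded": "no"},
    {"age": 30, "income": 30, "responded": "no"},
    {"age": 41, "income": 60, "responded": "yes"},
    {"age": 47, "income": 35, "responded": "yes"},
    {"age": 52, "income": 80, "responded": "yes"},
]
tree = grow(customers, "responded", ["age", "income"])
print(tree)
```

On this toy data the sketch splits once on `age` and stops, because both resulting leaves are already homogeneous with respect to the target.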

Decision trees are just one of the advanced analysis models included in SAS Enterprise Miner, and they are often used during preliminary predictive modeling. Other Enterprise Miner modeling techniques, such as neural networks, memory-based reasoning and regression, should be used in conjunction with decision trees. For instance, you may want to use decision trees for variable selection and then mine the reduced-dimension data set with a more resource-intensive neural network. It is also common to use decision trees to segment the data and then use another predictive modeling method to predict the response in each segment.

In addition to the advantages provided by the integrated environment of Enterprise Miner, SAS decision trees provide many unique features not found in other tree implementations. One differentiating feature is that the SAS Enterprise Miner Tree node lets users mix the algorithmic strategies advocated by Kass (1980) for CHAID and by Breiman, Friedman, Olshen and Stone (1984) to match the specific needs of different situations.

SAS extends the Kass strategy with multi-way splits on interval variables and p-value adjustments for the number of variables and size of the tree. SAS extends the strategy of Breiman et al. with more ways to evaluate trees. For a categorical target value of special interest, the software will compute the proportion of cases with that value among a specified fraction of cases predicted most likely to contain that value. Using this measure in retrospective pruning results in trees of appropriate size for isolating rare events that are hard to predict accurately.
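As a rough illustration of the evaluation measure described above, the following Python sketch computes the proportion of cases with a target value of interest among a chosen fraction of the cases scored most likely to have that value. The function name, the scores and the outcomes are all invented for the example, and the sketch makes no claim about how Enterprise Miner computes the measure internally.

```python
def top_fraction_event_rate(scores, outcomes, event, fraction):
    """Proportion of `event` among the `fraction` of cases with the highest scores."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Rank cases from most to least likely to be the event of interest.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(fraction * len(ranked)))
    top = ranked[:k]
    return sum(1 for _, outcome in top if outcome == event) / k

# Invented example: predicted probability of "fraud" for ten cases.
scores = [0.90, 0.80, 0.40, 0.35, 0.30, 0.20, 0.15, 0.10, 0.05, 0.02]
outcomes = ["fraud", "ok", "fraud", "ok", "ok", "ok", "ok", "ok", "ok", "ok"]

rate = top_fraction_event_rate(scores, outcomes, "fraud", fraction=0.2)
baseline = outcomes.count("fraud") / len(outcomes)
print(rate, baseline)
```

Here the rare event makes up 20 percent of all cases but half of the top-scored 20 percent, which is the kind of concentration this measure rewards when a tree is pruned.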

Features of the Enterprise Miner Tree node include:

  • Nominal, ordinal, and interval input (independent) and target (dependent) variables.
  • Variance reduction and F-test splitting criteria for interval targets.
  • Gini or entropy reduction (Information Gain) and CHAID criteria for categorical targets.
  • Binary splits, n-ary splits, and software-chosen number of branches (child nodes).
  • Missing values optionally regarded as a special, acceptable value for splitting rules.
  • Surrogate rules optionally found for use with missing values and variable importance.
  • Cost-complexity pruning and reduced-error pruning with validation (hold-out) data.
  • Prior probabilities optionally used in the split search or during pruning.
  • Misclassification cost matrix extends to include new decision alternatives.
  • Interactive training mode for defining splits and pruning nodes explicitly.
  • Variable importance computed separately with training and validation data.
  • Generation of SAS DATA step code with an indicator variable for each terminal node.
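To make three of the splitting criteria listed above concrete, here is a minimal Python sketch of Gini reduction and entropy reduction (information gain) for a categorical target, and variance reduction for an interval target. It accepts any number of branches, so it covers binary and n-ary splits alike. The function names and data are assumptions made for the example; the F-test and CHAID criteria are not shown.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def variance(values):
    """Population variance of a list of interval-scaled target values."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def reduction(impurity, parent, branches):
    """Impurity of the parent minus the size-weighted impurity of its branches."""
    n = len(parent)
    return impurity(parent) - sum(len(b) / n * impurity(b) for b in branches)

# Categorical target: a candidate binary split of eight cases.
parent = ["yes"] * 4 + ["no"] * 4
branches = [["yes", "yes", "yes", "no"], ["yes", "no", "no", "no"]]
gini_gain = reduction(gini, parent, branches)
info_gain = reduction(entropy, parent, branches)

# Interval target: a candidate split that separates low and high values.
values = [1, 2, 3, 4, 10, 11, 12, 13]
variance_gain = reduction(variance, values, [values[:4], values[4:]])

print(gini_gain, info_gain, variance_gain)
```

A split search would evaluate a reduction like this for every candidate split and keep the largest; which impurity function to plug in is the choice the criteria options expose.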

Features may be selected independently. For example, the software may use p-values to find multi-way splits and then retrospectively form a sequence of subtrees and prune with validation data, thereby mixing the strategies of Kass and Breiman et al.
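The final step of that example, choosing one subtree from a sequence using validation data, can be sketched as follows. The sequence of subtrees is represented only by leaf counts and validation error rates, and the numbers are made up for illustration; the tie-breaking rule (prefer the smaller tree) is an assumption of this sketch rather than documented product behavior.

```python
def select_subtree(sequence):
    """Pick the (n_leaves, validation_error) pair with the lowest validation error.

    Ties are broken in favor of the subtree with fewer leaves.
    """
    return min(sequence, key=lambda entry: (entry[1], entry[0]))

# Hypothetical subtree sequence: (number of leaves, validation misclassification rate).
subtree_sequence = [(1, 0.30), (2, 0.22), (4, 0.18), (7, 0.18), (12, 0.21)]

chosen = select_subtree(subtree_sequence)
print(chosen)
```

In this made-up sequence the error stops improving after four leaves and worsens for the largest tree, so the four-leaf subtree is kept.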

Desktop Applications Help Share Your Results

Once decision trees have been created, it is important to share the results with others in your organization. SAS Enterprise Miner offers two options for viewing decision trees, depending on which version of the software you use.

The Enterprise Miner Tree Desktop Application 9.0 and the Enterprise Miner Tree Results Viewer 4.1.2 are Microsoft Windows applications for interactively viewing decision trees created with SAS Enterprise Miner. Business users therefore do not need Enterprise Miner installed on their PCs to view Enterprise Miner decision tree results.

Each application displays several tables and graphs in separate windows that may be independently arranged. Clicking on a variable, node or subtree in one window automatically updates and selects the corresponding items in the other windows. All views can be copied and printed. The tree can be printed on a single page or across multiple pages at the magnification displayed in the view.

The Tree Desktop Application 9.0 replaces the Tree Results Viewer 4.1.2. New Tree Desktop Application features include:

  • Cumulative gain and lift charts.
  • A new bar chart displaying bars with widths proportional to the leaf sizes.
  • Decision alternatives as optional columns in the classification matrix.
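For readers unfamiliar with the cumulative gain and lift charts mentioned above, the following Python sketch computes the numbers such charts typically plot. It is a generic illustration, not the Desktop Application's implementation: cases are ranked by score, and at each depth the cumulative gain is the share of all events captured so far, while the lift is the event rate at that depth divided by the overall event rate. The function name, the bin count and the data are assumptions.

```python
def gain_and_lift(scores, outcomes, event, n_bins=5):
    """Return a list of (depth_fraction, cumulative_gain, cumulative_lift) tuples."""
    # Order outcomes from the highest-scored case to the lowest.
    ranked = [o for _, o in sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)]
    n = len(ranked)
    total_events = sum(1 for o in ranked if o == event)
    if total_events == 0 or n < n_bins:
        raise ValueError("need at least one event and at least n_bins cases")
    table = []
    for b in range(1, n_bins + 1):
        depth = b * n // n_bins  # number of top-ranked cases included so far
        captured = sum(1 for o in ranked[:depth] if o == event)
        gain = captured / total_events
        lift = (captured / depth) / (total_events / n)
        table.append((depth / n, gain, lift))
    return table

# Invented scores and outcomes for ten cases (1 = event, 0 = non-event).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

chart = gain_and_lift(scores, outcomes, event=1)
for depth, gain, lift in chart:
    print(f"depth={depth:.0%} gain={gain:.2f} lift={lift:.2f}")
```

At full depth the cumulative gain always reaches 1 and the lift falls to 1, since the whole data set captures every event at exactly the overall rate.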

As with the Tree Results Viewer, the Desktop Application should be launched with a tree data set created with Enterprise Miner. If launched without a tree data set, the Desktop Application enters a non-working SAS mode that attempts to connect to the SAS System. SAS mode is still being developed and is not intended for production use until SAS Release 9.1. Only the Results Viewer mode is supported.

Copyright © 2008 SAS Institute Inc. All Rights Reserved