SAS® Factory Miner Features
Data access & preparation
- Access data sources registered in SAS Metadata Server, including SAS data sets, database tables and Hadoop files.
- Interactively assign data source metadata – such as variable roles, levels or order – or use automated settings to share variable settings across projects.
- Define segments in your data for stratified modeling.
- Assess data issues with automated data profiling and interactive variable distribution graphs.
- Filter your analytical input data:
  - Winsorized.
  - Standard deviations.
  - Trimmed.
  - Rare values.
- Apply transformations to your data for better models:
  - Log.
  - Log 10.
  - Square root.
  - Inverse.
  - Square.
  - Exponential.
  - Centering.
  - Standardized.
  - Range.
  - Bucket.
  - Pseudo-quantile.
  - Optimal binning.
  - Principal component analysis.
- Clean your data with statistical and machine learning imputation methods:
  - Mean.
  - Min.
  - Max.
  - Median.
  - Midrange.
  - Constant.
  - Count-based.
  - Distribution-based.
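As a rough illustration of what a few of the preparation options above do to a single numeric variable, here is a minimal Python sketch. It is not SAS Factory Miner code; the function names (`impute`, `winsorize`, `trim`, `standardize`) and the sample values are hypothetical and chosen only for this example.

```python
from statistics import mean, median, pstdev


def impute(values, method="mean"):
    """Replace missing entries (None) with a statistic of the observed values."""
    observed = [v for v in values if v is not None]
    methods = {
        "mean": mean,
        "median": median,
        "min": min,
        "max": max,
        "midrange": lambda xs: (min(xs) + max(xs)) / 2,
    }
    fill = methods[method](observed)
    return [fill if v is None else v for v in values]


def winsorize(values, lower, upper):
    """Winsorized filter: clamp extreme values to the limits instead of dropping them."""
    return [min(max(v, lower), upper) for v in values]


def trim(values, lower, upper):
    """Trimmed filter: drop values that fall outside the limits."""
    return [v for v in values if lower <= v <= upper]


def standardize(values):
    """Standardized transformation: subtract the mean, divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical input: one missing value and one extreme value.
incomes = [20.0, None, 30.0, 40.0, 500.0]
filled = impute(incomes, method="median")
print(winsorize(filled, lower=20.0, upper=100.0))
print(trim(filled, lower=20.0, upper=100.0))
```

The difference between the two filters is that winsorizing keeps every row and only caps the outlier, while trimming removes the row entirely.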
Customizable supervised learning templates
- Interactively build custom templates that include models and the following processing steps:
  - Filtering.
  - Principal components.
  - Imputation.
  - Transformations.
  - Supervised and unsupervised variable selection.
- Create your own model templates.
- Edit any data preparation or model parameters and save the result as a customized template.
- Share model templates across projects and users.
Self-service machine learning techniques
- Build models using the following techniques:
  - Bayesian networks.
  - Decision trees.
  - Gradient boosting.
  - Neural networks.
  - Random forests.
  - Support vector machines.
  - Generalized linear models.
  - Linear regression.
  - Logistic regression.
- Interactively view model-specific results.
Champion model identification
- Champion models are automatically selected for each segment based on a criterion you choose:
  - Kolmogorov-Smirnov.
  - Lift and cumulative lift.
  - Gain and cumulative gain.
  - Misclassification rate.
  - Percent captured event.
  - Average percent captured event.
  - Average square error.
- Override system-selected models and manually identify your champion model.
- Interactively compare and assess models within a segment and across multiple segments.
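The idea of picking one champion per segment by a chosen criterion can be sketched in a few lines of Python. This is an illustrative sketch only, not the product's selection logic; `ks_statistic`, `select_champions`, the segment names, and the metric values are all hypothetical. It assumes a binary target coded 0/1 and that a higher Kolmogorov-Smirnov value or a lower misclassification rate indicates the better model.

```python
def ks_statistic(scores, labels):
    """Kolmogorov-Smirnov statistic: the largest gap between the cumulative
    score distributions of events (label 1) and non-events (label 0)."""
    pairs = sorted(zip(scores, labels))
    n_event = sum(labels)
    n_non = len(labels) - n_event
    cum_event = cum_non = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume all observations tied at the same score before comparing.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_event += 1
            else:
                cum_non += 1
            j += 1
        best = max(best, abs(cum_event / n_event - cum_non / n_non))
        i = j
    return best


def select_champions(metrics, higher_is_better):
    """metrics maps segment -> {model name -> metric value};
    returns segment -> name of the best model for that segment."""
    pick = max if higher_is_better else min
    return {seg: pick(models, key=models.get) for seg, models in metrics.items()}


# Hypothetical KS values for two models in two segments.
ks_by_segment = {
    "north": {"decision_tree": 0.41, "logistic": 0.47},
    "south": {"decision_tree": 0.52, "logistic": 0.39},
}
print(select_champions(ks_by_segment, higher_is_better=True))
```

Because the criterion is passed in as data, switching from Kolmogorov-Smirnov to misclassification rate only means supplying a different metrics table and setting `higher_is_better=False`. A manual override would simply replace one entry in the returned mapping.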
Model exceptions identification
- View reports that highlight model performance exceptions.
- Easily identify and drill into details for underperforming models.
- Modify default settings for each model template.
Model tracking & reporting
- Generate summary reports that contain model results, significant variables and model settings.
- Share reports via PDFs and RTFs.
Model retraining
- Retrain existing model templates on new data sets.
- Track model-build assessment statistics across retraining iterations.
- View longitudinal reports on model performance degradation.
- Automatically retrain models in batch using REST endpoints.
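To make the tracking of assessment statistics across retraining iterations concrete, the following Python sketch compares each iteration with the first one and flags a drop beyond a tolerance. It is a simplified assumption about how such a report could be built, not the product's report format; `degradation_report`, the `tolerance` parameter, and the sample history are hypothetical.

```python
def degradation_report(history, tolerance=0.05):
    """history: chronological list of (iteration label, KS value).
    Flags iterations whose KS fell more than `tolerance` below the first iteration."""
    baseline = history[0][1]
    report = []
    for label, value in history:
        drop = baseline - value
        report.append({
            "iteration": label,
            "ks": value,
            "drop_from_baseline": drop,
            "degraded": drop > tolerance,
        })
    return report


# Hypothetical KS values recorded after three retraining runs.
history = [("run-1", 0.50), ("run-2", 0.48), ("run-3", 0.44)]
for row in degradation_report(history):
    print(row["iteration"], row["ks"], "DEGRADED" if row["degraded"] else "ok")
```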
Flexible model management & deployment
- Automatically generate SAS score code for all model templates.
- Register models to SAS Model Manager for centralized model deployment and management (requires SAS Model Manager).
- Deploy models in database and in Hadoop using SAS Scoring Accelerator (requires SAS Scoring Accelerator).
Scalable processing
- Train models using multithreaded procedures on SAS servers to take advantage of multicore servers.
- Train models using asynchronous processes via SAS Grid Manager for workload balancing and scheduling (requires SAS Grid Manager).
- Train models in memory using SAS High-Performance Data Mining on database appliances (Oracle, Teradata, Greenplum, SAP HANA) or on Hadoop (requires SAS High-Performance Data Mining).