SAS® Factory Miner Features

Data access & preparation

  • Access data sources registered in SAS Metadata Server, including SAS data sets, database tables and Hadoop files.
  • Interactively assign data source metadata – such as variable roles, levels or order – or use automated settings to share variable settings across projects.
  • Define segments in your data for stratified modeling.
  • Assess data issues with automated data profiling and interactive variable distribution graphs.
  • Filter your analytical input data:
    • Winsorized.
    • Standard deviations.
    • Trimmed.
    • Rare values.
  • Apply transformations to your data for better models:
    • Log.
    • Log 10.
    • Square root.
    • Inverse.
    • Square.
    • Exponential.
    • Centering.
    • Standardized.
    • Range.
    • Bucket.
    • Pseudo-quantile.
    • Optimal binning.
    • Principal component analysis.
  • Clean your data with statistical and machine learning imputation methods:
    • Mean.
    • Min.
    • Max.
    • Median.
    • Midrange.
    • Constant.
    • Count-based.
    • Distribution-based.
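
To make a few of the preparation steps above concrete, here is a minimal Python sketch of median imputation, winsorizing, trimming and standardizing. It illustrates the general techniques only and is not SAS Factory Miner code. The function names, the sample values and the fixed bounds of 1 and 10 are assumptions chosen for this example.

```python
from statistics import mean, median, pstdev

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def winsorize(values, lower, upper):
    """Clamp every value into [lower, upper]; no rows are dropped."""
    return [min(max(v, lower), upper) for v in values]

def trim(values, lower, upper):
    """Drop every value that falls outside [lower, upper]."""
    return [v for v in values if lower <= v <= upper]

def standardize(values):
    """Center on the mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical input column with one missing value and one outlier.
raw = [1.0, 2.0, None, 4.0, 100.0]

imputed = impute_median(raw)             # [1.0, 2.0, 3.0, 4.0, 100.0]
winsorized = winsorize(imputed, 1, 10)   # outlier pulled in to 10
trimmed = trim(imputed, 1, 10)           # outlier removed instead
standardized = standardize(winsorized)
```

Winsorizing keeps the row count intact while limiting the influence of extreme values, whereas trimming discards those rows. Which is preferable depends on whether the extreme values are errors or genuine observations.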

Customizable supervised learning templates

  • Interactively build custom templates that include models and the following processing steps:
    • Filtering.
    • Principal components.
    • Imputation.
    • Transformations.
    • Supervised and unsupervised variable selection.
  • Create your own model templates.
  • Edit any data preparation or model parameters and save the result as a customized template.
  • Share model templates across projects and users.

Self-service machine learning techniques

  • Build models using the following techniques:
    • Bayesian networks.
    • Decision trees.
    • Gradient boosting.
    • Neural networks.
    • Random forests.
    • Support vector machines.
    • Generalized linear models.
    • Linear regression.
    • Logistic regression.
  • Interactively view model-specific results.

Champion model identification

  • Automatically select a champion model for each segment using selectable criteria:
    • Kolmogorov-Smirnov.
    • Lift and cumulative lift.
    • Gain and cumulative gain.
    • Misclassification rate.
    • Percent captured event.
    • Average percent captured event.
    • Average square error.
  • Override system-selected models and manually identify your champion model.
  • Interactively compare and assess models within a segment and across multiple segments.
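
As an illustration of per-segment champion selection, the following Python sketch computes two of the criteria listed above (the Kolmogorov-Smirnov statistic and the misclassification rate) and picks the best model in each segment. It is a simplified sketch, not the product's implementation. The segment names, model names, scores, the 0.5 cutoff and all function names are hypothetical.

```python
def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of events and non-events."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    nonevents = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_events = sum(1 for s in events if s <= t) / len(events)
        cdf_nonevents = sum(1 for s in nonevents if s <= t) / len(nonevents)
        best = max(best, abs(cdf_events - cdf_nonevents))
    return best

def misclassification_rate(scores, labels, cutoff=0.5):
    """Share of rows where the thresholded score disagrees with the label."""
    wrong = sum(1 for s, y in zip(scores, labels) if (s >= cutoff) != (y == 1))
    return wrong / len(labels)

def pick_champions(results, criterion, higher_is_better):
    """Return the best model name per segment.

    results maps segment -> model name -> (scores, labels).
    """
    chooser = max if higher_is_better else min
    champions = {}
    for segment, models in results.items():
        values = {name: criterion(s, y) for name, (s, y) in models.items()}
        champions[segment] = chooser(values, key=values.get)
    return champions

# Hypothetical holdout scores for two models in two segments.
labels = [1, 1, 0, 0]
results = {
    "north": {
        "tree": ([0.9, 0.8, 0.2, 0.1], labels),
        "logistic": ([0.6, 0.4, 0.5, 0.3], labels),
    },
    "south": {
        "tree": ([0.6, 0.4, 0.5, 0.3], labels),
        "logistic": ([0.7, 0.9, 0.3, 0.2], labels),
    },
}

by_ks = pick_champions(results, ks_statistic, higher_is_better=True)
by_error = pick_champions(results, misclassification_rate, higher_is_better=False)
```

In this toy data both criteria agree: the tree wins in the "north" segment and the logistic model wins in "south". On real data different criteria can disagree, which is one reason to allow a manual override.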

Model exceptions identification

  • View reports that highlight model performance exceptions.
  • Easily identify and drill into details for underperforming models.
  • Modify default settings for each model template.

Model tracking & reporting

  • Generate summary reports that contain model results, significant variables and model settings.
  • Share reports via PDFs and RTFs.

Model retraining

  • Retrain existing model templates on new data sets.
  • Track model-build assessment statistics across retraining iterations.
  • Review longitudinal reports on model performance degradation.
  • Automatically retrain models in batch using REST endpoints.

Flexible model management & deployment

  • Automatically generate SAS score code for all model templates.
  • Register models to SAS Model Manager for centralized model deployment and management (requires SAS Model Manager).
  • Deploy models in database and in Hadoop using SAS Scoring Accelerator (requires SAS Scoring Accelerator).

Scalable processing

  • Train models using multithreaded procedures on SAS servers to take advantage of multicore servers.
  • Train models using asynchronous processes via SAS Grid Manager for workload balancing and scheduling (requires SAS Grid Manager).
  • Train models in memory using SAS High-Performance Data Mining on database appliances (Oracle, Teradata, Greenplum, SAP HANA) or on Hadoop (requires SAS High-Performance Data Mining).
