Data Mining

Unearth trends and explore future outcomes

What is data mining?

Data mining is the practice of analyzing data to find hidden connections or correlations. Some of today's greatest explorers aren't charting the stars or visiting far-flung locales. Instead, they are using data mining to help map the future, sifting through the glut of available information to learn how that data can inform a smarter, more effective organization.

How? Through the use of data mining and predictive analytics to discover previously unknown patterns. Find enough of these unexpected connections, and you can see that the data is telling you a different story about the current state of your organization and what to expect from the future.

Data mining is a critical method for dealing with bigger and more complex data. A project can start by finding patterns in a sample of the data and then look for the same patterns across a much larger universe of data. A final step applies predictive modeling to forecast outcomes.

Common data mining techniques include:

  • Descriptive modeling. Descriptive models classify elements into groups based on patterns and relationships. These models are used to support the development of predictive models.
  • Predictive modeling. Predictive models analyze patterns and past performance to predict the probability of a desired outcome. Examples of applied predictive modeling include ratemaking and credit scoring.
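
As a rough illustration of how the two techniques fit together, the sketch below first groups customers by spend (descriptive) and then estimates the probability of an outcome for each group from past results (predictive). The data, the function names and the choice of a simple one-dimensional k-means grouping are all assumptions made for this example; they are not taken from any SAS product.

```python
from statistics import mean

def kmeans_1d(values, k=2, iterations=20):
    """Descriptive step: group values into k clusters (k >= 2) by nearest center."""
    ordered = sorted(values)
    # Spread the starting centers across the sorted values (deterministic start).
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers

def assign(value, centers):
    """Return the index of the segment whose center is closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

def outcome_rate_by_segment(records, centers):
    """Predictive step: estimate P(outcome) per segment from past outcomes."""
    totals = [0] * len(centers)
    hits = [0] * len(centers)
    for spend, outcome in records:
        segment = assign(spend, centers)
        totals[segment] += 1
        hits[segment] += outcome
    return [h / t if t else 0.0 for h, t in zip(hits, totals)]

# Hypothetical history: (monthly spend, 1 if the customer renewed else 0).
history = [(20, 0), (25, 0), (30, 1), (22, 0),
           (200, 1), (210, 1), (190, 1), (205, 0)]

centers = kmeans_1d([spend for spend, _ in history], k=2)
rates = outcome_rate_by_segment(history, centers)

# Score a new customer by looking up the rate of the segment they fall into.
new_customer_segment = assign(195, centers)
predicted_probability = rates[new_customer_segment]
print(centers, rates, predicted_probability)
```

The grouping itself makes no prediction. It only supplies the segments that the probability estimate is then built on, which is the supporting role described for descriptive models above.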

Data mining in action

Learn how you can use data mining to identify trends, patterns and relationships, and predictive analytics to predict future outcomes.

Data mining methods can be applied to a variety of issues in any industry. Many use cases center on analyzing the data available on customers and prospective customers to maximize sales, marketing and support opportunities, including:

  • Profiling and segmentation – predicts customer behaviors and needs by segment.
  • Cross-sell and up-sell – predicts what customers want to buy – and what products to recommend.
  • Acquisition and retention – uncovers customer preferences and patterns.
  • Campaign management – predicts the success of customer campaigns.
  • Profitability and lifetime value – discovers the drivers for future value and how to appeal to customers in the future.

Beyond customer acquisition and retention, organizations use data mining to support a variety of business objectives. The table below lists a few ways that different industries can apply data mining to improve critical business decisions.

Application | What is predicted | Resulting business decision
Credit scoring (banking) | Creditworthiness of new and existing sets of customers | How to assess and control risk within existing (or new) consumer portfolios.
Asset maintenance (utilities, manufacturing, oil and gas) | The real drivers of asset and equipment failure | How to minimize operational disruptions and maintenance costs.
Health and condition management (health insurance) | Patients at risk of chronic, treatable/preventable illness | How to reduce healthcare costs and satisfy patients.
Fraud management (government, insurance, banks) | Unknown fraud cases and future risks | How to decrease fraud losses and lower false positives.
Drug discovery (life sciences) | Compounds that have desirable effects | How to bring drugs to the marketplace quickly and effectively.

We had one customer who was spending about five and a half hours building an attribution model. With high-performance data mining, they’re now building it in about three minutes. Plus, we were able to get a factor of about two times more lift, meaning millions of dollars for the customer in terms of return on investment.

Wayne Thompson
Analytics Product Manager, SAS

Perspective: Kelley Blue Book

For years, Kelley Blue Book collected data for one specific goal: the annual publication of its blue book. Data was summarized and aggregated, algorithms applied, all on what amounted to a 12-month cycle.

As the company's kbb.com site evolved, its needs changed – and data mining played a large role in that change. "We needed to change the DNA of our company by moving from a traditional book publisher to an analytics powerhouse," said Dan Ingle, Vice President of Analytic Insights Technology at Kelley Blue Book. "Fact-based decisions have become our competitive strength. Whether or not to utilize analytics was no longer an option."

Kbb.com is now the most visited automotive website among new and used vehicle researchers, with more than 17 million monthly visits. The website automatically generates more than 27 million pricing reports each month.

According to Shawn Hushman, Vice President of Analytic Insights, SAS has played a big role in that success. "As needs arise, analytic experts contribute ideas and answer questions. They also quickly bring in SAS predictive analytics and data mining to develop, deploy and evaluate analytic models. We have slashed our time to results and enhanced our analytic insight."

Steps to successful data mining

Explorers use many different methods to uncover riches in their data. Typically, data mining is part of a broader analytic life cycle that includes data exploration, model building, model deployment and other steps. Defining the business problem is a critical step in the overall success of any data mining project. Once the business problem is defined, we recommend a five-step data mining process that works well for our customers:

  • Sample the data by creating a target data set large enough to contain the significant information.
  • Explore the data by searching for anticipated relationships, unanticipated trends and anomalies – to gain deeper understanding and ideas.
  • Modify the data by creating, selecting and transforming the variables to focus your model selection process.
  • Model the data by using analytical tools to search for a combination of data that reliably predicts a desired outcome.
  • Assess the data and models by evaluating the usefulness and reliability of the findings from the data mining process.
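
The five steps above can be sketched end to end on synthetic data. Everything in this example is an assumption made for illustration: the customer records, the visit threshold, the very simple rate-based "model" and the use of lift in the top 20% of scores as the assessment measure. It is not how SAS software implements the process.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: (site visits last month, 1 if responded to an offer).
def make_customer():
    visits = rng.randint(0, 9)
    responded = 1 if rng.random() < (0.40 if visits >= 7 else 0.05) else 0
    return visits, responded

population = [make_customer() for _ in range(5000)]

def response_rate(rows):
    return sum(r for _, r in rows) / len(rows) if rows else 0.0

# 1. Sample: draw a working set; keep the rest aside for assessment.
indices = list(range(len(population)))
rng.shuffle(indices)
sample = [population[i] for i in indices[:500]]
holdout = [population[i] for i in indices[500:]]

# 2. Explore: look at the response rate for each visit count.
by_visits = {v: response_rate([row for row in sample if row[0] == v])
             for v in range(10)}

# 3. Modify: derive a binary variable from the raw visit count.
THRESHOLD = 7  # assumed cut-off, the kind of value exploration might suggest

def is_frequent(visits):
    return visits >= THRESHOLD

# 4. Model: estimate the response rate for each value of the derived variable.
model = {flag: response_rate([row for row in sample if is_frequent(row[0]) == flag])
         for flag in (True, False)}

def score(visits):
    return model[is_frequent(visits)]

# 5. Assess: lift = response rate in the top 20% of scores / overall rate.
ranked = sorted(holdout, key=lambda row: score(row[0]), reverse=True)
top = ranked[: len(ranked) // 5]
lift = response_rate(top) / response_rate(holdout)
print(f"lift in top 20% of holdout: {lift:.2f}")
```

A lift above 1 means the highest-scored customers respond more often than customers picked at random. Assessing on records that were not used to build the model gives a fairer reading of how reliable the finding is.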

Best practices for data mining involve comparing different analytical techniques to determine which will produce the best model and therefore the best prediction. Some of these modeling techniques include decision trees, neural networks, gradient boosting, logistic regression, memory-based reasoning and rule induction. Data mining also plays a role in the growing field of machine learning. SAS Enterprise Miner is designed to handle these and many other techniques.
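
A minimal sketch of that comparison, assuming synthetic data and two deliberately simplified, hand-written stand-ins: a one-split decision tree (a "stump") and a nearest-neighbor vote as a basic form of memory-based reasoning. The data, names and accuracy measure are assumptions for illustration only and do not represent the implementations in SAS Enterprise Miner.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical data: two numeric inputs and a binary target driven by the first.
def make_row():
    x1, x2 = rng.uniform(0, 10), rng.uniform(0, 10)
    label = 1 if x1 + rng.gauss(0, 1) > 5 else 0
    return (x1, x2), label

rows = [make_row() for _ in range(400)]
train, test = rows[:300], rows[300:]

def fit_stump(data):
    """One-split tree: pick the feature and threshold with the fewest training errors."""
    best = None
    for feature in range(2):
        for x, _ in data:
            threshold = x[feature]
            for above in (0, 1):  # label predicted when the value exceeds the threshold
                errors = sum(
                    (above if xi[feature] > threshold else 1 - above) != yi
                    for xi, yi in data
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, above)
    _, feature, threshold, above = best
    return lambda x: above if x[feature] > threshold else 1 - above

def fit_knn(data, k=5):
    """Memory-based reasoning: majority vote among the k nearest training rows."""
    def predict(x):
        nearest = sorted(
            data,
            key=lambda row: (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2,
        )[:k]
        return 1 if sum(y for _, y in nearest) * 2 > k else 0
    return predict

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

# Fit every candidate on the same training rows, then compare on the same test rows.
candidates = {
    "decision stump": fit_stump(train),
    "5-nearest neighbors": fit_knn(train),
}
results = {name: accuracy(model, test) for name, model in candidates.items()}
best_name = max(results, key=results.get)
print(results, "best:", best_name)
```

Scoring all candidates on the same held-out rows keeps the comparison fair. In practice the measure would be chosen to match the business problem, for example lift or misclassification cost rather than plain accuracy.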

