Unearthing untold trends and exploring future outcomes
Where do today’s explorers roam when there is nothing left to explore on land? When even large stretches of the cosmos are charted and mapped? Where can you still discover the unknown? Encounter riches that no one has seen before? And make discoveries that can improve your organization?
The answer? Your data. And you can unearth what's hidden there through data mining.
What is data mining?
Some of today’s greatest explorers are data miners. They use data mining and predictive analytics to discover previously unknown patterns and to predict future outcomes. The most significant and unexpected results might change the very course of an organization.
How are these feats accomplished? Data mining uncovers patterns in a sample set of data and then looks for the same patterns across a much larger universe of data. A final step applies predictive modeling to forecast outcomes.
Common data mining techniques include:
- Descriptive modeling. Descriptive models classify elements into groups based on patterns and relationships. These models are used to support the development of predictive models.
- Predictive modeling. Predictive models analyze patterns and past performance to predict the probability of a desired outcome. Examples of applied predictive modeling include ratemaking and credit scoring.
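The two techniques can be sketched together in a few lines of Python. This is a minimal illustration only, not how any SAS product works: the records, the `two_means` helper and the `predict_default_probability` function are all hypothetical. The descriptive step groups customers into two spending segments; the predictive step turns each segment's observed default rate into a probability for new customers, in the spirit of credit scoring.

```python
from statistics import mean

# Hypothetical records: (monthly_spend, defaulted) where defaulted is 1 or 0.
records = [(20, 1), (25, 1), (30, 0), (35, 1),
           (200, 0), (210, 0), (220, 1), (230, 0)]


def nearest(value, centers):
    """Return the index (0 or 1) of the center closest to value."""
    return 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1


def two_means(values, iterations=10):
    """Descriptive step: group 1-D values around two centers (k-means, k=2)."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


centers = two_means([spend for spend, _ in records])

# Predictive step: the observed default rate in each segment becomes the
# predicted probability for any new customer who falls into that segment.
default_rate = [
    mean(d for spend, d in records if nearest(spend, centers) == segment)
    for segment in (0, 1)
]


def predict_default_probability(spend):
    """Look up the default rate of the segment this spend value belongs to."""
    return default_rate[nearest(spend, centers)]


print(centers)
print(predict_default_probability(28), predict_default_probability(205))
```

The descriptive model here supports the predictive one, as described above: the segments are found first, and the prediction is made per segment.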
Data mining examples
Data mining identifies trends, patterns and relationships, while predictive analytics uses them to predict future outcomes.
Everyone has data to explore. And every industry can benefit from data mining. Data mining methods can be applied to a variety of issues in any industry, from segmentation and targeting to fraud detection and credit risk scoring.
Common applications for data mining across industries include:
- Profiling and segmentation – predicts customer behaviors and needs by segment.
- Lifetime value – predicts drivers for future value, including margin and retention.
- Credit scoring – predicts the creditworthiness of new and existing customers.
- Market basket analysis – predicts what products are likely to be purchased together.
- Asset maintenance – predicts the real causes of asset or equipment failure.
- Health management – predicts which patients are at risk of chronic, preventable illnesses.
- Fraud management – predicts unknown fraud cases and future risks.
- Drug discovery – predicts drug compounds that have desirable effects.
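Of the applications listed above, market basket analysis is the simplest to sketch. The example below assumes a handful of made-up shopping baskets and counts how often pairs of products appear together. The `support` and `confidence` helpers are hypothetical names for two standard association measures: the share of baskets containing both items, and the share of baskets containing the first item that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each basket is the set of products in one purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

# How many baskets contain each item, and each (alphabetically ordered) pair.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def support(a, b):
    """Share of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(a, b):
    """Share of baskets containing a that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]


print(pair_counts.most_common(3))
print(support("bread", "milk"), confidence("bread", "milk"))
```

Pairs with high support and confidence are candidates for "likely to be purchased together"; real data sets would need far more baskets before such a pattern could be trusted.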
"We had one customer who was spending about five and a half hours building an attribution model. With high-performance data mining, they’re now building it in about three minutes. Plus, we were able to get a factor of about two times more lift, meaning millions of dollars for the customer in terms of return on investment."
Analytics Product Manager, SAS
Perspective: Kelley Blue Book
For years, Kelley Blue Book collected data for one specific goal: the annual publication of its blue book. Data was summarized and aggregated, algorithms applied, all on what amounted to a 12-month cycle.
As the company's kbb.com site evolved, its needs changed – and data mining played a large role in that change. "We needed to change the DNA of our company by moving from a traditional book publisher to an analytics powerhouse," said Dan Ingle, Vice President of Analytic Insights Technology at Kelley Blue Book. "Fact-based decisions have become our competitive strength. Whether or not to utilize analytics was no longer an option."
Kbb.com is now the most visited automotive website among new and used vehicle researchers, with more than 17 million monthly visits. The website automatically generates more than 27 million pricing reports each month.
According to Shawn Hushman, Vice President of Analytic Insights, SAS has played a big role in that success. "As needs arise, analytic experts contribute ideas and answer questions. They also quickly bring in SAS predictive analytics and data mining to develop, deploy and evaluate analytic models. We have slashed our time to results and enhanced our analytic insight."
Steps to successful data mining
Explorers use many different methods to uncover riches in their data. Typically, data mining is part of a broader analytic life cycle that includes data exploration, model building, model deployment and other steps. Defining the business problem is a critical step in the overall success of any data mining project. Once the business problem is defined, we recommend a five-step data mining process that works well for our customers:
- Sample the data by creating a target data set large enough to contain the significant information.
- Explore the data by searching for anticipated relationships, unanticipated trends and anomalies – to gain deeper understanding and ideas.
- Modify the data by creating, selecting and transforming the variables to focus your model selection process.
- Model the data by using analytical tools to search for a combination of data that reliably predicts a desired outcome.
- Assess the data and models by evaluating the usefulness and reliability of the findings from the data mining process.
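The five steps can be walked through on a toy problem. The sketch below is an assumption-laden illustration, not the SAS implementation: the equipment data is synthetic, the field names (`hours`, `repairs`, `failed`, `wear`) are invented, and the "model" is just a single threshold chosen on training data.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Synthetic population (an assumption for illustration): equipment records
# where heavy use and many repairs make failure more likely.
population = []
for _ in range(1000):
    hours = rng.uniform(0, 1000)
    repairs = rng.randint(0, 5)
    failed = 1 if hours + 100 * repairs + rng.gauss(0, 100) > 800 else 0
    population.append({"hours": hours, "repairs": repairs, "failed": failed})

# 1. Sample: draw a target data set and hold part of it back for assessment.
sample = rng.sample(population, 300)
train, holdout = sample[:200], sample[200:]

# 2. Explore: compare failed and healthy equipment on one variable.
avg_hours_failed = mean(r["hours"] for r in train if r["failed"])
avg_hours_ok = mean(r["hours"] for r in train if not r["failed"])

# 3. Modify: derive a single "wear" variable from the raw inputs.
for r in sample:
    r["wear"] = r["hours"] + 100 * r["repairs"]


def accuracy(rows, threshold):
    """Share of rows where 'wear above threshold' matches the actual outcome."""
    hits = sum((r["wear"] > threshold) == bool(r["failed"]) for r in rows)
    return hits / len(rows)


# 4. Model: pick the wear threshold that best separates outcomes in training.
candidates = range(0, 1501, 50)
best_threshold = max(candidates, key=lambda t: accuracy(train, t))

# 5. Assess: check the chosen threshold on data the model has not seen.
holdout_accuracy = accuracy(holdout, best_threshold)

print(avg_hours_failed, avg_hours_ok)
print(best_threshold, holdout_accuracy)
```

Holding back part of the sample matters: a threshold that only looks good on the rows used to choose it says little about how reliable the findings are.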
Best practices for data mining involve comparing different analytical techniques to determine which will produce the best model and therefore the best prediction. Some of these modeling techniques include decision trees, neural networks, gradient boosting, logistic regression, memory-based reasoning and rule induction. Data mining also plays a role in the growing field of machine learning. SAS Enterprise Miner is designed to handle these and many other techniques.
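One common way to compare candidate models is lift: how much more often the desired outcome occurs among the records a model ranks highest than it does overall. The sketch below assumes two hypothetical models, `model_a` and `model_b`, that have already scored the same ten records. The scores and outcomes are invented, and this is not the SAS Enterprise Miner API.

```python
# Actual outcomes for ten hypothetical records (1 = responded, 0 = did not).
outcomes = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

# Scores each hypothetical model assigned to the same ten records.
scores = {
    "model_a": [0.9, 0.2, 0.8, 0.3, 0.1, 0.4, 0.7, 0.5, 0.6, 0.15],
    "model_b": [0.9, 0.8, 0.3, 0.7, 0.1, 0.2, 0.6, 0.5, 0.4, 0.15],
}


def lift(model_scores, actual, top_n):
    """Response rate among the top_n highest-scored records / overall rate."""
    ranked = sorted(zip(model_scores, actual), reverse=True)
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    overall_rate = sum(actual) / len(actual)
    return top_rate / overall_rate


lifts = {name: lift(s, outcomes, top_n=3) for name, s in scores.items()}
best = max(lifts, key=lifts.get)

print(lifts)
print(best)
```

A lift of 1.0 means the model does no better than picking records at random; the model with the higher lift on held-out data is the stronger candidate.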