Answering your basic analytics questions
Many companies will tell you they use analytics all the time. And they do. Analytics enables them to ask questions of their data in order to tell investors, stakeholders and customers what happened. For example, they know that profits took a hit because raw material costs shot up, or that revenues increased because a new line of clothes sold particularly well. But that's where many companies stop.
While there is value in knowing what happened, it's often necessary to dig deeper to make competitive decisions. In addition to asking what happened, it is also important to ask questions like:
- Why is this happening?
- What if these trends continue?
- What actions are best to take based on the trends?
- What will happen next?
- What is the best that can happen?
To answer these questions, you need to apply analytical methods from the areas of statistics, forecasting, data mining and operations research. Since it's not uncommon for these terms to be used interchangeably and inaccurately, I often get asked to explain the differences. Keep reading for my answers.
What's the difference between data mining, predictive modeling and predictive analytics?
Data mining was the buzzword about 10 years ago, but the terms predictive modeling and predictive analytics have become more popular recently. Are they all the same thing? Not exactly, but they are all related.
Data mining has been defined in a lot of ways, but at the heart of all of those definitions is a process for analyzing data that typically includes the following steps:
- Formulate the problem.
- Accumulate data.
- Transform and select data.
- Train models.
- Evaluate models.
- Deploy models.
- Monitor results.
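To make those steps concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the campaign-response problem, the synthetic customer records, the single-threshold "model" and the helper names (accuracy, will_respond, needs_retraining) are all hypothetical. A real project would use far richer data and models, but the sequence of steps is the same.

```python
import random

# 1. Formulate the problem: predict whether a customer responds to a campaign.

# 2. Accumulate data: synthetic (made-up) records of (past_purchases, responded).
random.seed(0)
records = []
for _ in range(200):
    purchases = random.randint(0, 10)
    # Assumption baked into the toy data: heavier buyers respond more often.
    responded = 1 if random.random() < purchases / 10 else 0
    records.append((purchases, responded))

# 3. Transform and select data: split into training and holdout sets.
train, holdout = records[:150], records[150:]


def accuracy(threshold, rows):
    """Share of rows where the rule 'purchases >= threshold' matches the outcome."""
    hits = sum((p >= threshold) == bool(r) for p, r in rows)
    return hits / len(rows)


# 4. Train models: pick the purchase threshold with the best training accuracy.
best_threshold = max(range(0, 12), key=lambda t: accuracy(t, train))

# 5. Evaluate models: measure accuracy on data the model has not seen.
holdout_accuracy = accuracy(best_threshold, holdout)


# 6. Deploy models: wrap the learned rule as a scoring function.
def will_respond(purchases):
    return purchases >= best_threshold


# 7. Monitor results: flag the model if accuracy on new data drops
#    well below the holdout baseline.
def needs_retraining(new_rows, tolerance=0.10):
    return accuracy(best_threshold, new_rows) < holdout_accuracy - tolerance


print("threshold:", best_threshold, "holdout accuracy:", holdout_accuracy)
```

The point of the sketch is the shape of the process, not the model itself: the holdout set keeps the evaluation honest, and the monitoring check is what tells you when a deployed model has gone stale.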
Predictive analytics is an umbrella term that encompasses both data mining and predictive modeling – as well as a number of other analytical techniques. I define predictive analytics as a collection of statistical and data mining techniques that analyze data to make predictions about future events. Predictive modeling is one such technique that answers questions such as:
- Who's likely to respond to a campaign?
- How much do first-time purchasers usually spend?
- Which customers are likely to default?
Predictive analytics is a subset of analytics, which more broadly includes other disciplines such as experimental design, time series forecasting, operations research and text analytics.
Forecasting or predictive modeling?
I run into decision makers all the time who have a hard time understanding the difference between forecasting and predictive modeling. Here's a quick analogy I use to illustrate the difference:
- Forecasts tell you how many ice cream cones will be sold in July, so you can set expectations for planned costs, profits, supply chain impacts and other considerations.
- Predictive models tell you the characteristics of ideal ice cream customers, the flavors they will choose and coupon offers that will entice them.
If your goal is to do a better job of buying raw materials for the ice cream and to have them at the factory at the right time, your company needs a forecasting solution. If the marketing department is trying to figure out how and where to market the ice cream, it needs predictive modeling.
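The contrast can be sketched in a few lines of Python. All of the numbers below are made up, and both techniques are deliberately simple stand-ins: a straight-line trend for the forecast and per-segment response rates for the predictive model. The function names linear_trend_forecast and redemption_rate_by are hypothetical.

```python
# Forecasting: project an aggregate quantity forward in time.
# Made-up July cone sales for five consecutive past years.
july_sales = [1000, 1100, 1200, 1300, 1400]


def linear_trend_forecast(history):
    """Fit a least-squares line through the history and extend it one period."""
    n = len(history)
    xs = list(range(n))
    x_mean = sum(xs) / n
    y_mean = sum(history) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = y_mean - slope * x_mean
    return intercept + slope * n


next_july = linear_trend_forecast(july_sales)  # how many cones to plan for

# Predictive modeling: score individuals from their characteristics.
# Made-up records: (age_group, favorite_flavor, redeemed_coupon).
customers = [
    ("under_30", "chocolate swirl", 1),
    ("under_30", "chocolate swirl", 1),
    ("under_30", "butter pecan", 0),
    ("30_plus", "butter pecan", 1),
    ("30_plus", "butter pecan", 0),
    ("30_plus", "chocolate swirl", 0),
]


def redemption_rate_by(field_index, rows):
    """Coupon redemption rate for each value of one customer characteristic."""
    totals = {}
    for row in rows:
        key = row[field_index]
        hits, count = totals.get(key, (0, 0))
        totals[key] = (hits + row[2], count + 1)
    return {key: hits / count for key, (hits, count) in totals.items()}


rate_by_age = redemption_rate_by(0, customers)     # who to target
rate_by_flavor = redemption_rate_by(1, customers)  # what to offer them

print(next_july, rate_by_age, rate_by_flavor)
```

Notice what each half consumes: the forecast only sees totals over time, while the predictive model only sees attributes of individual customers. That difference in inputs is usually the quickest way to tell which one a business question calls for.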
Consider these real-world forecasting examples. The hospitality industry uses forecasting to determine demand for particular rooms or properties. Financial companies use it to generate accurate sales forecasts, which feed into the planning process. Retailers create forecasts to manage pricing, staffing and inventory.
Predictive modeling delivers a different set of answers. In retail, predictive modeling identifies the most profitable customers and the underlying reasons for their loyalty. In finance, credit scoring is a type of predictive modeling used to grow customer profitability and reduce risk exposure. In the life sciences, it helps companies find promising new molecular drug compounds.
Is optimization an overused term?
Now that you understand the differences between forecasting and predictive modeling, let's move on to optimization. It's not a stretch to say the word optimization is overused. A quick search on Amazon.com reveals more than 4,100 books that include optimization in the title. Everybody wants to optimize something – but what does it really mean to apply optimization techniques to a business problem?
On the simplest level, optimization is all about making better decisions that cross multiple dimensions of your business in the presence of constraints. Typical business constraints include demand limitations, government regulations and budget limitations.
Optimization considers those constraints when answering the question, "What is the best that can happen?" In manufacturing that could mean, "What mix of products will maximize our profits given that we can't expand our plant beyond a specific size?" In retail that might mean, "How can we reduce shipping costs yet satisfy demand at our stores? Should we close stores? (Which ones?) Or open stores?" If you can't expand your ice cream manufacturing facility right now, but you need to maximize what you can make, optimization can help you decide how much butter pecan or chocolate swirl to make in each plant.
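Here is a minimal sketch of that ice cream product-mix decision in Python. Every figure is hypothetical (profit per batch, plant hours per batch, the 40-hour capacity and the demand limits), and the search simply tries every whole-number combination. That brute-force approach only works because the example is tiny; problems of realistic size are normally handed to a dedicated optimization solver.

```python
# All figures below are hypothetical and chosen only for illustration.
PROFIT = {"butter_pecan": 30, "chocolate_swirl": 40}      # profit per batch
HOURS = {"butter_pecan": 1, "chocolate_swirl": 2}         # plant hours per batch
MAX_DEMAND = {"butter_pecan": 30, "chocolate_swirl": 15}  # demand limits, in batches
PLANT_HOURS = 40  # capacity constraint: the plant cannot be expanded

best_plan, best_profit = None, -1

# Try every whole-number production plan that respects the demand limits.
for bp in range(MAX_DEMAND["butter_pecan"] + 1):
    for cs in range(MAX_DEMAND["chocolate_swirl"] + 1):
        hours = bp * HOURS["butter_pecan"] + cs * HOURS["chocolate_swirl"]
        if hours > PLANT_HOURS:
            continue  # this plan violates the capacity constraint
        profit = bp * PROFIT["butter_pecan"] + cs * PROFIT["chocolate_swirl"]
        if profit > best_profit:
            best_plan, best_profit = (bp, cs), profit

print("batches (butter pecan, chocolate swirl):", best_plan)
print("profit:", best_profit)
```

With these numbers the best plan is 30 batches of butter pecan and 5 of chocolate swirl, for a profit of 1,100. Chocolate swirl earns more per batch, but butter pecan earns more per plant hour, and plant hours are the scarce resource. That kind of trade-off is exactly what the constraints force the optimization to weigh.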
So where do I start?
Organizations benefit from analytics regardless of which capability they use. The key is using the best techniques to answer the crucial questions, so start with the business problem and figure out what is needed to solve it.
As manager of the analytics product management team, Tonya Balan is responsible for providing strategic direction for all aspects of SAS' suite of analytical products. She holds a PhD in statistics and has served on the faculty of the North Carolina State University Department of Statistics.