The untapped potential of time series data mining
Improve predictive models - and decision making - by combining time series analysis and data mining
Financial planning and budgeting, supply chain management, retail replenishment and planning – these are just a few of the critical business functions that benefit greatly from data mining, forecasting and time series analysis, three established disciplines of analytics.
These three disciplines are used in many industries for many different functions, and now leading organizations are recognizing the impact of combining them to create a more powerful brand of predictive analytics. Before we describe the wide array of business advantages that can be gained by integrating data mining, time series analysis and forecasting, let's look at some definitions.
Data mining is a collection of analytical techniques that enable automated search for patterns and associations within a large portfolio of characteristics to find relationships that can be used for improved decision making. For example, based on the characteristics of a customer – such as age, demographics, product portfolio, contact history and others – data mining can be used to identify a set of customers most likely to respond to a specific marketing offer.
Time series analysis and forecasting are used to detect temporal patterns from historical time-dependent data and project the detected patterns (such as trend or seasonality) into the future. For example, time series analysis plays a crucial role in forecasting electricity demand for the utilities industry. Electricity demand follows long-term trends, such as population growth and industrial activity, as well as shorter seasonal cycles for time of year (summer versus winter), day of the week (business days versus weekends), and time of day (peak demand to drive air-conditioning on hot summer afternoons, and low demand in the middle of the night). Good software detects and reconciles the various temporal patterns and provides both the long-range and near-term forecasts that utility companies require.
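To make the idea of detecting temporal patterns and projecting them forward concrete, here is a minimal sketch in plain Python. It is not how any particular forecasting product works; it assumes a simple additive model (trend plus a weekly cycle), and the demand figures, the function names and the choice of a centered moving average are all illustrative assumptions.

```python
from statistics import mean

def centered_moving_average(series, window):
    """Trend estimate: centered moving average (window must be odd).
    Positions too close to either edge are left as None."""
    half = window // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        trend[i] = mean(series[i - half:i + half + 1])
    return trend

def seasonal_indices(series, trend, period):
    """Average the detrended values at each position in the cycle."""
    buckets = [[] for _ in range(period)]
    for i, (value, level) in enumerate(zip(series, trend)):
        if level is not None:
            buckets[i % period].append(value - level)
    return [mean(bucket) for bucket in buckets]

# Hypothetical daily demand: slow growth plus a weekly cycle
# (higher on the five business days, lower on the two weekend days).
weekly_pattern = [5, 5, 5, 5, 5, -12.5, -12.5]
demand = [100 + 0.5 * day + weekly_pattern[day % 7] for day in range(28)]

trend = centered_moving_average(demand, 7)
season = seasonal_indices(demand, trend, 7)

# Extend the trend linearly from its last two known points,
# then add back the seasonal effect for each future day.
known = [(i, level) for i, level in enumerate(trend) if level is not None]
(prev_i, prev_level), (last_i, last_level) = known[-2], known[-1]
slope = (last_level - prev_level) / (last_i - prev_i)

def forecast(day):
    return last_level + slope * (day - last_i) + season[day % 7]

next_week = [forecast(day) for day in range(28, 35)]
print(next_week)
```

Real electricity demand has several overlapping cycles (time of day, day of week, time of year), so a production forecast would need to model and reconcile more than the single weekly cycle assumed here.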
Time series data mining combines data mining with time series analysis to:
But because the integration of this temporal effect has traditionally been managed manually, it further complicates the already tedious data preparation. Time series data mining (TSDM) tools in SAS® Enterprise Miner™ 7.1 (included in SAS® 9.3 and available now) automate the data preparation phase to include temporal relationships in predictive modeling. This helps speed up data preparation and improve the accuracy of predictive models.
Inventory management. Often, time series information is collected on a very granular level in organizations. For example, retailers measure sales of items in a store on the SKU level and in daily time intervals. For stores with thousands of items, this results in a large number of time series, each with many records, because historical data is sometimes collected over many years. This volume of data often makes it difficult to extract information relevant for decision making. The new time series data mining tools in SAS Enterprise Miner will help analysts quickly reduce the dimensionality of the problem under investigation and extract signals from the noise. For example, SKUs with similar sales trends can be combined into segments without losing critical information. Time series analysis techniques, such as smoothing, can help compress detailed information into a picture that makes general patterns easier to spot.
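The two ideas in this paragraph, smoothing and grouping SKUs with similar sales trends, can be sketched in a few lines of plain Python. The example below is an assumption-laden illustration rather than the method used by any SAS tool: it uses simple exponential smoothing, Pearson correlation as the similarity measure, a greedy grouping rule and a 0.9 threshold, and the SKU names and sales figures are invented.

```python
from statistics import mean, pstdev

def exp_smooth(series, alpha=0.3):
    """Simple exponential smoothing: damps noise so the trend stands out."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

def correlation(a, b):
    """Pearson correlation of two equal-length, non-constant series."""
    mean_a, mean_b = mean(a), mean(b)
    covariance = mean((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return covariance / (pstdev(a) * pstdev(b))

def segment_by_trend(series_by_sku, threshold=0.9):
    """Greedy grouping: a SKU joins the first segment whose first member
    has a smoothed series correlated with it at or above the threshold."""
    smoothed = {sku: exp_smooth(s) for sku, s in series_by_sku.items()}
    segments = []
    for sku, series in smoothed.items():
        for segment in segments:
            if correlation(series, smoothed[segment[0]]) >= threshold:
                segment.append(sku)
                break
        else:
            segments.append([sku])
    return segments

# Hypothetical weekly unit sales for four SKUs.
sku_sales = {
    "SKU-A": [10, 12, 11, 14, 15, 17, 18, 20],  # rising
    "SKU-B": [50, 55, 53, 60, 66, 70, 75, 82],  # rising, larger volume
    "SKU-C": [40, 38, 39, 35, 33, 30, 28, 25],  # falling
    "SKU-D": [9, 9, 8, 8, 7, 6, 6, 5],          # falling, smaller volume
}

segments = segment_by_trend(sku_sales)
print(segments)
```

Because correlation ignores scale, a high-volume and a low-volume SKU land in the same segment when their sales move in the same direction, which is what reduces thousands of series to a manageable number of patterns.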
Fraud detection. The new similarity analysis provided with the TSDM tools in SAS Enterprise Miner works at the most detailed level in order to spot exceptions to average behavior. Credit card providers can use time series data mining to automate the detection of fraudulent behavior in financial transactions. They do this by comparing many detailed transactional time series against a known pattern of abusive behavior. It is often only in looking across the temporal representation of behavior that undesired behavior becomes apparent. The similarity analysis tool can quickly detect behavior over time that matches known signatures of fraud and create flags for further investigation if similar patterns are detected.
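One common way to compare a transaction series against a known signature is dynamic time warping (DTW), which tolerates differences in timing and length. The sketch below uses DTW as a stand-in similarity measure; the article does not say which measure the SAS tool applies, and the fraud signature, account histories and flagging threshold are hypothetical.

```python
from statistics import mean, pstdev

def z_normalize(series):
    """Rescale to zero mean and unit variance so shape, not size, is compared."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series] if s > 0 else [0.0] * len(series)

def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Hypothetical signature: a few small test charges, then rapid large purchases.
known_fraud_pattern = [1, 1, 2, 1, 50, 80, 90]

# Hypothetical daily spend per account.
accounts = {
    "acct-1": [20, 25, 22, 24, 21, 23, 26, 22],   # steady everyday spending
    "acct-2": [2, 2, 3, 2, 2, 95, 160, 170],      # same shape, different scale
}

THRESHOLD = 1.5  # assumed cut-off; in practice it would be tuned on labeled cases

pattern = z_normalize(known_fraud_pattern)
distances = {
    acct: dtw_distance(z_normalize(series), pattern)
    for acct, series in accounts.items()
}
flagged = [acct for acct, dist in distances.items() if dist < THRESHOLD]
print(distances)
print(flagged)
```

Only the account whose spending follows the shape of the signature is flagged for further investigation, even though its amounts and the length of its history differ from the reference pattern.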
New product forecasting. A never-ending challenge for consumer goods manufacturers and retailers, new product forecasting situations include: entirely new types of products; new markets for existing products (such as expanding a regional brand nationally or globally); and refinements of existing products (such as "new and improved" versions or packaging changes). All of these require the organization to come up with a forecast of future sales without historical data for the new product. However, by using techniques like similarity analysis, an analyst can examine the demand patterns of past new products with similar attributes to identify the range of demand curves that can be used to model demand for the new product.
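The forecast-by-analogy idea can be sketched as follows: select past launches that share the new product's attributes, then summarize their demand week by week to get a low, average and high curve. This is a simplified illustration in plain Python, assuming exact attribute matching; the launch records, attribute names and sales figures are invented.

```python
from statistics import mean

# Hypothetical launch histories: attributes plus weekly sales since launch.
past_launches = [
    {"category": "snack", "price_tier": "premium",
     "weekly_sales": [120, 200, 260, 240, 210, 190]},
    {"category": "snack", "price_tier": "premium",
     "weekly_sales": [80, 150, 210, 220, 200, 170]},
    {"category": "snack", "price_tier": "value",
     "weekly_sales": [300, 420, 400, 350, 300, 280]},
    {"category": "beverage", "price_tier": "premium",
     "weekly_sales": [60, 90, 100, 95, 90, 85]},
]

def analog_demand_range(launches, **attributes):
    """Return low/mean/high demand per week across launches that match
    every given attribute exactly."""
    analogs = [
        launch["weekly_sales"]
        for launch in launches
        if all(launch.get(key) == value for key, value in attributes.items())
    ]
    if not analogs:
        raise ValueError("no past launch matches the requested attributes")
    return [
        {"week": week + 1, "low": min(values),
         "mean": mean(values), "high": max(values)}
        for week, values in enumerate(zip(*analogs))
    ]

# Demand range for a new premium snack, based on its two closest analogs.
profile = analog_demand_range(past_launches, category="snack", price_tier="premium")
for row in profile:
    print(row)
```

In practice the analogs would be chosen by a similarity measure over many attributes rather than by exact matching, and the resulting range gives planners a defensible starting point until real sales data arrives.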
There are more business applications than we have space to cover here. As you can see, by integrating data mining with time series analysis and forecasting, organizations can take the next step in extracting more value from their data to improve decision making.
This story appears in the Fourth Quarter 2011 issue of