Building a great analytic model is like building a great house. You need to be mindful of its future inhabitants and create a flawless construction plan to ensure it can be used for generations to come. Even if you’re lured by the promise of ocean views and a great tan, you should still heed the old saying and never build a house on sand.
Whether you’re building a house or an analytic model, it’s easy to get caught up in the grandeur of what you might achieve. But enthusiasm won’t replace a rock-solid foundation. You must plan for analytic implementation. It may be tempting to just start building by the seaside and hope for the best, but tread carefully. Your project may collapse later on.
Here are some considerations to get you started:
- Who needs to access the results? Will the results be provided to a case manager or investigator?
- What do they need to see? Do we need to push out a score or a list to them? Do they need to know why? Do they need to see (or are they allowed to see) all of the data, or only the cases that are relevant?
- Do we need to be able to explain and defend what the model is doing, or is a score enough?
- Which systems do we need to integrate this solution with? Are users working in a reporting, case management, or inventory planning system? Can we extend that system, or do we need to add case management?
- Do we need to view results in real time, or is batch processing sufficient?
- Will a single model suffice, or are we likely to need a hybrid approach?
- Are we going to score multiple things at once (like a recommendation engine), or are we predicting one outcome at a time?
- What data will be available at the time you want to score?
- What model performance is acceptable?
- How will we measure success?
The bottom line is that all advanced analytics go out of date and must be updated, so planning for this from the start is essential. In some cases, especially in online retail, models need to be refreshed daily; others may not even require an annual update. Unfortunately, many organizations fail to update their models because it is too difficult. Not the model-building part, but the redeployment part.
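How do you know a model has gone stale? One common technique (my illustration here, not one prescribed by any particular vendor) is the population stability index (PSI), which compares the score distribution a model was built on with the distribution it sees in production. A minimal sketch in Python:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare the baseline score distribution (expected) with the
    distribution observed in production (actual)."""
    # Bin both samples on the same cut points, derived from the baseline.
    cuts = np.percentile(expected, np.linspace(0, 100, bins + 1))
    cuts[0], cuts[-1] = -np.inf, np.inf  # catch out-of-range production scores
    exp_pct = np.histogram(expected, bins=cuts)[0] / len(expected)
    act_pct = np.histogram(actual, bins=cuts)[0] / len(actual)
    # A small floor avoids log(0) and division by zero in empty bins.
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))
```

A common rule of thumb: PSI below 0.1 suggests the population is stable, 0.1 to 0.25 warrants investigation, and above 0.25 signals it is probably time to retrain. The thresholds and decile binning here are conventions, not requirements; tune them to your own data.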
For example, a top US credit company used to manually recode all of their models to run in their operational environment. Once a model was built, it took six months to implement. Fraud prevention models are often used for months or years before they are updated, far past their viability. As soon as fraudsters discover what a model catches, they find a different route to continue committing fraud, leaving the stale model in the dust. Had the company considered this part of the equation from the beginning, they could have saved themselves considerable effort.
It may seem daunting, but resist the urge to start working on your project before fully considering these questions. If you’re still unconvinced, listen to this cautionary tale: Last week, I spoke with a gentleman who had developed a very complex model in an open-source tool and was trying to implement it. The only way he could do this was to manually recode his work. Manually! Had he started the process with the end in mind, he would have saved his agency time and money.
This should almost go without saying, but hard-coded deployments should be avoided. To have long-term success with an analytic solution, an organization should select technologies that are preconfigured to work together, designed to monitor performance over time, and able to alert you when changes should be made.
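To make the "monitor and alert" idea concrete, here is a hedged sketch of what such monitoring might look like. The 10% tolerance and the weekly AUC figures are illustrative placeholders I chose for the example, not values from any particular product; set them according to your own business requirements.

```python
from dataclasses import dataclass, field

@dataclass
class ModelMonitor:
    """Track a performance metric (e.g. weekly AUC) and flag degradation.

    baseline:  the metric measured at deployment time
    tolerance: relative drop that triggers a retrain alert
               (0.10 is an illustrative default, not a standard)
    """
    baseline: float
    tolerance: float = 0.10
    history: list = field(default_factory=list)

    def record(self, metric: float) -> bool:
        """Log the latest measurement; return True if an alert should fire."""
        self.history.append(metric)
        return metric < self.baseline * (1 - self.tolerance)

# Usage: feed in each new measurement as it arrives.
monitor = ModelMonitor(baseline=0.85)
for weekly_auc in [0.84, 0.83, 0.74]:
    if monitor.record(weekly_auc):
        print(f"ALERT: AUC {weekly_auc:.2f} fell below threshold; plan a refresh")
```

The point is not the twenty lines of Python but the design choice: the alerting logic lives alongside the deployed model from day one, rather than being bolted on after the model has already decayed.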
Next week I’ll focus on measuring your analytics project, so be sure to follow the Myths and Realities of Successful Analytics series for more. In the interim, if you are unfamiliar with the Top 10 Data Mining Mistakes according to John Elder of Elder Research, they are definitely worth a read.