An online jeweler sent out an email to a list of friends and customers about her latest line of earrings. She sold 25 pairs within a day. Her web site then soared to the top of the Google results list when visitors searched on relevant terms. Why? Due to the increased traffic to the jeweler’s site, Google’s algorithm assigned greater relevancy to the site for related searches and presented it as a top resource to check out when searching on “silver earrings,” for example.
This real-time processing of information is a critical component of high-performance analytics for making sense of big data. Complex event processing filters through vast amounts of data in real time to identify relevant data and pair datasets together for analysis. One requirement cannot be overlooked, however: analytical models must be refreshed constantly to keep up with the changing data. So, analytics refreshes the model upon each search and query to align with current information and produce highly accurate (and fast!) results. Without continual maintenance and improvement, models become stale and irrelevant.
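To make the idea of a model that is refreshed on every event concrete, here is a minimal sketch in Python. It is not how Google or any particular analytics product works; the class name `DecayingRelevance`, the `decay` parameter and the site names are all hypothetical. Each new visit ages the existing scores and credits the visited site, so the ranking always reflects recent traffic.

```python
from collections import defaultdict


class DecayingRelevance:
    """Toy relevance model refreshed on every event (illustrative only)."""

    def __init__(self, decay=0.9):
        # decay < 1 makes older visits count less than recent ones (assumed value)
        self.decay = decay
        self.scores = defaultdict(float)

    def observe(self, site):
        # Age every existing score, then credit the site that was just visited.
        for known_site in self.scores:
            self.scores[known_site] *= self.decay
        self.scores[site] += 1.0

    def ranking(self):
        # Sites ordered from most to least relevant under the current scores.
        return sorted(self.scores, key=self.scores.get, reverse=True)


model = DecayingRelevance(decay=0.5)
for visit in ["big-retailer", "big-retailer", "jeweler", "jeweler", "jeweler"]:
    model.observe(visit)

top_site = model.ranking()[0]
```

Because the scores are updated inside `observe`, there is no separate retraining step to forget: a burst of recent visits to the hypothetical `jeweler` site is enough to move it ahead of a site that was popular earlier.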
Consider an example from a Text Analytics World keynote on text analytics and mining big data by Usama Fayyad, Chairman & CTO of ChoozOn and former Chief Data Officer at Yahoo!. One of the largest online destinations in the world, Yahoo! dealt with more than 25 terabytes of data each day. Fayyad’s challenge was to make sense of this data to surface relevant advertisements to each visitor based on their online behavior. The data comprised events such as which sites they visited, what kinds of news they scanned and which keywords they used in emails, all of which reflect their personalities. All told, the “DNA string” of behaviors captured for each Yahoo! user contained 2,500 categories. So the first step Fayyad took was to reduce those categories to a manageable 300. (With the high-performance analytics tools available now, that is not even necessary.)
He then analyzed predictive patterns, built models, scored each user and came up with a list of targeted users who would be interested in certain types of ads. In this case, targeted visitors were determined to be in the midst of shopping for cars. So, automobile ads ran on every page those users accessed.
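The score-and-target step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used at Yahoo!: the feature names, the weights in `WEIGHTS`, the `BIAS` value and the 0.5 threshold are all made up, and a simple logistic score stands in for whatever models were actually built.

```python
import math

# Hypothetical model weights; in practice these would be learned from data.
WEIGHTS = {
    "visited_auto_sites": 1.5,
    "read_car_reviews": 1.0,
    "searched_car_terms": 2.0,
}
BIAS = -3.0


def score(user_features):
    """Return an estimated probability that the user is shopping for a car."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value
                   for name, value in user_features.items())
    return 1.0 / (1.0 + math.exp(-z))


def target_users(users, threshold=0.5):
    """Return the ids of users whose score meets the threshold."""
    return [uid for uid, features in users.items() if score(features) >= threshold]


# Two made-up users: u1 shows all three behaviors, u2 only one.
users = {
    "u1": {"visited_auto_sites": 1, "read_car_reviews": 1, "searched_car_terms": 1},
    "u2": {"visited_auto_sites": 0, "read_car_reviews": 1, "searched_car_terms": 0},
}

targeted = target_users(users)
```

Here only `u1` clears the threshold, so only that user would be shown automobile ads. Keeping the weights in one place also makes the refresh step easy to picture: retraining replaces `WEIGHTS` and `BIAS`, and every user is rescored.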
Fayyad emphasized the need to maintain models, saying that “it becomes a huge factory” of refreshing them so they remain current and effective. As a result of this targeted approach, Yahoo! revenue increased from 20 million to 500 million. Now that’s refreshing.