“I have travelled the length and breadth of this country and talked with the best people, and I can assure you that data processing is a fad that won’t last out the year.” (Editor in charge of business books for Prentice Hall, 1957)
Whereas the Analytics Isle tends to be a popular destination for marketers on the big data journey, you really won’t find them flocking to the nearby Processing Isle. This highly active island has much to offer (like special territories for batch, real-time, and streaming data), but marketers typically care less about how data is processed than about what marketing data can be processed and how fast. We keep them happy with timely, reliable, and relevant data. How the data gets there, many don’t care or need to care.
Regardless, marketers who have been in the industry awhile have witnessed the remarkable speed at which data warehousing technologies have advanced over the years. Nowadays, not only do we have options on how to process our data—such as grid computing, in-database, in-memory, and appliances—we also have much greater control over the activity in our data warehouse and analytical ecosystems. With these advancements, we’ve been able to increasingly optimize the data warehouse around mixed workloads, and marketers are undeniably reaping the benefits.
A Big Data Best Practice for Processing Data
Even with the significant technological advancements in traditional systems, big data technologies have changed the playing field for processing data of all shapes and sizes. In fact, the need to process high-volume, high-velocity, and high-variety data (otherwise known as the 3Vs of big data) was a key driver in the development of these big data technologies.
As a result, “take advantage of the processing power of big data technologies” is the new battle cry for big data. With technology options like Hadoop, an open source project designed to address the storage and processing requirements for big (and small) data, we can easily process semi-structured and unstructured data that we can’t or don’t want to store in our traditional system. Or we can pre-process traditional (or big) data in Hadoop before storing it in a data warehouse. Or we can… You get the idea.
Key Takeaways for Marketers
- Data processing did not die in 1957.
- The terms “big data” and “Hadoop” are not synonymous. Hadoop is just one of many big data solutions.
- Hadoop can process all your data—unstructured, semi-structured, and/or structured.
- Feed the mermaid in the South Bay of the island. She’s nice.
- Many traditional and big data software vendors, including SAS, have integrated Hadoop into their big data solutions.