What do companies like Netflix, Airbnb, Spotify and Lyft know that you don’t know? What do these companies have that you might be lacking?
Sure, they had an initial idea. But we’ve all had good ideas. How did each of these new market heroes move from the seed of an idea to the disruption of an entire industry?
For each of them, it took a combination of strong technological and analytical skills to develop a digital platform that turned data into a fundamental asset for the company.
Now you might be thinking: But we do analytics here too. We have a smart team of data scientists. We know the latest techniques for machine learning and deep learning, and we even attend the best analytics conferences.
But there’s a difference between analytical maturity and technological maturity. Disrupters are both analytically and technologically mature.
They don’t just understand how to build models and use algorithms. They also know how to design systems that incorporate algorithmic thinking into their core. They know how to replicate and tweak models by the thousands, how to deploy models at scale and how to test, iterate, improve and retire models systematically.
Many traditional companies are analytically mature but not technologically mature.
If you’ve done a great job of capturing and analyzing data but come up short on moving those results into action on a large scale, you’re in luck. We’re seeing three clear shifts in the industry that are making it easier for even the most traditional companies to work like disrupters and change the way they deploy analytics. They are:
- Intelligent data preparation.
- Containers for analytics.
- ModelOps for machine learning.
The convergence of these three technologies has the potential to change the way you work with data and analytics – so you, too, can think like a disrupter.
“There’s a difference between analytical maturity and technological maturity. Disrupters are both analytically and technologically mature.” Shadi Shahin, Vice President of Product Strategy, SAS
Intelligent data preparation
Data has always been a problem. Data scientists still spend a lot of time manipulating data. And now that you have more – and more complex – data, how will you deal with it? What if we recast data management as a problem with an AI solution?
Intelligent data preparation uses AI algorithms to recognize patterns in data, understand what data belongs together and provide context around data.
Right now, preparing a training data set is usually a manual process that requires a lot of labeling. You take a piece of data, such as an image, and add metadata. You start by saying, this is a cat, this is a dog, this is a barn. You label all that data and then feed it to a model as a training set. In a three-dimensional world of video, you might label plywood, a corner of plywood, the length of the plywood.
Instead of requiring humans to tag and label the data, intelligent data preparation offers auto-tagging and auto-labeling. It uses reinforcement learning to create reason-based algorithms. It can even teach models to learn from other models and then retrain themselves. The system learns continuously from the decisions it makes.
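To make auto-labeling concrete, here is a minimal sketch in Python. It is not the reinforcement learning approach described above. It swaps in a much simpler technique, confidence-based labeling with a nearest-centroid rule, to show the general idea: a small hand-labeled seed set labels new data automatically, and anything ambiguous is routed to a human. The function names, the `min_margin` threshold and the sample points are all hypothetical, and the sketch assumes the seed set contains at least two labels.

```python
from math import dist  # Python 3.8+


def centroids(labeled):
    """Average the feature vectors of each label in a hand-labeled seed set."""
    sums, counts = {}, {}
    for features, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def auto_label(labeled, unlabeled, min_margin=1.0):
    """Label points whose nearest centroid is clearly closer than the runner-up.

    Returns (newly_labeled, needs_review). min_margin is a hypothetical tuning
    knob: the required gap between the two smallest centroid distances.
    """
    cents = centroids(labeled)
    newly_labeled, needs_review = [], []
    for features in unlabeled:
        ranked = sorted((dist(features, c), label) for label, c in cents.items())
        (best_d, best_label), (second_d, _) = ranked[0], ranked[1]
        if second_d - best_d >= min_margin:
            newly_labeled.append((features, best_label))
        else:
            needs_review.append(features)  # too ambiguous: leave for a human
    return newly_labeled, needs_review


# Hypothetical two-feature data: four hand-labeled points, three new ones.
seed = [((0.0, 0.0), "cat"), ((1.0, 0.0), "cat"),
        ((9.0, 9.0), "dog"), ((10.0, 9.0), "dog")]
new_points = [(0.5, 0.5), (9.5, 8.5), (5.0, 4.5)]
auto_labeled, review = auto_label(seed, new_points)
```

In this example, the first two new points are labeled automatically and the third, which sits halfway between the two groups, is held back for review. A production system would use a trained model and a calibrated confidence score rather than raw distances.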
Intelligent data preparation isn’t limited to image data. You can also use it across the board on streaming data, static data, master customer data – and more. It can join columns, clean fields and even combine warehoused data with streaming data – all automatically.
Disrupters have these capabilities built into their systems already. They are working with traffic data, hospitality data, streaming data – and more – and processing it intelligently for thousands of business decisions every day.
Containers for analytics
In the world of IT, containers are the latest essential for deploying software in cloud environments. They fit nicely into the way IT likes to test and run software because they deploy software faster, manage upgrades simply and make it easy to combine different software packages. You can try things out without forcing a hard upgrade.
For software providers, containers also make it easier to build packaged deployments, integrate with other packages and add more automation into a system so that it self-optimizes for customer workloads or other requirements.
Ultimately, containers will help to democratize the use of advanced analytics and lower the barrier to entry for trying new software products. They make it very easy for you to get software at the pace you want to consume it. And you can experiment with new algorithms or new techniques without a lot of upfront risk or expenses.
Imagine, for example, that you’ve deployed a machine learning algorithm for a next-best offer. When a new algorithm becomes available for that same task, you can automatically download it and funnel 5% of your traffic through it. Then you let the machine tell you if it’s performing better than the previous algorithm. If it is, the machine starts funneling 10% of visitors through the new algorithm, then 25% and so on until you’ve replaced the old algorithm altogether. If, on the other hand, the results do not improve, the machine retires the new algorithm and routes all traffic back to the previous one.
The disrupters are constantly testing new algorithms and running new programs in this manner. It’s one way they keep improving with small increments at scale. With containers, you’ll be able to do the same.
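The staged rollout in the next-best-offer example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular product: the 5%, 10% and 25% steps come from the example, the 50% and 100% steps are assumed, and the function names and conversion rates are hypothetical.

```python
# Percent of traffic sent to the new algorithm at each stage.
# 5, 10 and 25 come from the example; 50 and 100 are assumed.
ROLLOUT_STEPS = [5, 10, 25, 50, 100]


def next_traffic_share(current_share, new_rate, old_rate, steps=ROLLOUT_STEPS):
    """Return the traffic share (percent) the new algorithm should get next.

    Advances one step when the new algorithm outperforms the old one,
    and rolls back to 0 otherwise.
    """
    if new_rate <= old_rate:
        return 0  # roll back: all traffic returns to the previous algorithm
    later = [s for s in steps if s > current_share]
    return later[0] if later else current_share


def run_rollout(observations, steps=ROLLOUT_STEPS):
    """Replay (new_rate, old_rate) observations and return the share history."""
    share = steps[0]
    history = [share]
    for new_rate, old_rate in observations:
        share = next_traffic_share(share, new_rate, old_rate, steps)
        history.append(share)
        if share in (0, steps[-1]):
            break  # rollout finished, either rolled back or fully replaced
    return history


# Hypothetical conversion rates for the new and old algorithms.
winning = run_rollout([(0.12, 0.10)] * 4)            # new algorithm keeps winning
losing = run_rollout([(0.12, 0.10), (0.09, 0.10)])   # new algorithm slips behind
```

Here `winning` walks through every step up to 100%, while `losing` advances once and then drops to 0. A real system would compare the two algorithms with a statistical test over enough traffic instead of a single raw rate comparison.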
ModelOps for machine learning
How do you cycle machine learning models from the data science team to the IT production team? Do you do regular deployments and updates? Do you watch for models to degrade and take action? Do you put all of your best models into production?
ModelOps is a process you can use to move models from the lab to validation, testing and production as quickly as possible while ensuring quality results. It helps you manage and scale models to meet demand and continuously monitor them to spot and fix early signs of degradation.
Following the steps of the ModelOps process can help you put more models into production and see continued results:
- Data: Explore and access data from a trusted and secure source.
- Develop new models: Create models with deployment and monitoring in mind.
- Register models: Preserve data lineage and track-back information.
- Deploy models: Improve deployment speeds with close collaboration between data scientists and IT.
- Monitor models: Consistently track performance, and then retrain or replace models as needed.
This process is designed to be sensitive to data fluctuations, model bias and model degradation. It improves time to deployment for new models and ensures regular updates.
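The monitoring step can be illustrated with a small Python sketch. It assumes that true outcomes eventually become available for each prediction and that accuracy is the metric you care about. The class name, window size, tolerance and offer labels are all hypothetical.

```python
from collections import deque


class ModelMonitor:
    """Track recent prediction outcomes and flag a model for retraining."""

    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        self.baseline = baseline_accuracy     # accuracy measured at deployment
        self.tolerance = tolerance            # allowed drop before raising a flag
        self.outcomes = deque(maxlen=window)  # keeps only the most recent results

    def record(self, prediction, actual):
        """Store whether a single prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once a full window falls below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return self.accuracy() < self.baseline - self.tolerance


monitor = ModelMonitor(baseline_accuracy=0.90, window=10, tolerance=0.05)
for _ in range(10):
    monitor.record("offer_a", "offer_a")  # ten correct predictions
healthy = monitor.needs_retraining()

for _ in range(4):
    monitor.record("offer_a", "offer_b")  # four misses push out older results
degraded = monitor.needs_retraining()
```

After the four misses, the ten-prediction window holds six hits and four misses, so the model is flagged. The same pattern extends to other signals, such as input data drift or bias metrics, by recording a different measurement in the window.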
A solution for ModelOps can help you compete with the disrupters who have perfected this process while deploying and monitoring models by the thousands.
Work like a disrupter
The three technology developments discussed above are changing the way we manage data and roll out analytics projects. If you’re already analytically mature, you can use these new technologies to become technologically mature as well. Use these tips as first steps on your journey:
- Pick an area where you have ongoing data management issues and try to tackle them with intelligent data preparation.
- Look at where you have complex software and code footprints and see if containers can help simplify your infrastructure for analytics.
- Examine your existing data science work and see where you can benefit from a governed process for managing, deploying and updating models at scale.
If, on the other hand, you are technologically mature with a strong IT infrastructure, but not analytically mature, you can benefit from these technologies too:
- Consider what outside data sources you could bring in if you used intelligent data preparation, or what advanced algorithms you might be able to develop if your data were more intelligent.
- Look into using containers for analytics, not just for your operational systems. The same best practices you’ve learned for automation and portability in those systems will help your analytics efforts too.
- Determine where you could make the biggest impact with machine learning models that are managed as a corporate asset and start to experiment with advanced analytics while keeping your software development strengths intact.
No matter where you are on the continuum from traditional to disrupter, you can benefit from exploring the technologies described here.
It can be a challenge to advance on the scale of technological maturity, but I’ve seen it happen. I’ve watched markets shift and opportunities open up as traditional companies took their analytical skills and paired them with new technological skills. It could be you next.
About the Author
As Vice President of Product Strategy, Shadi Shahin is responsible for leading the Product Management, Price and Offering Management, and Enterprise Excellence Center teams. His career as an IT professional began at IBM more than 25 years ago and continued at Wolters Kluwer and, most recently, Red Hat. Along the way, Shahin’s teams pushed the boundaries within data, analytics and next-generation technologies. This included creative integrations of SAS into open architecture ecosystems. His guiding principle is to infuse both a customer and practitioner perspective into a pragmatic and collaborative SAS roadmap.