Hadoop, Big Data and SAS®
By now, everyone knows big data is here to stay. And organizations are beginning their journey toward big data analytics. Why? Because you need to realize value from the massive and diverse data volumes currently held in platforms like Hadoop. Combining the benefits of Hadoop with the business analytics power of SAS helps your data scientists transform big data into big knowledge.
But big data efforts involve more than just accessing information. SAS products and services provide seamless and transparent access to the Pig and Hive languages and the MapReduce framework, and the SAS environment adds a visual and interactive Hadoop experience that makes it easier to gain insights and discover trends. Powerful analytical algorithms help extract valuable information from the data, while in-memory technology gives you a faster way to process all of this data. And with integrated and automated deployment of analytical models, you can now score data directly in Hadoop for faster results.
From Data to Decisions Using SAS® and Hadoop
How SAS® Can Help
SAS support for Hadoop spans the entire data-to-decision process and centers on a singular goal – helping you know more, faster, so you can make better decisions. SAS provides:
- Data management for analytics.
- Exploration and visualization for easy insights.
- Deep analysis and analytics model development.
- Deployment and execution of the analytics model for decision making.
Manage Data
- Easily access and use data stored in Hadoop. SAS/ACCESS® software provides fast, efficient access to data stored in Hadoop via HiveQL. You can access Hive tables as if they were native SAS data sets. Then apply text mining and predictive analytics to the data to gain and share new insights.
- Maximize Hadoop's distributed processing capabilities. Your SAS programmers can submit MapReduce, scripting and HDFS commands from within Base SAS. SAS also supports external file references, allowing you to conveniently find and use Hadoop files from any SAS product.
- Better manage data stored in Hadoop. One issue plaguing Hadoop implementations is the lack of – or immaturity of – tools for managing deployments. SAS Data Management provides an intuitive GUI so you can easily build job flows that use Pig, MapReduce and HDFS commands and Hive queries. SAS Data Management streamlines Pig and MapReduce code generation through visual editing tools and a built-in syntax checker. You also get the added advantages of metadata management, data lineage and security features.
Explore and Visualize
- Quickly visualize your data stored in Hadoop, discover new patterns and publish reports. SAS Visual Analytics is an in-memory solution for exploring data quickly. It helps you identify opportunities for further analysis and share results via Web reports or mobile devices.
[Figure: SAS Products Working with Hadoop. Stages shown: Explore and Visualize, Analyze and Model, Deploy and Execute.]
Analyze and Model
- Apply domain-specific high-performance analytics to data stored in Hadoop. SAS High-Performance Analytics provides in-memory capabilities that let you develop analytical models using all data, not just a subset. Produce more accurate and timely insights. Run frequent modeling iterations. Use sophisticated analytics to quickly get answers to questions you never thought of – or had time to ask.
- Uncover patterns and trends in Hadoop data with an interactive and visual environment for analytics. SAS Visual Statistics (coming March 2014) enables multiple users to concurrently solve complex problems and identify new opportunities. Using multiple statistical algorithms and machine learning techniques, you can surface unexpected insights quickly and base your decisions on facts drawn from all of your data.
Deploy and Execute
- Automatically deploy and score analytic models in the parallel environment. SAS Scoring Accelerator for Hadoop automates model deployment inside the cluster and allows you to score new data in Hadoop without moving the data. This speeds ad hoc modeling and scoring of new data for faster results.
Go from Data to Decisions in a Single, Interactive Environment
- Use a single interactive programming environment for your entire data-to-decision process. A new interactive programming environment (coming late 2013) lets multiple users concurrently manage data, transform variables, perform exploratory analysis, build and compare models, and score data, with virtually no limits on the size of the data stored in Hadoop. In-memory data persistence eliminates multiple data loading steps and multiple passes through the data. The result? Faster interactive ad hoc analysis and data management, plus massive productivity gains.
Why You Need SAS® and Hadoop
- Comprehensive support for Hadoop. SAS/ACCESS not only retrieves big data stored in HDFS, but also allows you to incorporate and use other capabilities, such as the Pig and Hive languages and the MapReduce framework.
- Flexible architecture. Because SAS is focused on analytics, not storage, we offer a flexible approach to choosing hardware and database vendors. We work with our customers to deploy the right mix of technologies, including the ability to deploy Hadoop with other data warehouse technologies.
- Complete data-to-decision support. SAS supports the entire analytics life cycle, from data preparation and exploration to model development, production deployment and monitoring.
- Transparent, collaborative, interactive and iterative. SAS enables you to analyze large, diverse and complex data sets in Hadoop within a single environment – instead of using a mix of languages and products from different vendors.
Ready to learn more?
Call us at 1-800-727-0025 (US and Canada) or request more information.