SAS® In-Memory Statistics for Hadoop
A single interactive programming environment for analytics on Hadoop
Getting timely insights out of Hadoop requires a different approach. You need in-memory analytics and interactive analytical data preparation, exploration, modeling and deployment, so you get precise answers fast.
Delve deep into Hadoop for fast, accurate insights.
Apply proven state-of-the-art statistical algorithms and machine-learning techniques to find the best answers. You can explore and use multiple analytic approaches to reveal insights and make high-impact decisions.
Increase productivity for your data scientists.
Multiple users can concurrently and interactively analyze big data in Hadoop using a fast, in-memory analytical programming language. Access, prepare, manipulate, transform, explore, model and score data, all within Hadoop.
Take advantage of a scalable environment.
Until now, statisticians and data scientists had to piece together different programming languages or products to access, prepare, model and score data in Hadoop. And when it came time to operationalize models, the software couldn't scale. No more. From data preparation and exploration to model building and deployment, our solution is proven, tested and accurate – and can scale to your production environment.
Avoid unnecessary, multiple passes through the data.
Our in-memory infrastructure runs on top of Hadoop, eliminates costly data movement and keeps data in memory for the entire analytic session. This significantly reduces data latency and speeds up analysis.
Cloudera Chief Technologist Eli Collins talks about the SAS solution for data scientists and the benefits the SAS partnership brings to Hadoop ecosystems.
Demos & Screenshots
- Interactive programming. Let multiple users concurrently analyze large amounts of data stored in Hadoop by interactively submitting SAS code through SAS Studio, the browser-based development environment.
- In-memory analytical processing. Get fast analytic computations that are optimized for multiple passes across a distributed cluster.
- In-memory data persistence. Gain speed and reduce latency because data is held in memory for the entire analytic session.
- Analytical data preparation. Access and manipulate data, transform and create variables, and perform exploratory analysis.
- Model development. Quickly create, evaluate and compare multiple statistical models.
- Statistical algorithms and machine-learning techniques. Uncover patterns and trends faster than ever before with a huge breadth and depth of analytical techniques.
- Text analytics. Analyze your unstructured (and structured) data using a wide range of text analysis techniques.
- Recommendation system. Generate personalized, meaningful recommendations in real time with a high level of customization.