SAS | The Power to Know

White Paper

Data Lakes: Purposes, Practices, Patterns, and Platforms

TDWI Best Practices Report

About this paper

When designed well, a data lake is an effective data-driven design pattern for capturing a wide range of data types, both old and new, at large scale. By definition, a data lake is optimized for the quick ingestion of raw, detailed source data plus on-the-fly processing of such data for exploration, analytics and operations. Even so, a data lake can also support traditional, latent data practices.

Organizations are adopting the data lake design pattern (whether on Hadoop or a relational database) because lakes provision the kind of raw data that users need for data exploration and discovery-oriented forms of advanced analytics. A data lake can also be a consolidation point for both new and traditional data, thereby enabling analytics correlations across all data.

To help users prepare, this TDWI Best Practices Report defines data lake types, then discusses their emerging best practices, enabling technologies and real-world applications. The report’s survey quantifies user trends and readiness for data lakes, and the report’s user stories document real-world activities.

About SAS

SAS is a global leader in AI and analytics software, including industry-specific solutions. SAS helps organizations transform data into trusted decisions faster by providing knowledge in the moments that matter. SAS gives you THE POWER TO KNOW®.
