In the previous post, we examined the “new normal” for the CIO and the need to demonstrate more immediate value from data. However, introducing analytical appliances or implementing Hadoop while ignoring the information infrastructure that supports the collection and management of data may lead to questionable results, because the trustworthiness and usability of the information can easily be challenged.
That means that increased attention to fundamental information management capabilities must accompany any adoption of new technology. Rather than implementing data management components on a project-by-project basis, the time has come to view information management as an organizational business imperative.
Business user expectations for data accessibility, availability, and quality are approaching the sustained need for standard services, like telephony and network access. This “dial-tone” approach to information management services establishes a baseline, enterprise-wide capability for data utility, and includes components for:
- Data integration – What used to be called extraction, transformation, and loading (ETL) has evolved beyond the original scope of data warehouse population to include the end-to-end mechanisms for data sharing, access, and delivery.
- Data federation and virtualization – The desire for real-time integrated analytics has ramped up the demand for high-speed data access to heterogeneous sources. Data federation enables semantically correct mappings across data assets and makes heterogeneous data access transparent to the end users. Virtualization smooths the delivery and presentation of federated data and provides caching to make access times predictable.
- Event stream processing – With the desire to absorb data from numerous sources, the business may want to apply filters or trigger actions based on streaming data. Event stream processing provides the infrastructure to support these types of actions.
- Managed metadata – Merging a variety of data sources without a common agreement to definitions and meanings will always lead to confusion. Establishing a metadata management practice using the right components will help alleviate some of these concerns.
- Data quality management – Any business environment will be compromised without establishing a level of trust in the usability and quality of the data. Parsing, standardization, and cleansing all contribute to a predictable level of data quality.
- Data governance – Governance technologies enable inspection, monitoring, and reporting of compliance with data quality rules and policies. In addition, tools that alert data stewards to data issues and track the progress of remediation help operationalize the deployment of corporate data policies.
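To make the data federation idea above concrete, here is a minimal sketch of a mapping layer that presents two differently-shaped sources as one unified view. The source schemas, field names, and mapping table are illustrative assumptions, not any particular product's API.

```python
# Two hypothetical sources expose the same customer under different field names.
CRM_ROWS = [{"cust_name": "Acme Corp", "cust_region": "EMEA"}]
BILLING_ROWS = [{"customerName": "Acme Corp", "territory": "EMEA"}]

# Semantic mappings from each source's schema to a shared canonical schema.
MAPPINGS = {
    "crm":     {"cust_name": "name", "cust_region": "region"},
    "billing": {"customerName": "name", "territory": "region"},
}

def federated_view(sources):
    """Return rows from all sources translated into the canonical schema."""
    rows = []
    for source_name, source_rows in sources.items():
        mapping = MAPPINGS[source_name]
        for row in source_rows:
            rows.append({mapping[field]: value for field, value in row.items()})
    return rows

view = federated_view({"crm": CRM_ROWS, "billing": BILLING_ROWS})
print(view)
```

The point of the sketch is that consumers query one canonical shape (`name`, `region`) and never see the source-specific field names; a virtualization layer would add caching on top of this translation.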
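The filter-and-trigger pattern described under event stream processing can be sketched in a few lines. The event shape, threshold, and alert action here are illustrative assumptions; a real engine would operate on unbounded streams rather than a list.

```python
def filter_events(events, threshold):
    """Filtering: pass through only the events whose value exceeds the threshold."""
    for event in events:
        if event["value"] > threshold:
            yield event

def run_triggers(events, action):
    """Triggering: apply an action (e.g., raise an alert) to each filtered event."""
    return [action(event) for event in events]

# A hypothetical stream of sensor readings.
stream = [
    {"source": "sensor-1", "value": 42},
    {"source": "sensor-2", "value": 7},
    {"source": "sensor-1", "value": 99},
]

alerts = run_triggers(filter_events(stream, threshold=40),
                      lambda e: f"ALERT {e['source']}: {e['value']}")
print(alerts)  # ['ALERT sensor-1: 42', 'ALERT sensor-1: 99']
```

Because `filter_events` is a generator, events are evaluated one at a time as they arrive, which is the essential property that lets the business act on data in motion rather than after it lands.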
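Likewise, the parse/standardize/cleanse steps listed under data quality management can be sketched with a single field; phone numbers and the canonical format used here are illustrative assumptions.

```python
import re

def parse_digits(raw):
    """Parsing: extract just the digits from a free-form entry."""
    return re.sub(r"\D", "", raw)

def standardize(digits):
    """Standardization: render 10-digit numbers in one canonical format."""
    if len(digits) == 10:
        return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
    return None  # not enough information; flag for review rather than guess

def cleanse(values):
    """Cleansing: keep standardized values, set aside the rest for a steward."""
    good, review = [], []
    for value in values:
        std = standardize(parse_digits(value))
        if std:
            good.append(std)
        else:
            review.append(value)
    return good, review

good, review = cleanse(["312-555-0142", "(312) 555 0199", "555-0123"])
print(good)    # ['(312) 555-0142', '(312) 555-0199']
print(review)  # ['555-0123']
```

Note that the cleansing step routes ambiguous records to a review queue instead of silently repairing them, which is exactly the hand-off to data stewardship described under data governance.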
In my next post, we will look at the demand for analytics that incorporate a wide variety of data sources, including social media data and machine-generated data, as well as mixed-format content (such as documents and web sites containing text, images, and video). We will also examine how that demand drives the need for predictable and trustworthy information management.