Big data privacy: Four ways a data governance strategy supports security, privacy and trust
By Daniel Teachey, Insights editor
Do a web search for “big data” and you can find countless articles about “delivering value” from big data. While figuring out what to do with this data is important, a topic that isn’t quite as hot – but might be much more important – is big data privacy, which focuses on whether big data is protected in compliance with your organization’s existing standards.
Big data privacy falls under the broad spectrum of IT governance and is a critical component of your IT strategy. You need a level of confidence in how any data is handled to make sure your organization isn't at risk of a nasty, often public data exposure. That extends to privacy for all your data, including big data sets that are increasingly becoming part of the mainstream IT environment.
Privacy is also related to issues like data monetization. If the data that you have isn’t secure, high-quality or fit for purpose, can you trust the monetary value placed on that data? And, as the amount of data grows, do you have a strategy for larger privacy efforts, or big data privacy, in your organization?
Big data privacy vs. traditional data privacy standards
Of course, data privacy is not a new topic. By the 1970s, it was a recognized concern for issues such as medical records or financial information. In those early days, the first data privacy principles adopted what were often called “Fair Information Practices” (FIP).
The FIP efforts in organizations followed five tenets:
- Openness. There should be no systems for collecting personal data that are kept secret.
- Disclosure. Organizations should provide a way for individuals to learn what information is available and how it is used.
- Secondary usage. Information collected for one purpose should not be used for another purpose without the consent of the individual. (Note: this was the hardest to implement – and thereby became the least practiced tenet.)
- Correction. Individuals should have the ability to correct or amend erroneous information.
- Security. Any organization creating, maintaining, using or disseminating identifiable personal data must assure the data is being used correctly and must take precautions to prevent misuse.
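To make the secondary usage tenet concrete, here is a minimal sketch of a purpose check in Python. It is only an illustration, not a description of any particular product: the names `PersonalRecord`, `ConsentError` and `use_record`, and the purpose labels, are all hypothetical. The idea is that each record carries the purposes its owner agreed to, and any other use is refused.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalRecord:
    """Hypothetical record that remembers which purposes the individual consented to."""
    subject_id: str
    data: dict
    consented_purposes: set = field(default_factory=set)


class ConsentError(Exception):
    """Raised when data is requested for a purpose the individual did not consent to."""


def use_record(record: PersonalRecord, purpose: str) -> dict:
    """Release the data only if the requested purpose was consented to."""
    # Secondary usage rule: a purpose outside the consented set is refused.
    if purpose not in record.consented_purposes:
        raise ConsentError(
            f"no consent from {record.subject_id} for purpose '{purpose}'"
        )
    return record.data


# Illustrative data only: the customer agreed to order fulfilment, nothing else.
record = PersonalRecord(
    subject_id="cust-001",
    data={"email": "a@example.com"},
    consented_purposes={"order_fulfilment"},
)
```

With this sketch, `use_record(record, "order_fulfilment")` returns the data, while `use_record(record, "marketing")` raises `ConsentError` until the customer agrees to that purpose.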
With newer regulations like HIPAA and Sarbanes-Oxley, more organizations have a clearly defined business need to safeguard data privacy. This has led to an expansion beyond the tenets of a FIP approach.
As our domain of data has evolved, a new focus is tracking the source of information (also known as lineage). It’s also important to understand the quality of the actual information and the usage of the information as it pertains to personal privacy and industry compliance criteria. This gets more complicated as data becomes an asset both for the organization and the consumer.
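As a rough illustration of what tracking lineage can look like, the Python sketch below attaches a history of steps to a single value. The class names `LineageStep` and `TrackedValue`, and the system names in the example, are assumptions made up for this sketch. Real governance tools record lineage in far more detail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageStep:
    """One hop in the history of a piece of data (hypothetical structure)."""
    system: str
    action: str
    at: datetime


@dataclass
class TrackedValue:
    """A value that carries the list of systems it has passed through."""
    value: object
    lineage: list = field(default_factory=list)

    def record(self, system: str, action: str) -> "TrackedValue":
        # Append a timestamped step and return self so calls can be chained.
        self.lineage.append(LineageStep(system, action, datetime.now(timezone.utc)))
        return self

    def sources(self) -> list:
        # Systems in the order the value passed through them.
        return [step.system for step in self.lineage]


# Illustrative system names only.
email = (
    TrackedValue("a@example.com")
    .record("web_signup_form", "collected")
    .record("crm", "copied")
)
```

Here `email.sources()` lists the systems in order, so a privacy or compliance review can see where the value was first collected and where it has since been copied.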
The push toward self-service and the need for big data privacy
As with any other privacy or security issue, you must balance big data privacy issues against your business goals. Why do you collect and manage data in the first place? You’re typically using it to fuel an operational effort (supporting sales) or an analytical effort (learning who to sell to).
For e-commerce or online customer experiences, that data is more visible to the customer throughout their journey. As a result, the data can have a more direct impact on the bottom line. After all, without a good e-commerce experience, customers may choose to go elsewhere. Similarly, a poor online support program may lead to increased churn.
This “transparency” comes with some risk. More self-service interactions with customers mean you are collecting and packaging more information about them: their accounts, their purchases and their preferences. More data can lead to a better customer experience, but it also increases the risk of exposing personal or confidential information.
As a result, data and IT governance efforts are finding a new push as organizations begin to collect data for more public consumption. And business and IT, once mortal enemies (almost), are now realizing that data is everyone’s responsibility.
Preparing for privacy in a big data world
When planning big data privacy efforts, a starting point is to understand the sources of data and how this data is used. As we all know, that conversation rapidly goes in the direction of how we should or should not use or exploit the data.
However, there is a tendency to avoid the delicate subject of how to support the privacy of the individual and how to protect data in an increasingly digital world. The complicating factor is how to keep a balance between:
- The value to end users.
- The level of privacy and protection that's necessary for both you and your customer.
This issue has to be addressed if you want your digital business practices to be seen as credible to your customers.
Strike a balance: A best practice checklist
The recommended approach for clarifying these concerns is to blend your business rules and IT rules. If you can accomplish this collaborative effort through the use of governance solutions to establish a big data privacy framework within your IT environment, then all the better.
Here are a few data governance best practices as they relate to big data privacy:
- Define what data governance means – to your company and to your project. When it comes to big data, you don’t need to develop a separate data governance program or framework. You just need a data governance program and framework that support big data.
- Know your culture. One size does not fit all. Some organizations are better suited for a top-down governance approach, while others will work better from the bottom up.
- Design your data governance framework. Identify the “what” and “how” before specifying the “who.” Make use of existing committees and processes.
- Treat data governance as a long-term program. Implement it as a series of tightly scoped initiatives. Plan for the activities and resources required to execute and maintain governance policies.
Organizations that already embrace centralized or shared services that are integrated with functional business processes will have a less difficult path than those starting from scratch. However, the effort to establish meaningful and sustainable data governance and management will still:
- Require a business context considered relevant and valuable to the end users.
- Involve mistakes that may require multiple attempts before results are sustained.
- Depend on a determined commitment to achieve the vision of data as a corporate asset, and a willingness to learn from mistakes and try again.
Of course, there will always be debates, both within an organization and in the market overall, around governance, security and trust. Regardless of the details, it’s vital to have a big data privacy effort in place. These steps should be addressed during the design and implementation process – and as part of reviews and proof-of-concept trials – to make sure they fit in your big data environment.