Now that we have seen why organizations need Sensitive Data Intelligence and what benefits it offers, let’s walk through the adoption process step by step.
The first step in Sensitive Data Intelligence is to build a catalog of all cloud-native and non-native data assets. These assets may reside in SaaS applications, in structured and unstructured IaaS data stores across multiple cloud providers, or on-premises.
The following steps help organizations build their comprehensive data asset catalog:
Every data asset carries metadata that can be classified into business, technical, and security categories. Sensitive Data Intelligence provides native connectors and REST-based APIs to scan each asset and extract all three kinds of metadata. With this metadata, organizations can determine how their Personally Identifiable Information (PII), Protected Health Information (PHI), and similar sensitive data are protected and governed.
Once all assets and their metadata have been cataloged, the next step is to enrich these assets with insights about the sensitive data stored in them. Sensitive data is a specific subset of personal data that requires more protection than other data types. Since sensitive data needs to be protected and managed separately from other kinds of personal data, it is paramount for organizations to detect and identify all sensitive data stored in their data assets.
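To make the three metadata categories concrete, here is a minimal sketch of a catalog entry. The class names, field names, and the `encrypted` flag are illustrative assumptions, not part of any product API; the point is only that grouping metadata this way lets you ask simple governance questions of the catalog.

```python
from dataclasses import dataclass, field


@dataclass
class AssetMetadata:
    """Metadata for one data asset, grouped into the three categories."""
    business: dict = field(default_factory=dict)   # e.g. owner, department
    technical: dict = field(default_factory=dict)  # e.g. store type, region
    security: dict = field(default_factory=dict)   # e.g. encryption, public access


@dataclass
class DataAsset:
    name: str
    location: str  # "saas", "iaas", or "on_premises" (hypothetical values)
    metadata: AssetMetadata = field(default_factory=AssetMetadata)


def unencrypted_assets(catalog):
    """Return names of assets whose security metadata does not report encryption."""
    return [a.name for a in catalog
            if not a.metadata.security.get("encrypted", False)]


# Hypothetical catalog with two assets.
catalog = [
    DataAsset("crm-contacts", "saas",
              AssetMetadata(business={"owner": "sales"},
                            technical={"type": "SaaS application"},
                            security={"encrypted": True})),
    DataAsset("hr-archive", "on_premises",
              AssetMetadata(business={"owner": "hr"},
                            technical={"type": "file share"},
                            security={"encrypted": False})),
]

print(unencrypted_assets(catalog))  # ['hr-archive']
```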
Let’s look into some of the types of personal and sensitive data.
Since personal and sensitive data is distributed across hundreds of data assets, the process of finding specific data attributes can be highly complex and time-consuming. Sensitive Data Intelligence helps organizations find specific data attributes within minutes across all structured and unstructured data stores. It also allows organizations to detect unique attributes that have specific requirements under global privacy laws.
This step involves detecting sensitive data in structured and unstructured data stores, using built-in or custom data attributes, via a comprehensive detection engine. It has the following components:
Sensitive Data Intelligence detects sensitive files in unstructured data stores and sorts them into coarse-grained and fine-grained document categories such as academia, legal, financial, human resources, and more. Document types range from research papers, medical consent forms, insurance forms, tax forms, and financial statements to custom types that are proprietary to a specific organization. They can contain sensitive information such as social security numbers, credit card numbers, driver’s license numbers, and more. Sensitive Data Intelligence leverages various purpose-built AI/ML techniques to achieve highly accurate, fine-grained document classification.
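As a rough illustration of attribute detection in unstructured text, the sketch below scans a string with a few regular expressions and accepts caller-supplied custom attributes. It is deliberately simpler than the AI/ML techniques described above: the patterns, the attribute names, and the `employee_badge` format are assumptions made for the example, and a production detector would need far more context and validation.

```python
import re

# Illustrative built-in attributes. These patterns are assumptions for the sketch.
BUILT_IN_ATTRIBUTES = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to discard digit runs that cannot be card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect(text, custom_attributes=None):
    """Return (attribute, matched_text) pairs found in `text`."""
    patterns = {**BUILT_IN_ATTRIBUTES, **(custom_attributes or {})}
    findings = []
    for name, pattern in patterns.items():
        for match in pattern.finditer(text):
            value = match.group()
            if name == "credit_card" and not luhn_valid(value):
                continue  # looks like a card number but fails the checksum
            findings.append((name, value))
    return findings


# Hypothetical document text and a custom, organization-specific attribute.
text = ("Applicant SSN: 123-45-6789. Card on file: 4111 1111 1111 1111. "
        "Badge EMP-004211.")
custom = {"employee_badge": re.compile(r"\bEMP-\d{6}\b")}
findings = detect(text, custom)
print(findings)
```

The Luhn check shows why pattern matching alone is rarely enough: a second signal filters out digit runs that merely look like card numbers.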
Sensitive Data Intelligence leverages various AI/ML techniques that fuse numerous signals to provide highly accurate column classifications across structured data stores. This enables organizations to visualize all the sensitive data discovered in any of their structured data stores. It involves searching for data elements across all structured data systems, down to specific databases, tables, and columns, using powerful filters. These techniques apply automatically to custom data types and to structured files such as CSV and Avro.
Once sensitive data has been discovered in structured and unstructured data stores, the next step in Sensitive Data Intelligence is to enhance the sensitive data with automated classification and tagging. Sensitive Data Intelligence leverages machine learning and natural language processing to deliver highly accurate auto-classification of datasets and data labeling. An extensible policy framework is used to automatically apply sensitivity labels and metadata to files and documents for various use cases.
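The idea of fusing signals for column classification can be sketched with two simple signals: how many sampled values match a pattern, and whether the column name hints at the attribute. The weights, the threshold, and the attribute definitions below are assumptions chosen for illustration; they stand in for the AI/ML techniques the text describes and do not reproduce them.

```python
import csv
import io
import re

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
SSN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

# Each attribute combines a value pattern with column-name hints (assumed values).
ATTRIBUTES = {
    "email": (EMAIL, ("email", "mail")),
    "us_ssn": (SSN, ("ssn", "social")),
}


def classify_columns(rows, threshold=0.6, name_weight=0.3):
    """Return {column: (attribute, score)} for columns scoring at or above threshold."""
    if not rows:
        return {}
    result = {}
    for column in rows[0].keys():
        values = [r[column] for r in rows if r[column]]  # skip empty cells
        best = None
        for attr, (pattern, hints) in ATTRIBUTES.items():
            matches = sum(bool(pattern.match(v)) for v in values)
            ratio = matches / len(values) if values else 0.0
            bonus = name_weight if any(h in column.lower() for h in hints) else 0.0
            score = (1 - name_weight) * ratio + bonus
            if score >= threshold and (best is None or score > best[1]):
                best = (attr, round(score, 2))
        if best:
            result[column] = best
    return result


# Hypothetical CSV extract; note that "tax_ref" has no helpful column name.
data = """id,contact_email,tax_ref,notes
1,ana@example.com,123-45-6789,ok
2,bob@example.org,987-65-4321,call back
3,n/a,555-12-3456,
"""
rows = list(csv.DictReader(io.StringIO(data)))
classified = classify_columns(rows)
print(classified)  # flags contact_email as email and tax_ref as us_ssn
```

Here `tax_ref` is classified from its values alone, while `contact_email` tolerates one non-matching value because the column name adds supporting evidence.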
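A policy framework for labeling can be pictured as a list of rules, each mapping a set of detected attributes to a label. The sketch below is one possible shape under that assumption; the policy names, attribute names, and label strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelPolicy:
    name: str
    required_attributes: frozenset  # all of these must be detected in the file
    label: str


# Hypothetical policies; extend the list to cover additional use cases.
POLICIES = [
    LabelPolicy("payment-data", frozenset({"credit_card"}), "Confidential-PCI"),
    LabelPolicy("health-record", frozenset({"patient_name", "diagnosis"}),
                "Restricted-PHI"),
]


def apply_labels(detected_attributes, policies=POLICIES):
    """Return the sorted labels of every policy whose requirements are all met."""
    detected = set(detected_attributes)
    return sorted(p.label for p in policies if p.required_attributes <= detected)


print(apply_labels({"credit_card", "email"}))
# ['Confidential-PCI']
print(apply_labels({"patient_name", "diagnosis", "credit_card"}))
# ['Confidential-PCI', 'Restricted-PHI']
```

Keeping policies as data rather than code is what makes such a framework extensible: adding a use case means adding a rule, not changing the matcher.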
The following steps help you achieve this:
Once your asset catalogs have been enriched, the next step is to manage your security posture across your multi-cloud data assets, SaaS applications, and on-premises environments to ensure your data environment is secure.
Sensitive asset and data posture management helps organizations gain visibility into their data assets and monitor their configuration while ensuring adequate security settings. Organizations can scope configuration settings based on the sensitivity of the data an asset holds. For example, public access must be disabled for data containing confidential information, whereas data containing an organization’s website materials should remain publicly accessible. Applying security settings selectively, based on the data’s sensitivity, also lowers cost and management overhead. For example, enabling AWS CloudTrail or server access logs on all data is unnecessary and expensive when the organization may only need them on regulated data for compliance audits.
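The two examples above translate naturally into sensitivity-scoped rules. The following sketch assumes a three-level sensitivity scale and two boolean settings per asset; the field names and levels are illustrative and not tied to any cloud provider's API.

```python
def evaluate_posture(asset):
    """Return a list of findings for one asset, scoped by its sensitivity."""
    findings = []
    sensitivity = asset["sensitivity"]  # "public", "confidential", or "regulated"
    # Public access is only acceptable for data meant to be public.
    if sensitivity in ("confidential", "regulated") and asset["public_access"]:
        findings.append("public access must be disabled")
    # Access logging is required only where compliance audits need it.
    if sensitivity == "regulated" and not asset["access_logging"]:
        findings.append("access logging must be enabled")
    return findings


# Hypothetical assets.
assets = [
    {"name": "customer-exports", "sensitivity": "regulated",
     "public_access": True, "access_logging": False},
    {"name": "website-assets", "sensitivity": "public",
     "public_access": True, "access_logging": False},
]

report = {a["name"]: evaluate_posture(a) for a in assets}
print(report["customer-exports"])
# ['public access must be disabled', 'access logging must be enabled']
print(report["website-assets"])
# []
```

The same settings produce two findings on the regulated asset and none on the website bucket, which is the cost and overhead saving that scoping is meant to deliver.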
Dynamic enterprise environments require continuous data discovery scans to ensure regular security posture monitoring and compliance. Sensitive Data Intelligence provides the ability to monitor sensitive assets and data posture continuously. It also enables auto-remediation to resolve security risks as soon as they are detected.
As a result of these processes, an organization can automate security and privacy controls. Once an organization has gained visibility into its data security posture, it can discover gaps in its security controls and orchestrate appropriate measures to close them.
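One simple way to think about continuous monitoring is as drift detection: each scan is compared against an approved baseline, and drifted settings are reset. The sketch below works on plain dictionaries and assumes remediation means restoring the baseline value; a real system would call the relevant platform's configuration API instead, which is out of scope here.

```python
def detect_drift(baseline, current):
    """Return {asset: {setting: (expected, actual)}} for settings that differ."""
    drift = {}
    for asset, expected in baseline.items():
        actual = current.get(asset, {})
        changed = {key: (value, actual.get(key))
                   for key, value in expected.items()
                   if actual.get(key) != value}
        if changed:
            drift[asset] = changed
    return drift


def remediate(current, drift):
    """Return a copy of `current` with drifted settings reset to baseline values."""
    fixed = {asset: dict(settings) for asset, settings in current.items()}
    for asset, changed in drift.items():
        fixed.setdefault(asset, {})
        for setting, (expected, _actual) in changed.items():
            fixed[asset][setting] = expected
    return fixed


# Hypothetical baseline and the result of the latest scan.
baseline = {"payroll-db": {"public_access": False, "encrypted": True}}
latest_scan = {"payroll-db": {"public_access": True, "encrypted": True}}

drift = detect_drift(baseline, latest_scan)
print(drift)  # {'payroll-db': {'public_access': (False, True)}}
print(remediate(latest_scan, drift) == baseline)  # True
```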
With a data glut, it becomes challenging for organizations to determine which data poses the most significant security and privacy risks. However, for continuous data security and compliance, organizations need to understand the inherent risk of their data. Without a clear understanding of data risks, an organization may misallocate budgets and resources for risk mitigation activities and security controls.
This step provides an executive summary of an organization’s data risks in the form of a data risk graph. It provides a single numerical figure depicting the overall data risk.
This step has the following processes:
The data risk graph provides a numeric risk-centric view of sensitive and personal data in an organization’s environment with a clear breakdown of various risk contributors. This step enables organizations to review data risk at an aggregate global level or a granular level for each data store, location, personal data attribute, or data subject’s residence. In addition, they can track changes in global data risk over time and identify high-risk activities.
The data risk graph highlights and ranks data risk by data asset, location, owner, and personal data type, enabling organizations to prioritize and direct security budgets and resources towards high-risk areas. Organizations can also record historical data risk scores to track how risk scores improve or deteriorate over time.
Organizations can customize and configure data risk scores using simple knobs to indicate the sensitivity of various factors. They can also set sensitivity levels based on data types, the location of the data, the residencies of data subjects, and data concentrations.
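The text does not specify how the risk score is computed, so the following is only a sketch under an assumed formula: each data type has a weight, record counts provide the concentration, and the location acts as a multiplier. All weights and asset values are hypothetical; the `weights` argument plays the role of the configurable knobs.

```python
# Assumed knobs: per-type weights and per-location multipliers.
DEFAULT_WEIGHTS = {
    "data_type": {"ssn": 10, "credit_card": 8, "email": 2},
    "location": {"eu": 1.5, "us": 1.0},
}


def asset_risk(asset, weights=DEFAULT_WEIGHTS):
    """Risk = sum(type weight * record count) * location multiplier."""
    base = sum(weights["data_type"].get(data_type, 1) * count
               for data_type, count in asset["counts"].items())
    return base * weights["location"].get(asset["location"], 1.0)


def rank_assets(assets, weights=DEFAULT_WEIGHTS):
    """Return (name, risk) pairs, highest risk first."""
    scored = [(a["name"], asset_risk(a, weights)) for a in assets]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical assets with record counts per personal data type.
assets = [
    {"name": "marketing-list", "location": "us", "counts": {"email": 5000}},
    {"name": "payroll-db", "location": "eu",
     "counts": {"ssn": 1200, "email": 1200}},
]

ranking = rank_assets(assets)
overall = sum(score for _name, score in ranking)
print(ranking)  # [('payroll-db', 21600.0), ('marketing-list', 10000.0)]
print(overall)  # 31600.0
```

The ranking supports prioritization, while the summed figure corresponds to the single overall number described earlier. Storing `overall` per scan date would give the historical trend.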
This step is the final stage of Sensitive Data Intelligence and is paramount in ensuring compliance with global privacy laws. It involves building a People-Data-Graph to map personal data to its correct owners, i.e., customers, users, employees, and other individuals. The People-Data-Graph links each individual to their personal data across all connected systems. It comes with an easy-to-use conversational interface.
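At its simplest, a graph between individuals and their data can be represented as a mapping from a person identifier to the places where that person's data lives. The sketch below assumes that discovery has already produced records carrying a resolved `person_id`; the identifiers, system names, and record shape are hypothetical, and real identity resolution is considerably harder.

```python
from collections import defaultdict


def build_people_data_graph(records):
    """Map each person identifier to the (system, attribute) pairs holding their data."""
    graph = defaultdict(set)
    for record in records:
        graph[record["person_id"]].add((record["system"], record["attribute"]))
    return graph


# Hypothetical discovery output across three connected systems.
records = [
    {"person_id": "cust-17", "system": "crm", "attribute": "email"},
    {"person_id": "cust-17", "system": "billing", "attribute": "credit_card"},
    {"person_id": "emp-03", "system": "hr", "attribute": "ssn"},
]

graph = build_people_data_graph(records)
print(sorted(graph["cust-17"]))
# [('billing', 'credit_card'), ('crm', 'email')]
```

With this structure, finding everything held about one individual becomes a single lookup rather than a search across every system.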
This step has the following processes: