What Is Data Loss Prevention (DLP)? Definition, Policy Framework, and Best Practices
Learn what data loss prevention is and how to formulate an effective DLP policy to safeguard your company amidst growing cyberattacks.
What Is Data Loss Prevention (DLP)?
Data loss prevention (DLP) is defined as a set of policies and technologies that together aim to ensure the protection of an organization’s data. In this article, we explain what data loss prevention is, its core concepts, and share best practices for implementation.
To expand further, data loss prevention is:
- Proactive: It assumes that, despite the security measures and cybersecurity tools in place, data might still be exposed inadvertently. For example, a vulnerability might be introduced as part of a routine application update.
- Technology-enabled: A manual investigation of the enterprise content landscape would take months or even years to complete. Automation lets you scan content assets and data exchange channels regularly to detect anomalies. These anomalies are investigated further to check if they can cause data exposure.
- Holistic: DLP aims to protect companies from both accidental exposure and deliberate attacks. In Capital One’s 2019 breach, a misconfigured firewall allowed a former employee of its cloud provider to break into its systems and steal customer data records. Meanwhile, First American’s exposure was entirely accidental. DLP serves as a defense mechanism against both variants of cyber risk.
According to Statista, there were 1,506 cyberattacks in the US in 2019, exposing more than 168 million records. Even before the Covid-19 lockdowns pushed more organizations to digital-only business operations, companies had already started to invest in DLP solutions. The market stood at USD 1.4 billion in 2020 and is projected to reach USD 2.28 billion within three years, an increase of roughly 63%. This is a clear indicator of rapidly rising awareness of online security breaches as more companies operate online.
To strengthen DLP, companies need to map out the channels (or vectors) most frequently targeted by threat actors. A vector is any path that an unauthorized party might use to enter your systems. A legacy format such as data stored on a floppy disk has minimal attack vectors. But the more advanced, or connected, your systems are, the more vectors you will have.
That’s why enterprises must watch out for these common vectors that could cause data loss:
- Cloud – “At this point (in 2020), cloud adoption is mainstream,” said Sid Nag, research VP at Gartner. Enterprises on the cloud use shared resources that are more open to risk than on-premises servers. This is compounded by the fact that companies aren’t investing in cloud security at the same pace as in cloud hosting. Gartner predicts that through 2020, 95% of cloud security failures will be due to the customer’s negligence, which has major repercussions for data loss prevention.
- Websites – Websites are commonly associated with inadvertent data breaches, with confidential information making its way to a public page without proper authorization. Enterprises use a mix of public and private web pages, protected by role-based access, to collaborate and stay productive. DLP mechanisms for your website are essential for secure and uninterrupted interactions.
- Networks – Data loss via networks can happen through Bluetooth, Wi-Fi, or even your internet service provider (ISP). As data moves to and from these different channels, it is vulnerable to exploitation. In May 2020, a group of researchers found a security vulnerability in Bluetooth that could allow a hacker to impersonate a previously paired device. Network-based vulnerabilities like these can go undetected for a long time, which is why DLP is so critical.
- Servers – Servers may be less exposed to data loss vectors than the cloud, but they come with their own risks. Recently, a threat group successfully implanted a key into the domain controller servers of semiconductor manufacturers that would unlock confidential IP data: source code, chip designs, software development kits, etc. Sophisticated attacks can bypass cybersecurity mechanisms, and DLP helps to highlight them before it is too late.
- Chat – Employees increasingly use chat/instant messaging to communicate with colleagues. Often, this extends to non-work channels like WhatsApp, WeChat, or personal Google Hangouts. Chat lets users share links, send files, and exchange raw information, requiring extensive log keeping.
- Mobile and edge devices – You can distinguish three broad types of vectors: the core (servers, the cloud), the perimeter (Bluetooth, Wi-Fi, ISP, SDN), and the edge. The last one covers all devices that are connected to your network but aren’t part of the enterprise device ecosystem. Trends like BYOD and WFH have increased the use of smartphones, tablets, and personal laptops for work, and this requires revisiting your DLP policies. Currently, 84% of IT leaders find it harder to prevent data loss when employees work from home.
- Email – Email continues to be a common threat vector, as malicious actors might send emails to elicit a response, prompt an attachment download, or motivate the recipient to forward the message across the company. As per The State of DLP 2020 report by Tessian, nearly 800 misdirected emails are sent every year in large organizations, making accidental exposure by email a major concern.
Stages of Data Loss Prevention (DLP)
Before we dive into the policies and best practices to intercept and prevent data loss at large, let us first look at the three broad stages at which DLP can take place:
- DLP in transit
Data can be routed to an unauthorized platform or insecure storage area when moving from one application to another. A hacker might redirect data in transit outside of the organization and use it for their benefit.
An effective way of preventing this is via encryption – data is encrypted before it can exit an application. It stays encrypted all the way so that a hacker can’t make sense of the data even if they capture it. A decryption engine at the destination translates the data back into a comprehensible format.
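As one hedged illustration of encrypting data before it leaves an application, the sketch below uses Python's standard `ssl` module to build a client-side TLS context that rejects unverified servers and old protocol versions. TLS is only one way to protect data in transit and is an assumption here, not something this article prescribes. The helper name `build_strict_tls_context` is hypothetical.

```python
import ssl


def build_strict_tls_context() -> ssl.SSLContext:
    """Return a client TLS context for sending sensitive data."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_strict_tls_context()
```

An application would then wrap its outgoing socket with `context.wrap_socket(sock, server_hostname=...)`, so that the payload is encrypted on the wire and only decrypted by the intended destination.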
- DLP at rest
Data could rest on premises (in servers), on the cloud, or on endpoint storage media such as USB drives. DLP defines business rules to set permissions, allow access, and raise flags depending on the user’s persona, behavior, and channel of access. DLP for data at rest is critical for compliance, as any customer data you store may fall under GDPR, CCPA, and similar data privacy regulations.
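To make the idea of business rules keyed on persona, behavior, and channel more concrete, here is a minimal sketch. The persona names, the blocked channel, and the bulk-access threshold are all hypothetical values chosen for illustration; a real DLP product would express such rules in its own policy language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    persona: str          # e.g. "hr_manager", "contractor"
    channel: str          # e.g. "corporate_laptop", "usb"
    files_last_hour: int  # crude behavioral signal


# Hypothetical policy values, for illustration only.
ALLOWED_PERSONAS = {"hr_manager", "finance_analyst"}
BLOCKED_CHANNELS = {"usb"}
BULK_ACCESS_THRESHOLD = 50


def evaluate(request: AccessRequest) -> str:
    """Return "deny", "flag", or "allow" for a request to stored data."""
    if request.persona not in ALLOWED_PERSONAS:
        return "deny"
    if request.channel in BLOCKED_CHANNELS:
        return "deny"
    if request.files_last_hour > BULK_ACCESS_THRESHOLD:
        return "flag"  # permitted, but raised for human review
    return "allow"
```

Keeping "flag" separate from "deny" reflects the point above that DLP can raise flags without blocking legitimate work outright.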
- DLP in use
DLP for data in use could prevent a large number of the deliberate attacks happening today. It relies on a strict authentication protocol to determine who can use data, from which platform, and for which purpose. This type of DLP involves real-time, ongoing measures to analyze how data is currently being used by a user, application, or endpoint device.
Policy Framework for DLP in 2021
It’s a common misconception that DLP is all about technology. To prevent data loss, you need a concrete strategy that is enabled by technology but doesn’t depend on it alone. Ideally, companies need a three-point data loss prevention policy framework that couples technology intervention with well-defined process flowcharts to take care of all projected DLP scenarios.
1. Planning
The first step is to obtain a high-level understanding of your data landscape. What type of data are you working with? Which regulatory jurisdiction does it fall under? Asking these questions will hint at possible vulnerabilities and how to address them.
- Data classification and risk assessment
Conduct a thorough data inventory and classify both structured and unstructured datasets. Your categories could be personally identifiable information (both employees and customers), transaction data, corporate data (including IP), etc. Assign a risk score to each category based on the cost of exposure and the possibility of getting exposed.
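A simple way to turn "cost of exposure" and "possibility of getting exposed" into a score is to rate each on a small scale and multiply them. The sketch below assumes a 1 to 5 scale for both, and the category ratings are made-up examples, not benchmarks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCategory:
    name: str
    exposure_cost: int        # 1 (negligible) to 5 (severe)
    exposure_likelihood: int  # 1 (rare) to 5 (almost certain)

    @property
    def risk_score(self) -> int:
        # Classic likelihood-times-impact scoring.
        return self.exposure_cost * self.exposure_likelihood


# Illustrative ratings only.
categories = [
    DataCategory("customer PII", 5, 3),
    DataCategory("transaction data", 4, 2),
    DataCategory("corporate IP", 5, 2),
    DataCategory("public marketing copy", 1, 4),
]

ranked = sorted(categories, key=lambda c: c.risk_score, reverse=True)
for category in ranked:
    print(f"{category.name}: {category.risk_score}")
```

With these ratings, customer PII scores 15 and lands at the top of the list, so it would be the first category to receive stricter controls.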
- Data in motion, rest, and use mapping
Chalk out an end-to-end blueprint of data hotspots and movement channels across the core, perimeter, and edge. Be careful to look out for hidden applications and previously unmapped correlations so that no stone is left unturned. Refresh the blueprint at regular intervals and after any major event (e.g., new customer onboarding or a digital transformation program) to keep it up to date.
- Regulatory body collaborations
In a connected world, complex jurisdiction overlap can be difficult to navigate. For example, a company based out of Australia might require GDPR compliance if it caters to even a single EU customer. Work closely with your local regulatory authorities and partner with a legal expert to understand your data obligations.
2. Implementation
This next step will translate your high-level understanding to enforceable policies carried out by managers, employees, ex-employees, and third-party partners. Technology plays a crucial role in DLP implementation, reducing manual efforts dramatically.
- Sanitization, redaction, and retiring
This step could prevent the lion’s share of data breaches that make headlines. Data redaction removes sensitive elements from reports and other digital assets before they are shared more widely. Sanitization means anonymizing data before it is used, such as cleaning up test data for application QA.
Finally, ensure that you retire data as per compliance norms. It is a good idea to specify a storage threshold beyond which all data (especially customer records) will be retired.
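As a rough illustration of redaction, the snippet below replaces email addresses and US-style Social Security numbers with placeholder tokens before a document is shared. The two patterns are deliberately simplified assumptions; production redaction needs far broader coverage and review.

```python
import re

# Simplified patterns for illustration; real data needs broader coverage.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Replace email addresses and US-style SSNs with placeholder tokens."""
    text = EMAIL_PATTERN.sub("[REDACTED EMAIL]", text)
    return US_SSN_PATTERN.sub("[REDACTED SSN]", text)


report = "Contact jane.doe@example.com about claim 123-45-6789."
print(redact(report))
# prints: Contact [REDACTED EMAIL] about claim [REDACTED SSN].
```

The same function could be applied to production records before they are copied into a QA environment, which is the sanitization case described above.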
- Data exchange permissions and controls
Use identity and access management (IAM) to map the various user personas active at your company. Based on each persona and the user’s behavioral patterns, you can grant access to a digital asset, monitor activities, and flag any suspicious behavior.
- Data access logs analysis
All traffic on your websites, interactions with digital assets, and communications between users inside a perimeter are recorded in data logs. DLP technology automates log analysis and highlights any anomaly deserving human investigation. Remember to maintain logs even for authorized access, as these can be useful during compliance audits.
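Automated log analysis can be as elaborate as a machine learning pipeline or as simple as a statistical outlier check. The sketch below takes the simple route and flags users whose download count sits far above the group average. The event format, the `flag_unusual_users` name, and the threshold of two standard deviations are assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pstdev


def flag_unusual_users(events, threshold=2.0):
    """Return users whose download count is far above the group average.

    `events` is an iterable of (user, action) tuples. `threshold` is the
    number of standard deviations above the mean that counts as unusual.
    """
    downloads = Counter(user for user, action in events if action == "download")
    if len(downloads) < 2:
        return []
    counts = list(downloads.values())
    average, spread = mean(counts), pstdev(counts)
    if spread == 0:
        return []  # everyone behaves identically, nothing stands out
    return sorted(
        user for user, n in downloads.items() if (n - average) / spread > threshold
    )


# Six users with two downloads each, one user with forty.
events = [(f"user{i}", "download") for i in range(6) for _ in range(2)]
events += [("mallory", "download")] * 40
events += [("user0", "view")] * 100  # non-download actions are ignored

print(flag_unusual_users(events))  # prints: ['mallory']
```

A flagged user is only a candidate for human investigation, not proof of wrongdoing, which matches the division of labor described above.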
3. Maintenance
Without proper maintenance, your data loss prevention policy framework will struggle to go the distance. Investing in maintenance means your implementation measures remain steady (and relevant) years after the original conceptualization.
- Device and user management
A lot of enterprise content is stored on endpoint devices, like a senior leader’s laptop or an HR manager’s smartphone. Robust device management will keep all these devices equipped with the latest security patches and software upgrades. Make sure to tie up any loose ends when you retire any device with storage capabilities, even a printer, and when severing ties with a user.
- Data ecosystem architecture refresh
As a business grows or evolves, it will bring in new systems that can act as fresh vectors for data loss. Keep an eye out for “information sprawl,” where data creeps into unstructured systems like SharePoint files, old document versions, etc. A quarterly architecture refresh will give you an updated blueprint without any gaps.
- Data leakage event response
Sometimes, sensitive data can find its way into the public domain despite the most stringent measures. For such scenarios, you must have a response strategy to mitigate damage to the brand reputation as much as possible. Work with marketing and PR on response policies, with active involvement from regulatory authorities.
Ultimately, the framework will depend on the nature of your organization and where you operate. For example, a healthcare service provider might want to put in an additional step or two during planning – assessing every application individually for vulnerabilities. But no matter which framework you choose, some best practices can make or break its success.
Top 8 Best Practices for Implementing Data Loss Prevention (DLP) in 2021
Now that you know what DLP is, the three stages at which it applies, and the policy framework for planning, implementation, and maintenance, let’s look at the practices that make that framework work.
Simple DLP best practices can strengthen the impact of your data loss prevention policies and make them easier to enforce. Here’s what we recommend for 2021.
1. Provide security awareness training to employees.
Security awareness training fosters a culture of skepticism, where employees are encouraged to raise mental red flags whenever they spot suspicious behavior. The goal is to “err on the side of caution,” as they say, highlighting any anomalies no matter how harmless they might be.
Some regions make security training mandatory if you employ a particular number of workers or operate in an industry that deals with confidential information. For example, House Bill 3834 of 2019 in Texas requires all state agencies, contractors, and local government entities to undergo exhaustive cybersecurity awareness training.
2. Follow a “least privilege by default” policy.
This policy means that every user has the lowest level of privilege with which they can do their job. An employee can access only those assets relevant to their job role, and no more. For anything else, they would have to explicitly ask for permission and get it vetted by a security officer. “Least privilege by default” also applies to external users such as a regulatory body conducting routine audits, a procurement partner, a reseller, and the like.
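In code, least privilege usually comes down to a default-deny check: access is refused unless a grant says otherwise. The following sketch assumes a hypothetical role-to-asset table and a set of exceptions that a security officer has approved; none of these names come from a specific IAM product.

```python
# Hypothetical role-to-asset grants; anything not listed is denied.
GRANTS = {
    "payroll_clerk": {"payroll_db"},
    "external_auditor": {"audit_reports"},
}

# Explicit exceptions vetted by a security officer, stored as (role, asset).
approved_exceptions = set()


def can_access(role: str, asset: str) -> bool:
    """Default-deny check: access needs a grant or an approved exception."""
    return asset in GRANTS.get(role, set()) or (role, asset) in approved_exceptions


print(can_access("payroll_clerk", "payroll_db"))     # prints: True
print(can_access("payroll_clerk", "audit_reports"))  # prints: False
```

Note that the external auditor is handled by the same table as internal staff, and that an unknown role simply gets no access at all.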
3. Contractualize your data loss prevention policy.
Weave DLP into your contractual terms and SLAs. This establishes a clear line of accountability that can come in handy when investigating an exposure, providing a public response, or determining penalties. For example, the onus for protecting data on the cloud can be shared by the customer and the vendor – terms need to be specified in your SLAs.
Also, every employment contract, whether for a full-time employee or a consultant, should carry a DLP clause. This will lay down the degree of data access, the extent to which the employee can legally share data, the approved services for data access, and coverage terms that extend beyond the employee’s tenure.
4. Use multiple content analysis techniques.
Content analysis forms the crux of DLP technology. It analyzes content to classify it into groups, assess its adherence to security policies, gauge risk, and alert the appropriate stakeholders. An important step is to follow multiple content analysis techniques so that you don’t limit yourself. For example, rule-based analysis scans a document against known patterns like 16-digit credit card numbers, while partial document matching checks if the same data resides in multiple documents.
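The rule-based technique mentioned above can be sketched in a few lines: a regular expression finds 16-digit sequences, and a checksum filters out strings that merely look like card numbers. Pairing the pattern with the Luhn checksum is a choice made for this example to cut false positives; it is not a description of how any particular DLP product works.

```python
import re

# Matches 16 digits, optionally grouped in fours by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def passes_luhn(digits: str) -> bool:
    """Luhn checksum, used here to weed out random 16-digit strings."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidate card numbers (digits only) that pass the Luhn check."""
    candidates = (re.sub(r"[ -]", "", m.group()) for m in CARD_PATTERN.finditer(text))
    return [number for number in candidates if passes_luhn(number)]


sample = "Card 4111-1111-1111-1111 on file; order ref 1234 5678 9012 3456."
print(find_card_numbers(sample))  # prints: ['4111111111111111']
```

The order reference in the sample has the right shape but fails the checksum, so only the well-known test card number is reported. Partial document matching would need a different approach, such as comparing overlapping text fragments between documents.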
5. Find and protect IP you didn’t know you had.
Intellectual property (IP) isn’t always clearly documented. This can be a problem for startups and growing companies whose entire business is built on a set of game-changing ideas. IP might pass under the radar, exchanged via chat, stored in screenshots, hidden in call recordings, etc.
An exhaustive content audit focused on IP will reveal such valuable information snippets. You can also leverage context-based classification, using a content’s surrounding context to find IP even if it isn’t explicitly labeled. Moving forward, efficient data loss prevention policies can reduce the chances of intellectual property going undocumented.
6. Be transparent about any loss or breach.
Here is a good rule of thumb when it comes to compliance: if you don’t report it, someone else will. For both Capital One and First American, the breach was reported by a good Samaritan, which added to the charges of negligence. That’s why your DLP policy framework must include a leakage event response plan, clearly outlining:
- The timeline for the first response
- Response owner and the stakeholder who will take public queries
- A dedicated fund for initial remediation and penalties
Once again, collaborating with local regulatory bodies can prove useful and earn goodwill even amidst a possibly adverse event.
7. Keep unamicable separations to a minimum.
It can be challenging to ensure compliance even while an employee is part of your organization; imagine the complexities once they have left after an unamicable separation. According to a 2020 Tessian research report, 45% of US employees admit to downloading, saving, or sending work-related documents before leaving or after being dismissed from a job, which creates a series of data loss vulnerabilities. Security teams can work with HR to mediate a separation or severance event, reinforced by pre-existing contractual terms.
8. Monitor, monitor, and monitor.
This can be harder than it sounds: 51% of employees were impeded by security tools and software during their work. As a result, over half of them found a workaround, which meant that data passed outside the approved channels.
As part of your DLP implementation, security teams can analyze the impact on employee productivity and build the requisite workarounds themselves so that the monitoring isn’t hampered. Real-time data exchange monitoring, frequent log analysis, and adequate checks (before an event) will nip data loss in the bud. But make sure that you obtain the requisite consent and contractual agreements at the outset.
Policies for data loss prevention have been around since the late 1970s and early 1980s, ever since the rise of information exchange at scale. However, in 2021, DLP takes on a wholly new definition. While DLP technically covers both data loss (a company loses access to historical data without any negative repercussions) and malicious threats, the focus is now squarely on privacy, protection, and security. IoT devices, 100% remote working, and web-based collaboration are mainstream trends opening up attack surfaces.
A robust data loss prevention policy framework, driven by sophisticated technology, can build a resilient and data-conscious enterprise.