Adapting eDiscovery Tools for Smarter Post-Data Breach Reviews

Jul 31, 2018

eDiscovery technology isn’t limited to helping with document reviews in a litigation context. As discussed in a prior post, it can also be used to comply with notification requirements after a data breach. Once a company knows it has experienced a data breach, various state laws and, in some cases, international law require it to investigate what specific data was compromised and to notify affected individuals. A smart review strategy can use eDiscovery analytics software to reduce volume and organize the data for efficient review, so companies can meet their obligations quickly, accurately and cost-effectively. Specific tools that can be employed in such post-data breach reviews include the following:

  • De-duplication. This refers to the process of removing duplicate files from a collection of electronically stored information (ESI) based on their hash values. A hash value is the output of an algorithm that generates an effectively unique value from each document’s content. It acts like a digital fingerprint and is used to authenticate documents and to identify duplicates. If two documents, or two families of documents, in a collection have the same hash value, one of them is removed.
    Traditional processing-level de-duplication should always be used in a data breach review, but other analytics options are also worth considering. For example, near-duplicate documents, meaning documents that share a high percentage of their text, are likely to contain mostly the same Personally Identifiable Information (PII) and can be grouped together for efficiency. Each group of near duplicates then needs to be reviewed only once to locate PII and satisfy notification obligations.
    In addition, email threading can be used to ensure that the same content in an email chain is not reviewed multiple times. Email threading groups email conversations together and sorts them logically. That way, only the emails in a thread with unique content, such as the last email in a conversation or the last message in which an attachment appears, are reviewed.
  • Prioritized Review Using Pattern Recognition, Analytics and TAR. The fastest way to move through any document review is to have the review team look at the documents most likely to be relevant first. Discovery software such as Relativity and Brainspace has tools that can automate this prioritization before a single document has been reviewed. For example, pattern (or regular expression) search in Relativity can be used to find documents with recognizable patterns such as social security numbers, employee identifiers and phone numbers.
    For a deeper dive, Brainspace uses Natural Language Processing to conduct entity extraction. In addition to simple patterns like social security numbers, entity extraction can identify documents with names, locations, organizations, and more. This can even be used to identify documents containing multiple identifiers that together meet several states’ thresholds for PII.
    The software-identified documents above can be prioritized initially, and this can be supplemented with Technology Assisted Review (TAR) to prioritize documents for the review team. TAR broadly refers to many methods of technology assistance, including analytics and Predictive Coding, used to organize or expedite reviews. Machine Learning tools such as Continuous Multi-Modal Learning from Brainspace and Active Learning from Relativity can learn, from decisions made by the review team, which of the remaining documents are most likely to contain PII. The highest-scoring documents can be continuously prioritized for review until the team runs out of relevant material. At that point, a review of a random sample of the unreviewed documents can be used to determine whether further review is merited.
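
To make the first steps above more concrete, here is a minimal Python sketch of hash-based de-duplication followed by pattern-based prioritization. It is an illustration only, not how Relativity or Brainspace implement these features. The `Doc` class, the function names and the two regular expressions are hypothetical, and the hash is computed over extracted text for simplicity, whereas processing tools typically hash the native file or selected metadata fields.

```python
import hashlib
import re
from dataclasses import dataclass

# Illustrative patterns only (assumptions, not any vendor's built-in expressions).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


@dataclass
class Doc:
    doc_id: str
    text: str


def content_hash(doc):
    """Return the SHA-256 digest of the text; identical text gives identical hashes."""
    return hashlib.sha256(doc.text.encode("utf-8")).hexdigest()


def deduplicate(docs):
    """Keep only the first document seen for each hash value."""
    seen = set()
    unique = []
    for doc in docs:
        digest = content_hash(doc)
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def pii_hits(doc):
    """Count matches per pattern type, leaving out types with no match."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        count = len(pattern.findall(doc.text))
        if count:
            hits[name] = count
    return hits


def prioritize(docs):
    """Sort so documents with more kinds of identifiers, then more hits, come first."""
    def score(doc):
        hits = pii_hits(doc)
        return (len(hits), sum(hits.values()))
    return sorted(docs, key=score, reverse=True)


docs = [
    Doc("A", "Meeting notes, nothing sensitive."),
    Doc("B", "SSN 123-45-6789, call 555-867-5309."),
    Doc("C", "SSN 123-45-6789, call 555-867-5309."),  # exact duplicate of B
    Doc("D", "Employee SSN: 987-65-4321"),
]
review_queue = prioritize(deduplicate(docs))
print([d.doc_id for d in review_queue])  # ['B', 'D', 'A']
```

Ranking first by the number of distinct identifier types reflects the point above that documents combining several identifiers are the ones most likely to meet a PII threshold. Real patterns would need tuning to limit false positives, such as invoice or part numbers that happen to look like social security numbers.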

Not all eDiscovery service providers are equipped to help companies with a post-breach audit to identify compromised data. That’s why it’s important to discuss these issues with a provider. CDS provides a full range of advisory services. To discuss how we can assist you with a data breach, contact us for a consultation.

About the Author

Kate Hutchinson

As the Director of Marketing, Kate Hutchinson takes a collaborative and creative approach to marketing, working with stakeholders across departments to convey the benefits of CDS’s solutions. She holds a BA from Rutgers University and is a member of the Phi Beta Kappa honor society.
