What is a Document? With Chat data, the next frontier of eDiscovery, there’s no simple answer 

Sep 22, 2022


Document review in eDiscovery can be a time-consuming, resource-intensive, costly exercise. As the notion of what constitutes a document expands to include new data formats, eDiscovery practitioners need to get up to speed with the next generation of business communications, and how to best prepare for and approach this short message data in document reviews.  

Documents: A Primer

Let’s start with the basics. A document is a permanent record of information – digital or non-digital – that a user can retrieve later. Before computers, documents were typically handwritten or generated on typewriters and commonly only included text.

Since the advent of computers, paper documents have been eclipsed by electronic documents, to the extent that the average person creates 1.7 MB of data every second and more than 2.5 quintillion bytes of electronically stored information (ESI) are generated every day. This data is also more complex than paper documents, containing tables, images, videos, icons, links, and other elements. Some types of digital documents include:

  • Emails
  • Chat and mobile data
  • Instant messages
  • Social media posts
  • Website content
  • Word documents and spreadsheets
  • Video, audio, and image files
  • Computer databases
  • Data from apps
  • Logs and audit history 

The Changing Scope of Document Review in eDiscovery 

Email and business documents have been discovery fodder for many years, and eDiscovery tools and legal teams are well versed in the collection, processing, and review of these formats. In recent years, however, chat and collaboration platforms like Microsoft Teams, Slack, WhatsApp, and many more have become popular tools for sharing ideas and information. With the COVID-19 pandemic and the global workplace transformation, the business appeal and adoption of these platforms grew exponentially. 

The next frontier for eDiscovery practitioners is short message data from chat and collaboration tools, social media messaging apps, ticketing systems, and mobile devices. Legal teams, outside counsel, and government agencies are handling considerably larger volumes of this potentially discoverable information every day, and these formats are evolving rapidly.

What Constitutes a Chat Document?

When dealing with short message data – what we’re calling chat data, for short – there is no easy answer to the question “What is a document?” Chat is not only a new data type; it can also be extremely large, encompassing millions of messages and years’ worth of data. Clearly, you can’t put a four-year chat room into one document within a review platform. Legal teams quickly find themselves asking questions like:

  • Are you required to disclose an entire chat that extended over a year, even if only part of it is relevant? 
  • Do you have to disclose every chat within a relevant section? 
  • Can you exclude certain messages that aren’t relevant? 

Rules and best practices for handling chat data in discovery are still in flux, so legal teams seeking direction will likely run into different opinions, some better informed than others. For eDiscovery practitioners about to tackle chat messages, it’s important to understand that different technology solutions and review platforms allow chat messages to be handled in various ways. Prioritize finding a solution that’s fit for purpose and a team that understands your specific needs for identifying, collecting, processing, and reviewing your data.

(Full disclosure: CDS Convert 3.0, our proprietary solution, provides end-to-end support for more than 23 data formats. It was built for the Relativity Short Message Format (RSMF) and is now platform agnostic.)

The Benefits of Unitizing Chat Data 

When chat threads – a real-time mix of personal and business chat – are produced in litigation, they may need to be redacted for privilege, confidentiality, or personally identifiable information (PII). Since longer strings of chats can add significantly to review effort and expense, weighing the time and cost of redaction is critical.

How data is unitized can make a significant difference in a review in terms of cost, time, and efficiency. For example:

  • The longer the document, the more time will be spent reviewing it, even when only a small section is relevant.
  • When a conversation is cut off halfway through, a reviewer may need to review the parts of the conversation that come before or after the current document.
  • The more a conversation is split up, the less redaction is necessary when the conversation contains a significant amount of non-disclosable material.
  • Pay close attention to keywords. If a thread is split into smaller sections, keyword logic that relies on within (proximity) or AND operators may no longer return the same hits.
  • If using TAR, the size of documents can affect the quality of your index. Too large and concepts may be lost; too small and you may not get an adequate amount of data for analysis.
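The keyword point in the list above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample messages are invented, and `unitize_by_day` and `and_hit` are hypothetical helpers, not functions from any review platform. It shows an AND search that hits on the whole thread but on neither document once the thread is split by calendar day.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical sample thread: (timestamp, author, text). Invented data.
MESSAGES = [
    (datetime(2022, 3, 1, 16, 55), "alice", "Did the invoice go out?"),
    (datetime(2022, 3, 1, 23, 58), "bob", "Not yet, checking now."),
    (datetime(2022, 3, 2, 0, 3), "bob", "Sent. Payment is due Friday."),
]


def unitize_by_day(messages):
    """Group messages into one document per calendar day (illustrative only)."""
    ordered = sorted(messages, key=lambda m: m[0])
    return [list(group) for _, group in groupby(ordered, key=lambda m: m[0].date())]


def and_hit(document, terms):
    """Return True when every term appears somewhere in the document's text."""
    text = " ".join(message[2] for message in document).lower()
    return all(term.lower() in text for term in terms)


TERMS = ["invoice", "payment"]

# Searching the thread as a single document: both terms are present.
whole_thread_hits = and_hit(MESSAGES, TERMS)

# Searching after splitting by day: each daily document holds only one term.
daily_hits = [and_hit(doc, TERMS) for doc in unitize_by_day(MESSAGES)]

print(whole_thread_hits)  # True
print(daily_hits)         # [False, False]
```

In this invented example the conversation crosses midnight, so a day-based split separates the two search terms. A real matter would need the search terms re-tested against whichever unitization is chosen.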

Partnering with a provider that has significant expertise in short message data can overcome most of the issues listed above. In addition, using the right technology, such as the Relativity Short Message Format alongside RelativityOne, can overcome some of the production issues. Large redaction exercises can be circumvented by splitting documents on the fly, allowing review teams to create documents within the review platform which only contain the information required for production. 

Production Requirements for Chat and Mobile Data

Just as there’s no standardized way to split chat threads, there are no set requirements for producing chat and mobile data. While some regulators work with production requirements akin to email, others utilize seemingly outdated production specs. Different regulators might request different types of unitization: one message per document with metadata for each message, one document per day or per month, or the entire channel as a single document.
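To make the difference between these unitization options concrete, here is a minimal sketch that groups the same messages four ways and counts the resulting documents. The mode names, the `unitize` helper, and the sample data are all assumptions made for illustration; they do not reflect any regulator's specification or any review platform's API.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical channel export: (timestamp, text). Invented data.
SAMPLE = [
    (datetime(2022, 1, 30, 9, 0), "Morning, any update?"),
    (datetime(2022, 1, 30, 17, 30), "Draft is with legal."),
    (datetime(2022, 1, 31, 10, 15), "Legal approved it."),
    (datetime(2022, 2, 1, 8, 45), "Filed this morning."),
]


def unit_key(timestamp, index, mode):
    """Return the grouping key that decides which document a message joins."""
    if mode == "message":
        return index  # every message becomes its own document
    if mode == "day":
        return timestamp.strftime("%Y-%m-%d")
    if mode == "month":
        return timestamp.strftime("%Y-%m")
    if mode == "channel":
        return "channel"  # the whole channel is a single document
    raise ValueError(f"unknown unitization mode: {mode}")


def unitize(messages, mode):
    """Group messages, in chronological order, into documents for the given mode."""
    documents = defaultdict(list)
    for index, (timestamp, text) in enumerate(sorted(messages)):
        documents[unit_key(timestamp, index, mode)].append(text)
    return dict(documents)


counts = {mode: len(unitize(SAMPLE, mode)) for mode in ("message", "day", "month", "channel")}
print(counts)  # {'message': 4, 'day': 3, 'month': 2, 'channel': 1}
```

The same four messages become four, three, two, or one document depending on the mode, which is why the production format is worth agreeing before review begins.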

To best prepare: consider a variety of options and approaches; consult with specialists if necessary; propose the scope, review method and production format; and have an upfront conversation on every single matter. 


About the Author

Mark Anderson

In his role as Director of UK Operations for CDS, Mark Anderson provides project management and expert consulting through all stages of eDisclosure and eDiscovery. Mark works alongside corporate and law firm clients to identify data for collection and advises on best practices for collection of data, data processing, and document review workflows. He has supervised multi-national teams and has experience working on some of the largest, most challenging matters, including cases involving cross-border issues and the application of technology assisted review (TAR). Prior to joining CDS, Mark conducted forensic collections, assisted with data investigations, and served as a project team lead for multiple international legal technology service providers. Mark holds multiple Relativity certifications including Relativity Master and is an Encase Certified Examiner.
