Managing Slack Data Volumes in eDiscovery

Sep 15, 2024

Released in 2013, Slack rapidly became one of the most popular business management applications worldwide. Currently, 1 million organizations – including 77% of the Fortune 100 – use Slack, with many users in the banking, crypto, and technology sectors. For nearly a decade, CDS has managed Slack data in client matters. As we worked on more and more Slack cases, trying different applications and formats, our data specialists recognized that we needed a robust tool to deal with the nuances of Slack exports.

We designed our own tool, CDS Convert, to manage the complexities of short message data types. Fully integrated with Relativity Short Message Format (RSMF), the tool was built to ensure accuracy and defensibility and to meet production obligations. When we launched CDS Convert in late 2019, Slack was the first supported application.

The Importance of Metadata Identification and Production 

A key focus in developing CDS Convert was expanding the RSMF capacity for identification and production of metadata. RSMF 1.0 supported a handful of metadata fields; we decided to build a more robust set of search options to support more complex eDiscovery. 

CDS Convert supports more than 40 metadata fields, as compared to the 5 default fields currently available in Relativity. Initially designed to meet production obligations from regulators such as the SEC and DOJ, the fields grew to support Early Case Assessment (ECA) via CDS Convert’s pre-conversion reporting.  

Screenshot from CDS Vision: multiple charts and graphs showing short message data in a collection by room name, data type, channel, date and more.

Using CDS Convert’s user-friendly reporting, legal teams can examine data by chat room names, topics, descriptions, message counts, date ranges and participants by channel. For more matter insights within Relativity, CDS Vision offers dozens of ways to visualize short message and mobile data by leveraging the same metadata fields for deeper analysis.

Managing Slack Attachments to Reduce Data Volumes 

Slack exports do not include attachments. Instead, they provide links to the files in Slack storage. Most collection tools will automatically download attachments during export. The process is complicated by the volume of data and how it’s created:

  • Data Expansion: Once attachments are downloaded and embedded, a 100 GB export may increase to more than 1 TB, much of which may be irrelevant to the matter. Read our recent blog post, which explored the evolving case law around hyperlinked files.
  • Message Volume: Slack cases routinely exceed 10 million messages, due to the lack of filtering available on export. Many cases contain anywhere from 50 million to 200 million messages. Sometimes, due to the use of bots, a single chat room might contain a substantial percentage of automated messages. In one of our cases, 48% of the data (30 million messages) was in a single room composed almost entirely of bot-generated data.
  • Duplicate Messages: Key custodians will commonly interact within the same channels. Slack Enterprise-level exports, which allow custodial exports, can help reduce overall data volume, but can result in large numbers of repeat messages and inflation of data volumes due to repeat attachments. Whether an export has 100 or 2 million duplicate messages, a conversion solution needs to be able to de-duplicate, since de-duplication will not take place during processing. 
  • Filtering: Slack only provides filtering by date range, unless the organization is on an Enterprise licensing plan.  Most conversion solutions provide an all-or-nothing approach: all the data is downloaded, converted, then hosted on review platforms, incurring processing and hosting costs before any filtering begins.  
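To make the duplicate-message point concrete, here is a minimal Python sketch of de-duplicating messages that appear in more than one custodial export. It is an illustration only, not how CDS Convert works. It assumes each message has already been loaded as a dictionary carrying a `channel` value and Slack's `ts` timestamp, and it treats that pair as the message's identity; the function name `dedupe_messages` and the sample data are hypothetical.

```python
from typing import Iterable, Iterator


def dedupe_messages(messages: Iterable[dict]) -> Iterator[dict]:
    """Yield each message once, keyed on (channel, ts).

    Assumption: the (channel, ts) pair identifies a message. Real exports
    may need a richer key, for example to handle edited messages.
    """
    seen = set()
    for msg in messages:
        key = (msg.get("channel"), msg.get("ts"))
        if key in seen:
            continue  # already emitted from another custodian's export
        seen.add(key)
        yield msg


# Two hypothetical custodial exports that overlap in the same channel.
custodian_a = [
    {"channel": "deal-team", "ts": "1700000000.000100", "text": "draft attached"},
    {"channel": "deal-team", "ts": "1700000100.000200", "text": "thanks"},
]
custodian_b = [
    {"channel": "deal-team", "ts": "1700000000.000100", "text": "draft attached"},
]

unique = list(dedupe_messages(custodian_a + custodian_b))
print(len(unique))
```

Keying on a stable identifier rather than on message text avoids collapsing two different messages that happen to say the same thing.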

CDS Convert’s Early Case Assessment (ECA) tool alleviates these issues. Detailed reports allow users to analyze and reduce data before incurring review platform costs. By filtering on room names, active contributors to a conversation, passive participants, channel topics or descriptions, room dates, and message volume, users can cull a significant volume of data from their review and achieve substantial downstream savings.
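The kind of pre-conversion reporting described above can be sketched in a few lines of Python. This is a hypothetical illustration, not CDS Convert's implementation. It assumes each message is a dictionary with `channel`, `ts` and, for human messages, `user`, and that bot messages are marked with a `subtype` of `bot_message`; the helper names `channel_report` and `select_channels` and the 50% bot threshold are invented for the example.

```python
from collections import defaultdict


def channel_report(messages):
    """Summarize each channel: message count, bot count, participants, date range."""
    report = defaultdict(lambda: {
        "messages": 0, "bot_messages": 0,
        "participants": set(), "first_ts": None, "last_ts": None,
    })
    for msg in messages:
        row = report[msg["channel"]]
        row["messages"] += 1
        if msg.get("subtype") == "bot_message":  # assumed bot marker
            row["bot_messages"] += 1
        elif "user" in msg:
            row["participants"].add(msg["user"])
        ts = float(msg["ts"])
        row["first_ts"] = ts if row["first_ts"] is None else min(row["first_ts"], ts)
        row["last_ts"] = ts if row["last_ts"] is None else max(row["last_ts"], ts)
    return dict(report)


def select_channels(report, custodians, max_bot_share=0.5):
    """Keep channels where a custodian posted and bots do not dominate."""
    keep = []
    for channel, row in report.items():
        bot_share = row["bot_messages"] / row["messages"]
        if bot_share <= max_bot_share and row["participants"] & set(custodians):
            keep.append(channel)
    return sorted(keep)


sample = [
    {"channel": "deal-team", "ts": "1700000000.000100", "user": "U1"},
    {"channel": "deal-team", "ts": "1700000100.000200", "user": "U2"},
    {"channel": "alerts", "ts": "1700000200.000300", "subtype": "bot_message"},
    {"channel": "alerts", "ts": "1700000300.000400", "subtype": "bot_message"},
    {"channel": "alerts", "ts": "1700000400.000500", "user": "U1"},
]
report = channel_report(sample)
kept = select_channels(report, custodians={"U1"})
print(kept)
```

Because the report is built from message metadata alone, channels can be excluded before any attachments are downloaded or any data is loaded into a review platform.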

How ECA Filtering in CDS Convert Reduces Costs  

A client came to CDS with a 5 GB zip file of Slack JSONs requiring review for production to a regulator. After extraction, the data expanded to 70 GB, which we then parsed using CDS Convert to reveal over 1.3 TB of data comprising 80 million messages with attachments.

Case Example – Impact of CDS Convert on Data Volumes

By comparison, with other short message data solutions, all 1.3 TB would be processed within a review platform and incur costs before a single filter was applied.

Using CDS Convert, the legal team excluded irrelevant rooms, focused on the key individuals, and filtered data down to just 50 GB required for review – a reduction of approximately 96%, likely saving tens of thousands of dollars. 
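The reduction figure can be checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB) and uses the 1.3 TB and 50 GB figures from this case; the function name `reduction_pct` is just for illustration.

```python
def reduction_pct(before_gb: float, after_gb: float) -> float:
    """Percentage by which data volume shrank from before_gb to after_gb."""
    return 100 * (1 - after_gb / before_gb)


before_gb = 1.3 * 1000  # 1.3 TB expressed in GB (decimal units assumed)
after_gb = 50
pct = reduction_pct(before_gb, after_gb)
print(round(pct, 1))  # 96.2
```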

Next: Defensible Approaches to Converting Short Message Data  

Legal teams are right to be concerned about Slack data volumes and their impact on a matter’s scope and cost. However, if data conversion is not done in a defensible, proven manner, key information could be missed, creating major complications and jeopardizing discovery.   

Read part two, Reviewing Slack in Context, where we discuss reviewing Slack data and how to capture the platform’s many data points in a reviewable format.

About the Author

Mark Anderson

As Managing Director, EMEA, Mark Anderson provides project management and expert consulting through all stages of eDisclosure and eDiscovery. Mark also leads the development of CDS Convert, a proprietary tool that analyzes short message data from more than 35 data sources and makes it easy to review in popular eDiscovery platforms. He has supervised multinational teams on large, complex cross-border matters. He holds multiple Relativity certifications, including Relativity Master, and is an EnCase Certified Examiner.