Applying Generative AI to Modern eDiscovery Workflows

Aug 16, 2024

Technology-assisted review (TAR) software has remained much the same over the last decade. Until now. A new wave of generative AI solutions, including Relativity aiR and DISCO’s Cecilia, is coming to market. The hype surrounding these products has been immense, prompting teams to ask: Will generative AI replace TAR? Will it replace human review entirely?

Traditional TAR Workflows “Learn” by Example

Established TAR software uses machine learning to distinguish likely relevant from likely irrelevant documents. The software “learns” the distinction from exemplar documents classified as relevant and irrelevant by attorney reviewers. From these examples, all the remaining unreviewed documents receive rankings (typically on a scale of 0 – 100). The higher the ranking, the more likely a document will be relevant. Depending on client needs, those rankings can be used to create a variety of workflows. The two most common workflows are known as TAR 1.0 and TAR 2.0.
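To make the "learning by example" idea concrete, here is a minimal sketch of how coded exemplars can be turned into 0 to 100 rankings. It is a toy word-frequency model written for illustration, not the algorithm used by any particular TAR product; the function names and sample documents are assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """examples: list of (text, is_relevant) pairs coded by attorney reviewers.
    Returns a per-word log-odds weight (add-one smoothing)."""
    rel, irr = Counter(), Counter()
    for text, is_relevant in examples:
        (rel if is_relevant else irr).update(tokenize(text))
    vocab = set(rel) | set(irr)
    rel_total = sum(rel.values()) + len(vocab)
    irr_total = sum(irr.values()) + len(vocab)
    return {
        tok: math.log((rel[tok] + 1) / rel_total) - math.log((irr[tok] + 1) / irr_total)
        for tok in vocab
    }

def score(weights, text):
    """Sum the word weights and squash the result onto a 0-100 scale."""
    z = sum(weights.get(tok, 0.0) for tok in tokenize(text))
    return round(100 / (1 + math.exp(-z)))

# Hypothetical exemplars coded by reviewers.
examples = [
    ("merger pricing strategy for the acquisition", True),
    ("pricing discussion with competitor before merger", True),
    ("lunch menu for friday", False),
    ("office holiday party schedule", False),
]
weights = train(examples)

# Unreviewed documents receive a ranking; higher means more likely relevant.
for doc in ["draft merger pricing memo", "friday party lunch"]:
    print(doc, "->", score(weights, doc))
```

Production tools use far richer features and models, but the shape is the same: coded examples in, a ranking for every unreviewed document out.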

TAR 1.0

The TAR 1.0 workflow uses the TAR rankings as a classifier in large matters like Second Requests, where it would be unfeasible to review all relevant documents before production. Documents that score above a determined threshold, plus any attachments, qualify for review or production (subject to a screen for sensitive information and privilege). Documents below that threshold are discarded. The threshold rank is typically determined by review of a random sample that can be used to estimate which rank will supply a sufficient percentage of the relevant documents, i.e., sufficient “recall”.
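The threshold selection described above can be sketched as follows, assuming a random sample that reviewers have already coded. The function name and the sample values are hypothetical, and a real protocol would also account for the statistical margin of error of the sample.

```python
def cutoff_for_recall(sample, target_recall):
    """sample: list of (score, is_relevant) pairs from a random, human-coded sample.
    Returns the highest score cutoff whose estimated recall meets the target,
    or None if the sample contains no relevant documents."""
    total_relevant = sum(1 for _, is_rel in sample if is_rel)
    if total_relevant == 0:
        return None
    # Walk the cutoff down from the highest score until enough relevant docs are captured.
    for cutoff in sorted({s for s, _ in sample}, reverse=True):
        captured = sum(1 for s, is_rel in sample if is_rel and s >= cutoff)
        if captured / total_relevant >= target_recall:
            return cutoff
    return None  # only reached if target_recall is greater than 1

# Hypothetical coded sample: (TAR ranking, reviewer's relevance call).
sample = [(95, True), (88, True), (80, False), (72, True), (60, True),
          (41, False), (35, True), (20, False), (10, False), (5, False)]

print(cutoff_for_recall(sample, 0.80))
```

In this sample, four of the five relevant documents score 60 or higher, so a cutoff of 60 is the highest threshold that reaches 80% estimated recall.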

TAR 2.0

TAR 2.0 uses TAR rankings to sort documents for review. After the software has generated its initial rankings using a small sample set of reviewed documents, the highest-ranking unreviewed documents are presented to the review team first. As the review progresses and the team identifies additional relevant and irrelevant documents, the TAR software updates its rankings periodically to incorporate the new exemplars, reshuffling the document sort order. By front-loading the review with the most likely relevant documents rather than sorting the documents randomly, teams are able to move into second level review or complete their production requirements much more efficiently. 
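The review-and-rerank loop can be sketched as below. This is an illustrative simulation only: the scoring model is a toy word-overlap measure, and the `reviewer` callable stands in for the human coding decision. All names and documents are assumptions.

```python
import re

def tokens(text):
    """Set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def prioritized_review(docs, reviewer, seed_ids, batch_size=2):
    """docs: {doc_id: text}; reviewer: callable doc_id -> bool (the human call);
    seed_ids: the small initial sample. Returns doc ids in the order reviewed."""
    coded, order = {}, []

    def review(batch):
        for doc_id in batch:
            coded[doc_id] = reviewer(doc_id)
            order.append(doc_id)

    def score(doc_id):
        # Toy model: words shared with relevant exemplars count for,
        # words shared with irrelevant exemplars count against.
        total = 0
        for other, is_rel in coded.items():
            overlap = len(tokens(docs[doc_id]) & tokens(docs[other]))
            total += overlap if is_rel else -overlap
        return total

    review(seed_ids)
    while len(coded) < len(docs):
        remaining = [d for d in docs if d not in coded]
        remaining.sort(key=score, reverse=True)  # re-rank using every exemplar so far
        review(remaining[:batch_size])           # highest-ranking documents go first
    return order

docs = {
    "d1": "merger pricing plan",
    "d2": "lunch menu friday",
    "d3": "pricing call about merger terms",
    "d4": "holiday party menu",
    "d5": "merger terms final draft",
    "d6": "friday parking notice",
}
truth = {"d1": True, "d2": False, "d3": True, "d4": False, "d5": True, "d6": False}

order = prioritized_review(docs, truth.get, seed_ids=["d1", "d2"])
print(order)
```

In this small example the relevant documents all surface in the first two batches, leaving only irrelevant documents at the end of the queue.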

Since the most likely relevant documents get reviewed first, the high-ranking documents eventually run out and the review team begins to see predominantly low-ranking, likely irrelevant documents. When this occurs, clients commonly consider stopping review before the entire set has been completed. This can be done defensibly by reviewing a random “Elusion Sample” from the unreviewed pile to estimate how many relevant documents would elude identification if the review were cut off at that point.
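The arithmetic behind an elusion sample is simple enough to sketch. The figures below are hypothetical, and the function returns point estimates only; a real validation would typically report a confidence interval as well.

```python
def elusion_estimate(sample_size, relevant_in_sample, unreviewed_count, relevant_found):
    """Estimate what a review cutoff would leave behind.
    Returns (elusion rate, estimated relevant docs missed, estimated recall)."""
    elusion_rate = relevant_in_sample / sample_size
    est_missed = elusion_rate * unreviewed_count
    est_recall = relevant_found / (relevant_found + est_missed)
    return elusion_rate, est_missed, est_recall

# Hypothetical matter: 40,000 documents unreviewed, 9,600 relevant documents found so far,
# and 15 relevant documents turned up in a random sample of 1,500 unreviewed documents.
rate, missed, recall = elusion_estimate(1500, 15, 40_000, 9_600)
print(f"elusion rate {rate:.1%}, about {missed:.0f} relevant documents missed, recall about {recall:.0%}")
```

Here a 1% elusion rate implies roughly 400 relevant documents left in the unreviewed pile, for an estimated recall of about 96%.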

Generative AI for Document Review

Here is what generative AI does, in brief: A human user enters a prompt into OpenAI’s ChatGPT, Microsoft Copilot, Google’s Gemini, or some other chat interface. The interface submits the prompt to a Large Language Model (LLM), which generates a response. The responses are often of such high quality that the model seems to understand the prompt as a human would. However, the model processes information far more quickly than a human could, and its responses sometimes include errors, termed “hallucinations,” that a human would not make.

So how can this technology help with document review?

Many eDiscovery platforms have developed integrations with LLMs using a technique known as Retrieval-Augmented Generation (RAG). RAG combines a generative model prompt with access to a knowledge base of documents. This combination allows the model to ground its responses in that knowledge base rather than relying only on its original training data.

In this example using Google Gemini Advanced, AI extracts the information from the source material.

[Screenshot: Google Gemini Advanced. The prompt asks to see the leadership team list from the cdslegal.com website.]
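The retrieve-then-prompt pattern behind RAG can be sketched in a few lines. This is a simplified illustration, not how any eDiscovery platform implements it: the retriever here ranks by shared words, whereas real systems generally use search indexes or embeddings, and the call to the LLM itself is left out. The document ids and text are made up.

```python
import re

def tokens(text):
    """Set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, knowledge_base, k=2):
    """Return the k document ids that share the most words with the question."""
    q = tokens(question)
    ranked = sorted(knowledge_base,
                    key=lambda doc_id: len(q & tokens(knowledge_base[doc_id])),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, knowledge_base, k=2):
    """Combine the retrieved documents and the question into one grounded prompt."""
    doc_ids = retrieve(question, knowledge_base, k)
    context = "\n".join(f"[{doc_id}] {knowledge_base[doc_id]}" for doc_id in doc_ids)
    return ("Answer using only the documents below and cite document ids.\n\n"
            f"{context}\n\nQuestion: {question}")

knowledge_base = {
    "DOC-1": "The supply contract was signed in March.",
    "DOC-2": "Cafeteria hours change next week.",
    "DOC-3": "The contract renewal price is under negotiation.",
}

question = "When was the supply contract signed?"
prompt = build_prompt(question, knowledge_base)
print(prompt)  # this text is what would be submitted to the LLM
```

Because only the retrieved documents are placed in the prompt, the model's answer can be traced back to specific source material.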

It is also possible to tell the LLM to classify a document as relevant or irrelevant to a set of instructions.

[Screenshot: Google Gemini Advanced. The prompt asks whether CDS supplies eDiscovery services.]

Relativity aiR for Review features a much more robust and useful set of outputs than the one-word reply above, making it possible to:

  • Input a set of instructions detailing the type of content that is relevant to the review. The instructions can be in plain English, in a format that is nearly identical to a protocol written for a review team.
  • Use aiR (via GPT-4o) to classify all the documents in the database on a scale of 0-4, with higher ranks likely being more relevant.

Using aiR Classifications to Drive Other Workflows

As with traditional TAR software, it is possible to use the classifications to drive different workflows. 

TAR 1.0 Replacement

The most straightforward use case is to use aiR as a replacement for relevance review, in the same way TAR 1.0 has been applied for most of the last decade. The Sedona Conference provides a deeper dive into this framework. In testing, aiR for Review has regularly achieved 85-95% recall, significantly better than the 80-85% typically achieved with traditional TAR software. Although aiR for Review is much faster and less costly than manual review, it is more expensive than traditional TAR 1.0 software. Other potential downsides: it might only be helpful for outgoing production reviews, and there is not yet any Da Silva Moore-type court approval of this workflow.

[Figure: TAR 1.0 vs Gen AI workflow]

TAR 2.0 Replacement

Although aiR for Review generates only five rankings (0 through 4), the rankings can be useful for prioritizing a review in the same way as traditional TAR software. It’s also possible to turn on traditional TAR 2.0 software, such as Review Center, within each rank to more efficiently prioritize each subset. However, because this workflow incurs both attorney review costs and the higher software fees, it will likely be most appealing for smaller data sets with more expensive attorney reviewers.
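Combining a coarse generative AI rank with a traditional TAR score inside each rank amounts to a two-level sort. A minimal sketch, assuming each document carries both values; the field names `air_rank` and `tar_score` are hypothetical and do not come from any platform's export format.

```python
def review_order(docs):
    """docs: list of dicts with 'id', 'air_rank' (0-4) and 'tar_score' (0-100).
    Sort by rank first, then by TAR score within each rank, highest first."""
    ordered = sorted(docs, key=lambda d: (-d["air_rank"], -d["tar_score"]))
    return [d["id"] for d in ordered]

docs = [
    {"id": "A", "air_rank": 3, "tar_score": 40},
    {"id": "B", "air_rank": 4, "tar_score": 10},
    {"id": "C", "air_rank": 3, "tar_score": 90},
    {"id": "D", "air_rank": 0, "tar_score": 99},
]

print(review_order(docs))
```

Document B is reviewed first because it has the highest rank; C comes before A because, within the same rank, its TAR score is higher; D comes last despite its high TAR score because its rank is lowest.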

[Figure: TAR 2.0 vs Gen AI workflow]

Issue Coding

One of the biggest benefits of aiR for Review over traditional TAR software is that it is not limited to a binary relevant/not relevant designation. Instead, it can classify for 10 different issues in a single analysis, which could be used after review to help with fact discovery and deposition preparation.

Gen AI Offers an Exciting Assortment of Review Options

Generative AI has expanded the already bountiful array of review options. While this might make the best approach a little more challenging to identify, the benefits of this software will likely be a major win for the industry going forward. As with most new technology, costs are likely to come down and product quality is likely to improve over time. Keep checking in with CDS to learn more about the latest advancements in Gen AI.

CDS provides a full range of advisory services related to document review and production. To discuss how we can streamline your organization’s workflow, contact us for a consultation today.

About the Author

Dan Diette, Esq.

Daniel Diette is an eDiscovery Data Scientist specializing in Technology Assisted Review (TAR) and eDiscovery Analytics at CDS. He has over 13 years of experience applying analytics to all phases of eDiscovery. As head of CDS' Advisory Services Analytics Team, Daniel manages the TAR process for CDS’ large, complex projects, from Second Requests to multi-billion dollar litigations and investigations. He consults clients on efficient document search and review in Relativity, Brainspace, DISCO, and Reveal, and the defensible use of predictive coding software and workflows.
