Our Insights

Thought Leadership and Industry Trends

Use Data Visualizations to Supercharge eDiscovery Management and Practice

Jun 3, 2021

Document reviews face intense time and cost constraints, even as data volumes grow, new data types proliferate, and communications become more complex. Data visualizations help review teams make fast progress on complicated matters with enormous, varied datasets. Managers can use visualizations to drive crucial decisions, smarter workflows, more efficient classifications, and better QC.

CDS experts Chris O’Connor, Cory Logan, Michael Milicevic, Esq., and Sue-Deelia Tang recently discussed the impact of data visualizations on litigation and investigations in our webinar Seeing is Believing: Why Visualization Matters in eDiscovery?

Read on for Part II of a three-part blog series, in which our team provides specific examples and use cases for data visualization dashboards in eDiscovery practice.

Data visualizations can help drive key legal and business decisions.

Chris O’Connor:
If you’re in high-level management, a relationship partner, a general counsel of a company, or if you simply want to check on the progress of discovery for litigation, this is a way to do it without having to get into the dirt too much. So from a management perspective, the benefits are pretty significant.

Mike Milicevic:
Visualizations can help attorneys and litigation support professionals of all technical proficiencies make the most of eDiscovery technology with minimal training.

Visualizations can also provide key business insights to corporate legal stakeholders in real time. A matter-specific billing dashboard built into a review database, for example, can help case teams track costs and see budget information at a glance, as opposed to waiting for invoicing or reviewing static monthly reports.

Multi-matter billing dashboards can provide budget insights across a company’s eDiscovery portfolio to help control costs, and an attorney work product scoreboard can help track coding decisions and review progress and make sure production deadlines are met.

A cluster wheel and communications network web can identify internal corporate communication patterns and help identify irregularities in investigations.

I can go on and on, but those are a few examples. The point is that visualizations can play an integral role in analyzing and driving certain business and legal decisions.

Chris O’Connor:
What are you going to get from that information? How will we share it and what’s the immediate value? What can I do with this information that I get at the outset?

Mike Milicevic:
From a 30,000-foot view, I think visualizations help you draw high-level conclusions about data sets that would be difficult, if not impossible in some cases, to capture fully in a linear review.

For example, you can readily identify outliers like missing date ranges, incomplete data sets, and unusual communication patterns between custodians and unrelated departments.

You can better categorize documents, such as potentially privileged, responsive, and non-responsive, to help expedite review.

You can also identify patterns that may help inform review strategy.

Make decisions on large volumes of data without getting deep in the weeds, doc by doc.

Mike Milicevic:
One example comes to mind from a recent case. Let’s say you see a spike in email communication at the end of every month, usually with a high count of PDF attachments. Those might be automated notices or invoices going out to clients that you can pull from a review set. That’s a real-life example. I think we pulled 15,000 of those out of a set of 100,000 documents. Visualization can give you the insight to make those broad decisions on large volumes of data in a much more informed way, without having to get so deep in the weeds doc by doc. You may still have to do that, but having an idea of where you’re going early on is a key benefit of visualization.
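To make the month-end pattern concrete outside of any review platform, here is a minimal Python sketch. It is an illustration only, not the workflow used on the case: the record fields (`sent`, `has_pdf`), the median-based spike threshold, and the two-day month-end window are all assumptions, and the sample records are synthetic.

```python
import calendar
from collections import Counter
from datetime import date
from statistics import median


def is_month_end(d, window=2):
    """True if d falls within the last `window` days of its month."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.day > last_day - window


def spike_dates(docs, factor=3.0):
    """Dates whose document count exceeds `factor` times the median daily count."""
    counts = Counter(doc["sent"] for doc in docs)
    baseline = median(counts.values())
    return sorted(d for d, n in counts.items() if n > factor * baseline)


def month_end_pdf_candidates(docs, factor=3.0):
    """Documents with PDF attachments sent on a month-end spike date."""
    spikes = {d for d in spike_dates(docs, factor) if is_month_end(d)}
    return [doc for doc in docs if doc["sent"] in spikes and doc["has_pdf"]]


def make_docs(sent, n, has_pdf):
    """Build n synthetic document records for one date (illustration only)."""
    return [{"sent": sent, "has_pdf": has_pdf} for _ in range(n)]


# Synthetic data: light daily traffic plus a burst of PDFs at each month end.
sample = (
    make_docs(date(2021, 1, 5), 2, False)
    + make_docs(date(2021, 1, 12), 2, False)
    + make_docs(date(2021, 1, 19), 2, False)
    + make_docs(date(2021, 1, 31), 10, True)
    + make_docs(date(2021, 2, 9), 2, False)
    + make_docs(date(2021, 2, 16), 2, False)
    + make_docs(date(2021, 2, 28), 10, True)
)

candidates = month_end_pdf_candidates(sample)
print(len(candidates), "of", len(sample), "documents flagged for bulk review")
```

The flagged set is only a starting point. As noted above, someone may still need to sample those documents before pulling them from the review set.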

Chris O’Connor:
Right, organizing the information in a manner that lets people proceed through the review in a sensible way.

Chris O’Connor:
How can you organize the big picture for an investigation or litigation by leveraging visualizations?

Mike Milicevic:
Obviously, privilege is a consideration within that. I think there are many different ways that you can go about organizing it.

For example, you can create widgets that follow the Electronic Discovery Reference Model (EDRM) to help guide the discovery process and workflow. So in the review phase of the EDRM, you might want to set up a visualization that searches for and displays categories of PII, for instance phone numbers and Social Security numbers, which you can then feed into redaction tools like Relativity Redact in RelativityOne.
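As a rough illustration of what sits behind a PII category widget, the Python sketch below counts documents per category using regular expressions. The patterns are simplified assumptions covering common US formats only, and the function names are hypothetical. This is not how Relativity or any redaction tool detects PII.

```python
import re
from collections import Counter

# Simplified, US-centric patterns (assumptions for illustration only).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
}


def pii_categories(text):
    """Return the sorted names of PII categories found in the text."""
    return sorted(name for name, pat in PII_PATTERNS.items() if pat.search(text))


def category_counts(texts):
    """Count how many documents contain each PII category."""
    counts = Counter()
    for text in texts:
        counts.update(pii_categories(text))
    return counts


# Synthetic example documents.
documents = [
    "Please call me at (212) 555-0147 tomorrow.",
    "Employee SSN on file: 123-45-6789.",
    "Lunch at noon?",
]
print(category_counts(documents))
```

Regular expressions like these over-match and under-match in practice, so the counts are best read as a guide to where to look, not as a final redaction list.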

And if we’re talking about internal investigations, you might want to deploy the cluster wheel paired with network communications analysis, maybe a timeline chart, and perhaps a sensitive term filter, so you can see who’s talking to whom and when they’re having potentially inappropriate conversations. For example, you can use it to help spot bad actors within an organization in employment-related investigations.

Mike Milicevic:
I think if we’re talking about the production phase of a matter, you might want a dashboard with widgets displaying review progress, or work product designation so you can track responsiveness rates and begin to put together initial production sets. I think the use cases are pretty vast for visualizations. And I also believe that visualizations are really underutilized considering the value that they can add.

Chris O’Connor:
Are there limits to what we can see, so long as there’s a data point for it?

Mike Milicevic:
I would say that there are. But I think there are far fewer limitations to visualizations in discovery today, with the power of RelativityOne and the Aero UI, than there were in the past. If you can dream it, or if you can display it in a static format in a report, we can probably find a way to build it into Relativity and make it dynamic and interactive to really bring the data to life in useful and intuitive ways.

Use visualizations to reveal patterns and gaps in your dataset, without being a technologist.

Chris O’Connor:
Excellent. So, I’m going to transition to Sue Tang. If I’m not a data science technologist, how can I leverage the visualization to accomplish my goals for a particular matter?

Sue Tang:
I’m a technologist, and I can tell you that in a server environment it takes a lot of time to build out some of these complex searches. Say, for example, you want this phrase within three words of that phrase, only within this timeframe, and only sent between these two people. It’s going to take time to build out that search, and depending on the tool, you might need to learn different syntaxes.
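To show why that kind of search takes effort to build by hand, here is a minimal Python sketch of the same three conditions. It is a simplification under stated assumptions: the terms are single words, not phrases, "within three" is read as a difference of at most three word positions, and the record fields (`text`, `sent`, `sender`, `recipients`) are hypothetical. Real search tools each have their own syntax and counting rules.

```python
import re
from datetime import date


def within_n_words(text, term_a, term_b, n=3):
    """True if the two single-word terms occur within n word positions."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)


def matches(doc, term_a, term_b, n, start, end, people):
    """Apply the proximity, date range, and participant conditions together."""
    participants = {doc["sender"], *doc["recipients"]}
    return (
        start <= doc["sent"] <= end
        and set(people) <= participants
        and within_n_words(doc["text"], term_a, term_b, n)
    )


# Synthetic record for illustration.
doc = {
    "text": "The raptor entity fees were approved",
    "sent": date(2000, 5, 1),
    "sender": "custodian_a",
    "recipients": ["custodian_b"],
}
print(matches(doc, "raptor", "fees", 3,
              date(1999, 1, 1), date(2001, 12, 31),
              ["custodian_a", "custodian_b"]))
```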

Visualization allows the teams to quickly and easily click on predefined and prebuilt graphs, charts, and tables, allowing you to identify those key patterns, easily categorize your datasets and quickly make mass decisions on large volumes of data.

So, imagine logging into a platform and within seconds you’re able to see the date range of all the documents within your dataset. You can immediately identify communication patterns between individuals, and quickly see terms within the documents that can be conceptually grouped together with other terms.

Using the Enron data as an example, say we wanted to filter our data to show only communications from 1999 to 2001. You can go to your timeline, click on it, and filter to 1999 through 2001. Then say we only wanted to see communications between Jeff Skilling and Ken Lay within that time frame. Now we’re looking at all the communications between Jeff Skilling and Ken Lay from 1999 to 2001. Within the cluster wheel, you’ll see multiple groups of words containing the word Raptor, in two different color schemes. In one, Raptor is grouped with terms like basketball or NBA, referring to the Toronto Raptors. In the other, Raptor appears with terms like entity or fees, because Raptor was a special business entity where they hid losses.

So right away, you can mass tag a bunch of non-responsive communications, any communications between those individuals within that timeframe that are referencing the Toronto Raptors. It takes three to four mouse clicks to get to that point, whereas without visualization, it would take a lot longer.
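The filter, inspect, and mass-tag steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not the platform's implementation: each record is assumed to already carry a `cluster` label from some conceptual clustering step, the field names and cluster labels are hypothetical, and the rows are invented placeholders, not actual Enron documents.

```python
from collections import Counter
from datetime import date


def between(docs, start, end, people):
    """Docs dated within [start, end] that involve all of the given people."""
    wanted = set(people)
    return [
        d for d in docs
        if start <= d["sent"] <= end
        and wanted <= {d["sender"], *d["recipients"]}
    ]


def cluster_sizes(docs):
    """Number of documents per cluster label."""
    return Counter(d["cluster"] for d in docs)


def mass_tag(docs, cluster, tag):
    """Tag every document in the given cluster; return how many were tagged."""
    hits = [d for d in docs if d["cluster"] == cluster]
    for d in hits:
        d["tag"] = tag
    return len(hits)


# Invented placeholder rows for illustration.
docs = [
    {"sent": date(2000, 3, 1), "sender": "custodian_a",
     "recipients": ["custodian_b"], "cluster": "raptor-basketball", "tag": None},
    {"sent": date(2000, 6, 9), "sender": "custodian_b",
     "recipients": ["custodian_a"], "cluster": "raptor-entity", "tag": None},
    {"sent": date(2002, 1, 15), "sender": "custodian_a",
     "recipients": ["custodian_b"], "cluster": "raptor-basketball", "tag": None},
    {"sent": date(2001, 2, 2), "sender": "custodian_a",
     "recipients": ["custodian_c"], "cluster": "raptor-basketball", "tag": None},
]

subset = between(docs, date(1999, 1, 1), date(2001, 12, 31),
                 ["custodian_a", "custodian_b"])
print(cluster_sizes(subset))
tagged = mass_tag(subset, "raptor-basketball", "non-responsive")
print(tagged, "document(s) tagged non-responsive")
```

Only documents inside the filtered subset are tagged, which mirrors the point above: the date and participant filters narrow the set first, and the cluster label then separates the basketball references from the business entity.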

Chris O’Connor:
Information that took the DOJ investigators quite a while to find. It’s not a knock on the DOJ; it was just the technology at the time.

Sue Tang:
At that time. Yes. And so this same workflow can be used to identify other patterns and gaps within your dataset, and you don’t have to be a technologist to do it.

Drill down into key communications in a few clicks, not a few hundred docs.

Chris O’Connor:
Let’s talk a little bit more about the content that you can gain out of clustering and communication analysis at the outset. I think they’ve done a nice job over at Relativity with the wheel and the bubble thing. But what benefits do we see up front? Outliers are going to be the things to look at first, right? If there are small bubbles floating way up by themselves and only one person communicating with them, it could be someone’s mom, it could be someone’s cousin, whatever, “let’s have dinner,” “happy birthday” kind of stuff, or it could be a Gmail address they’re floating their protected IP into. Where do you start?

Sue Tang:
Right away, you can see that the larger bubbles are where most communications are occurring. So, sometimes you want to just focus on that – clicking on one of those would then tell you who that person is mostly communicating with.

Then looking at the cluster, it’ll tell you what types of communications are occurring between those individuals. So, that’s the benefit, just being able to look at it and start drilling down right away.
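The idea behind the bubbles can be shown with a small Python sketch: count how many messages each person takes part in, then list one person's most frequent correspondents. The `(sender, recipients)` tuple layout and the names are assumptions for illustration, and this says nothing about how the Relativity widget itself is built.

```python
from collections import Counter


def message_volume(messages):
    """Messages each person sent or received (a stand-in for bubble size)."""
    volume = Counter()
    for sender, recipients in messages:
        volume[sender] += 1
        volume.update(recipients)
    return volume


def top_correspondents(messages, person, n=3):
    """The n people who exchange the most messages with `person`."""
    pairs = Counter()
    for sender, recipients in messages:
        if sender == person:
            pairs.update(recipients)
        elif person in recipients:
            pairs[sender] += 1
    return pairs.most_common(n)


# Synthetic messages: (sender, [recipients]).
messages = [
    ("ana", ["ben"]),
    ("ana", ["ben", "cy"]),
    ("ben", ["ana"]),
    ("cy", ["dee"]),
]

print(message_volume(messages).most_common())
print(top_correspondents(messages, "ana"))
```

In this toy data, "dee" appears in a single message with a single person, which is the kind of small, isolated bubble worth a closer look.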

You can see who’s communicating with who, what they’re talking about, and then if you add other widgets along with these widgets on a single dashboard, you can also zoom in on other things like timeline, whether or not those documents have been reviewed and who they were reviewed by and how they were tagged.

Were people tagging most of those communications between those individuals within this topic as privileged? Were they tagging them as responsive? And then it just really allows the team to have an overall picture of the entire dataset quickly and easily just by using these visualizations.

How we approach the data really depends on the case and what they’re looking for. Are they doing this for litigation? Are they doing this for an internal investigation?

So, if it’s an internal investigation, maybe you want to group this together with some sensitive terms and see whether there are any communications involving those terms. Maybe you want to see if any PII is being sent to or from individuals who shouldn’t have it.

And then of course, as soon as you see communications with people that have conflicting business interests, you definitely want to zoom in on that and flag that and see why they’re communicating and what they’re communicating about.

In addition, you can see gaps in communications where people have switched from email to mobile chat or text messaging. So then, you’re going to say all of a sudden within this timeframe, these two individuals went offline and started texting with each other. I want to see what those text messages are about. So, that’s something else that we can do here.
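A gap like that can be sketched in Python by looking for long stretches between consecutive emails for a pair of custodians and counting the text messages that fall inside each stretch. The 14-day threshold and the use of plain date lists are assumptions for illustration, and the dates are synthetic.

```python
from datetime import date


def email_gaps(email_dates, min_days=14):
    """(start, end) pairs where consecutive emails are at least min_days apart."""
    ordered = sorted(email_dates)
    return [
        (a, b) for a, b in zip(ordered, ordered[1:])
        if (b - a).days >= min_days
    ]


def texts_in_gaps(email_dates, text_dates, min_days=14):
    """Map each email gap to the number of texts sent strictly inside it."""
    return {
        (start, end): sum(start < t < end for t in text_dates)
        for start, end in email_gaps(email_dates, min_days)
    }


# Synthetic dates for one pair of custodians.
emails = [date(2021, 3, 1), date(2021, 3, 3), date(2021, 4, 10), date(2021, 4, 12)]
texts = [date(2021, 3, 15), date(2021, 3, 20), date(2021, 4, 11)]

for (start, end), count in texts_in_gaps(emails, texts).items():
    print(f"No email from {start} to {end}; {count} text message(s) in between")
```

A gap with text messages inside it is the signal described above: the two people did not stop communicating, they changed channels.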

Read Part 3 of the blog series: Visualization in eDiscovery: CDS Vision Synthesizes Proprietary Workflows and Custom Analytics.

About the Author

CDS Staff

Our leadership team and advisory consultants, project managers, and technical experts assist clients through all phases of the eDiscovery process.
