In a recent webinar, “Crate Digging: Finding Relevant Materials in a Universe of Accessible Cloud Backup,” CDS knowledge leaders hosted legal and technology experts to discuss how modern data management and storage strategies, designed for maximum accessibility, are squaring with the increasing pressure to achieve proportionality in the face of massive volumes of ESI.
Read on for a lightly edited transcript of their conversation, Part I of a series. To watch the entire recorded webinar, click here.
Chris O’Connor, Director of eDiscovery Strategy, CDS
William Wallace Belt, Jr., Managing Director, CDS
Lindsey Lanier, Product Management Director, VerQu, A Relativity Company
Pete Lwin, Senior Project Engineer, CDS
John Rabiej, partnering with GW Humphreys Complex Litigation Center
Adam Rogers, Senior Forensic Analyst, CDS
Civil Rule 26: The Baseline for Accessibility
Bill and John, when should practitioners consider whether online archives are subject to discovery?
Let me try to answer that first. And thanks for inviting me to talk about this important, emerging issue.
A little background: I was head of the office that staffed the judicial conference rules committees for 20 years, and I started that during the time that proportionality and accessibility and all these issues were percolating, and rules were amended.
To answer your question, we need to look at three provisions of Civil Rule 26, which were enacted at different times, and which were intended to address different issues. You need to have a good understanding of these three rule provisions as a foundational matter. I’ll go into a little bit of detail.
First, of course, information that is relevant to the claims and defenses of a case is subject to discovery and must be preserved, no matter what the source, so long as getting the information is “proportional” to the needs of the case under Rule 26(b)(1).
And then you go to the second provision: sources that are not “reasonably accessible” because they can be accessed only with substantial burden and cost may be excepted under Rule 26(b)(2)(B).
And finally, information that is “unreasonably cumulative or duplicative” is excepted under Rule 26(b)(2)(C). Now, whether information from archive sources is subject to discovery depends on consideration of all three of these rule provisions. And so, other than the black letter law, you need to know some of the background for these rules. So, under Rule 26(b)(1), courts since the 1970s have construed the definition of relevance very expansively. And so, when they apply it, it’s very liberal and they err on the side of admitting information to discovery.
Enter ESI: The Journey to Proportionality
In the year 2000, the rules committees began to consider the impact of Electronically Stored Information (ESI). They made a fundamental, far-reaching conclusion that the principles that apply to paper discovery apply equally to ESI discovery, and more about that later, because that is a key decision point. Over the next few years, the volume of ESI increased exponentially. In 2006, concerns were specifically raised about the burdens and costs associated with inaccessible data sources. Now, at that time, what we were looking at were the enormous costs of accessing deleted emails and backup tapes whose sole business purpose was to provide disaster recovery. So, the committee promulgated Rule 26(b)(2)(B) to address inaccessible sources.
The key question here is: why did the committee look at these two sources?
Well, they required different and special steps to access and process the deleted information. The information was fragmented on a disk, and backup tapes had their own set of issues. So, it required a set of steps that was different from your daily business purpose. And there were substantial costs and burdens involved. The key point is substantial costs and burdens. That’s what makes it inaccessible from a legal standpoint. ESI kept increasing, so that in 2015 the rules committees took up the notion of proportionality, which had been a limitation on discovery, and actually brought it into the definition of discovery.
Information is discoverable if it’s relevant to the claims and defenses and proportional to the needs of the case. You must take into account six factors, which are listed in the rule. Costs and benefits are a key factor, but here I do want to mention that one of the factors, an important factor for your consideration, is the interests that are at stake in the case.
As an example, constitutional issues may be very critical in a case, and they may trump any of the other factors. This is important in employment discrimination cases where constitutional issues are alleged, and this argument is going to be made. So, that’s a very quick summary and the background of these rules and how they have to be applied to archived data sources.
Assessing the ‘Burden’ – Legal, Tech and Other Factors
Speaking of backup tapes, the data source that we’re investigating: how is the intended use vs. the actual manner in which it is used considered under the accessibility rule? What does this look like when responding to a routine discovery matter?
In practice, and this follows directly from John’s comment, we would routinely respond to discovery or start projects that involved the interrogation of backup tapes or archives with a statement – that first clarification – that the backup tapes and archives are for disaster recovery only. That was part of a larger response, which would lead to a statement that we are not accessing backup data. We’re not going to be including data that’s on backup tapes in discovery responses. We routinely gave that response, but not in every case. There were some cases where we were forced to consider data in backup tapes and preserve that data.
Around the late 2000s, you had to buy new backup tapes and they were expensive. They were $1,000 per tape at one point. So, trying to preserve continuously growing data volumes on backup tapes became untenable. And John’s experience on the rules committee really reflects what we were seeing in practice – trying to handle requests for backup tapes in a typical discovery matter.
It’s really not the intent. The intent is almost irrelevant here. It’s the effect. If you’re looking at these backup sources, what is the actual burden? If there are extra steps taken, special steps to get the stuff that incur additional burdens and costs, that’s what you’d be looking at from a Rule 26 perspective.
Adam, Bill and John: Accessing this data may lead to potentially relevant materials, but how burdensome does it become? We’re going to investigate this a little bit further when we get into the technology, but does the value of these online repositories, which potentially hold relevant material, have any sway when it comes to whether or not we have to search them?
It certainly does. There’s a distinction between accessibility from a technology standpoint, and accessibility as a lawyer understands that term.
Accessibility under the rules implies the technological ability to get to the data, access it, pull it and use it. In the legal sense, it also considers the downstream burdens. If you can access two petabytes, three petabytes, terabytes of data, you’re basically opening the door to an Iron Mountain facility.
That doesn’t necessarily mean it’s discoverable, particularly if the burden of trying to find relevant data in that data source outweighs the potential relevance of the data. Technical accessibility is the start of the conversation; it’s really not the final answer to the question.
Cloud Backups: Likely Data Sources and Data Types
Adam, let’s talk about some of the sources of data that we’ve seen in a live scenario. With the availability of online archiving solutions, where are we finding data in the cloud?
Some of those systems are Barracuda, Mimecast, Veritas, Smarsh, and Dropsuite. And that’s just to name several. There are many out there, and it seems like there are more popping up every month with new systems and new requests to pull data out of them.
What kind of data are held in these platforms? Are we seeing live backups, where the email system could be Office 365 and data is only kept there for a short time period? Do these systems become redundant backup sources and not disaster recovery sources, so they’re essentially live access backups?
Correct. And yes, the types of data we’re seeing there are typically email and chat data. We’re seeing Skype, text, Teams, Slack data and the list goes on and on with all those chat apps. We’re also seeing, depending on the system, laptop and desktop images stored there. Email and chat data are typically backed up. As you mentioned before, there could be a 60-day retention policy in place for Office 365 data, so email data would be backed up in a system like Barracuda for archiving purposes.
Where are we seeing challenges in getting information from those systems from a tech and legal standpoint? Are we seeing them due to the legal constraints, or are we seeing a technological issue? Adam, I’ll let you answer with the tech point of view, because I’m going to go ask Bill and John about the legal one.
As Bill mentioned earlier, from a technical standpoint, the data is accessible. As long as we have a login to that system, we can run searches and we can export data. With a system like Office 365, we’re able to export one PST for someone’s mailbox. With systems like Barracuda, Dropsuite and Mimecast, there are significant limitations on exporting any data. Instead of one PST being exported, we have to export 50 to 100 zip files, broken up into 300-megabyte pieces, and we then have to extract each PST out of all of those files. It makes the exporting take a lot longer. And then there are some challenges down the road from there.
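As a rough illustration of the extraction step described above, the following Python sketch walks a folder of exported zip files and copies every PST it finds into a single output folder. It is a minimal sketch under stated assumptions, not a description of how any particular archive vendor packages its exports or of the tooling CDS uses: the function name `extract_psts`, the folder layout, and the convention of prefixing each PST with its source zip name are all hypothetical.

```python
import shutil
import zipfile
from pathlib import Path
from typing import List


def extract_psts(export_dir: Path, out_dir: Path) -> List[Path]:
    """Copy every .pst member of every .zip in export_dir into out_dir.

    Hypothetical helper: assumes the export is a flat folder of zip files,
    each of which may contain one or more PST files.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for zip_path in sorted(export_dir.glob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.infolist():
                is_pst = member.filename.lower().endswith(".pst")
                if member.is_dir() or not is_pst:
                    continue
                # Prefix with the zip name so PSTs that share a file name
                # across different zip parts do not overwrite each other.
                target = out_dir / f"{zip_path.stem}__{Path(member.filename).name}"
                with archive.open(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                extracted.append(target)
    return extracted
```

At the figures mentioned in the discussion, 50 to 100 zip files of roughly 300 megabytes each works out to about 15 to 30 gigabytes to download and unpack for a single custodian, before any processing or review begins.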
John and Bill, does that meet the burden that I was talking about, when we’re exporting multiple limited-size zip files for one custodian? I suppose there’s a cost variable that has to be considered if they’re the only source. But if it’s being utilized solely for disaster recovery or as an actual backup system, and it’s not live, would that be something that we can put to the side at the outset?
Well, to find the answer to that question, you go to one of the provisions that I mentioned earlier. And that provision excepts unreasonably cumulative information. Chris, you mentioned the phrase redundant data sources. If it’s a redundant data source, then you tee up the issue for the court that this is cumulative. And then the issue is going to be, is it really cumulative? Is there really something different about this vs. that one? That’s what the issue is going to be as a legal matter. If it really is cumulative, if it is redundant, then of course that provision takes precedence and trumps the other ones. Then you don’t really have to go to the proportionality analysis because there is no benefit if it’s redundant.
Now to your point, Chris, those points are important to the court, and to outside counsel. They’re not immediately obvious, and not everybody has the technical knowledge that Adam has, or has somebody like Adam on staff who can say, “Well, to get to that data, you’re going to have to pull it out for one guy, and you’re going to have to pull out 50 files of 10 megabytes or whatever and create that PST,” and be able to communicate to the judge that this isn’t just a process where we push one easy button and we get all the data that’s in the archive for that custodian.
Those types of technology facts, technology considerations, are important. They’re relevant to the determination of how you’re going to proceed under the rule, and they help frame how you might make your argument, although they probably don’t give you the definitive answer to the question.
What are people doing in the marketplace? Are organizations utilizing these tools for simple backup and disaster recovery, or are they also utilizing them as active use systems, perhaps for compliance needs? These systems all advertise themselves as being able to handle that.
We’ve definitely seen both of those scenarios. I would say the majority do support those compliance needs. Recently, in my experience, we’ve had to go to Barracuda to pull active mailboxes for litigation or internal matters. The majority is compliance, but we definitely see disaster recovery systems as well.
Click here to read Part II: Technology, Tools and Techniques for Mining Cloud Archives.