In our recent webinar, Crate Digging: Finding Relevant Materials in a Universe of Accessible Cloud Backup, experts discussed how the growing accessibility of cloud storage is driving new interpretations of proportionality.
At the heart of their discussion: The GW Proportionality Initiative, a collaboration between judges, eDiscovery technologists, and legal experts (including CDS’ own William Wallace Belt, Jr.) to develop a New Proportionality Framework that provides structure to the proportionality analysis. The Initiative and the resulting work product aim to shed light on the technological questions that underscore eDiscovery costs and burdens.
Chris O’Connor, Director of eDiscovery Strategy, CDS
William Belt, Managing Director, CDS
Lindsey Lanier, Product Management Director, VerQu, A Relativity Company
Pete Lwin, Senior Project Engineer, CDS
John Rabiej, Partnering with GW Humphreys Complex Litigation Center
Adam Rogers, Senior Forensic Analyst, CDS
Creating a Common Vocabulary and Agreed-Upon Approach
Let’s discuss the work you’ve been doing with the GW Proportionality initiative. John, I’ll let you introduce what the team has been about and what you guys put together. You recently had a successful conference on this issue. What is it that we’re getting to with this initiative?
I appreciate the opportunity to talk about it. We’ve been working on this project for over a year. We’ve had 55 experts and lawyers work on this project. It’s really a simple idea.
The problem from our perspective was that when lawyers made the arguments before the judge on cost proportionality, the arguments were very shallow and superficial. They would come in on one side, saying it’s going to cost me $10 million and the other side would say that’s an exaggeration and it will only cost you $50,000. There was a wide gap between them. The judge didn’t have any additional information and typically split the difference.
Here, we’re trying to address the lack of information. What we’ve tried to do is present the key information to a judge so that the judge can make a better-informed decision. And by doing that, it will help the parties negotiate amongst themselves what is reasonable, what is not reasonable, what is proportional to the needs of the case.
We’ve done what a lot of you do already; it just puts it in an organized format and uses a standard vocabulary. That sounds very simple, but right now the language is all over the place, and if people agreed on the language and agreed on the approach, that would make all the difference in the world. The New Framework asks you to prioritize your custodians: who’s the most important person you want to get information from, and who’s the least important? It also ranks the burden of getting the information from the data sources. We identify seven data sources: email, file shares, social media, mobile devices, computers, et cetera. Then we plot them on a heat map. Important custodians with inexpensive data sources, such as email, land in one quadrant of the heat map, and they’re the ones you target initially.
Calculating the Costs of Accessibility
In the top-left corner, priority relevance is high. What you want to key in on are the custodians plotted in that upper left-hand corner. Custodians plotted in the lower right-hand corner are your least important custodians and the most expensive. You want to get this in front of a judge and in front of lawyers. Who are they going to target first? You target the ones that are least expensive and most important. Once you have that information, you logically start going down the line, then you start sampling and figuring out what is more important and what is not. So that was one key part of the New Framework.
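The quadrant triage described above can be sketched in a few lines of Python. This is a minimal illustration, not the Initiative's tool: the 1–10 scoring scale, the cutoff, and the quadrant labels are all assumptions made for the example.

```python
# Hypothetical heat-map triage: classify each custodian/data-source
# pair by relevance priority and collection burden (1-10 scales assumed).

def quadrant(priority, burden, cutoff=5):
    """Place a custodian/source pair in one of four heat-map quadrants."""
    high_priority = priority >= cutoff
    low_burden = burden < cutoff
    if high_priority and low_burden:
        return "target first"      # upper-left: important and inexpensive
    if high_priority:
        return "sample next"       # important but expensive
    if low_burden:
        return "low value, cheap"
    return "defer"                 # lower-right: least important, most expensive

# Example custodian/source pairs (invented for illustration).
custodians = [
    ("CEO email", 9, 2),
    ("VP file share", 7, 7),
    ("Intern mobile device", 2, 8),
]
for name, priority, burden in custodians:
    print(name, "->", quadrant(priority, burden))
```

In practice the scores would come from the prioritization and burden-ranking exercise the Framework asks the parties to do; the code only shows how the two rankings combine into the quadrants a judge would see on the heat map.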
The other accompanying aspect of the New Framework is a cost database table that actually comes up with a per gigabyte cost for each of these data sources.
We came up with a number based on the collective experience of our experts. For email, it comes out to about $7,000 per gigabyte, and that includes all the costs, from collection through processing and review. How did we come up with that? We tell you in this cost calculator; it’s quite a detailed estimator. It makes a number of assumptions that seem to represent what the experience out there has been.
The big caveat is that obviously in individual circumstances things will vary, companies have different ways of handling data, some of them will have theirs in-house which may cut out some of these steps, but what we’re trying to do is provide a reference point which is missing right now.
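The all-in per-gigabyte figure and the caveat above can be illustrated with a small sketch. The component rates below are placeholders chosen to sum to the panel's $7,000/GB email figure; they are not the Initiative's published assumptions, and the override mechanism mirrors the idea that individual companies can substitute their own numbers.

```python
# Illustrative roll-up of an all-in per-gigabyte discovery cost.
# Rates are hypothetical placeholders, not the calculator's real values.

DEFAULT_RATES = {       # $/GB per phase (assumed breakdown)
    "collection": 500.0,
    "processing": 1500.0,
    "review": 5000.0,
}

def all_in_cost_per_gb(overrides=None):
    """Sum the phase rates, letting a party plug in its own numbers."""
    rates = dict(DEFAULT_RATES)
    if overrides:
        rates.update(overrides)
    return sum(rates.values())

def source_estimate(gigabytes, overrides=None):
    """Total estimated cost for one data source of a given size."""
    return gigabytes * all_in_cost_per_gb(overrides)

print(all_in_cost_per_gb())                      # 7000.0 with placeholder rates
print(source_estimate(12))                       # 84000.0 for a 12 GB mailbox
print(all_in_cost_per_gb({"review": 4000.0}))    # 6000.0 with in-house review
```

The point of the override is exactly the caveat John raises: a company that handles review in-house can cut a step out and defend a lower number, but everyone starts from the same reference point.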
A judge, when he gets these arguments from lawyers, has no idea. And a lot of lawyers have no idea what the general costs should be in getting information from a cell phone versus an email. What we’re trying to do is to provide information that better informs everyone.
We’re not making the decision as to what is proportional, what is not proportional. It’s just to provide you information on a granular basis that you can challenge, or that you can use to defend your argument. It’s going to help the judge and it’s going to help the lawyers.
Who Benefits From Transparency
The defense bar will largely embrace this, I assume. Plaintiffs may say, “Well, there’s a lot of overestimation here. You’re inflating the costs, and they’re unreasonable. We don’t think it’s going to cost that much.” And as Bill mentioned earlier, some of them may come back later when their wish is granted and say, “Well, somebody dumped a mountain of data on me, and I think it’s unfair.”
Ultimately, the goal here, as you said, is to get pricing in front of the bench so they understand what the approximate costs are and they can make better decisions. I don’t think anyone fights the idea of information. Is there a way for this evaluation process to include the plaintiff’s bar?
Well, actually this will benefit the plaintiffs, particularly inexperienced plaintiff lawyers. The numbers that we come up with – $7,000 per gigabyte, for instance, for email – are not that high, particularly compared to a study from about 10 years ago (so it’s a bit dated) where estimates ranged from $5,000 per gigabyte to $900,000 per gigabyte. So, this actually helps them.
What the plaintiffs are really concerned about is that this approach overemphasizes the cost-benefit factor, as opposed to the interests at stake. In employment discrimination cases, they believe the size of the award shouldn’t matter when a constitutional right is being defended: even if they’re only arguing over a $10,000 award, they want $100,000 worth of discovery because a constitutional right is involved. That’s the argument they’re going to be making, and the Framework doesn’t address it. We do not comment because it’s too hard to judge what’s important and how much weight to give to a constitutional matter.
The Framework is supposed to be generalized, right?
As an attorney, I’m not going to say, “This has got to fit my case right out of the box.” I would plug in my actual numbers and get some benefit out of it. Then I can submit it to the court.
Yeah, Bill’s been working on this cost calculator now for the last three months, so he’s the expert on this.
Developing the Formula
What we’ve done is we’ve started with a collection and gone all the way through pre-processing, processing, then we’ve included some allowances for analytics. But essentially you start with your custodians and your gigabyte volume.
The important point is that there is the ability to put in your own number. We have the people that are putting this calculator together, we’re looking at what’s published on the internet, we’re looking at our own personal experiences, coming up with these costs and sharing it.
There is the hourly cost of a privilege review, the per-gigabyte charge for pre-processing and processing on an in-and-out model, and collections on an hourly model or a per-data-source model.
We’ve got a lot of rows in there where data can be inserted by the person using the calculator. But to my mind, the most important part of the calculator is laying out that process and associating it to cost numbers so that you can walk through a discovery project and map out essentially some estimates that are based on real facts as to the cost burden of a new discovery project. And you can help the court make real decisions on whether or not that’s proportionate or disproportionate.
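The mix of billing models Bill describes, per-gigabyte processing, hourly privilege review, and per-source collection, can be sketched as a line-item estimate. All rates below are invented placeholders; the real calculator's rows and defaults may differ.

```python
# Hypothetical line-item estimate mixing the billing models mentioned
# above. Every rate is an assumed placeholder, overridable by the user.

def estimate(gb, review_hours, sources,
             processing_per_gb=1500.0,      # $/GB, in-and-out model (assumed)
             review_per_hour=350.0,         # $/hr privilege review (assumed)
             collection_per_source=750.0):  # $ per data source (assumed)
    """Return the line items and the bottom-line total for one project."""
    line_items = {
        "processing": gb * processing_per_gb,
        "privilege_review": review_hours * review_per_hour,
        "collection": sources * collection_per_source,
    }
    return line_items, sum(line_items.values())

items, total = estimate(gb=10, review_hours=40, sources=3)
for name, cost in items.items():
    print(f"{name}: ${cost:,.0f}")
print(f"total: ${total:,.0f}")    # $31,250 with these placeholder inputs
```

Walking a court through line items like these, rather than a single bottom-line number, is the transparency the calculator is aiming for: each row can be challenged or defended on its own.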
Got it. Now judges are no longer faced with “this just seems like it costs a lot.” Essentially what they had before is, “This is going to cost $900,000 and the case is worth $200,000, what am I doing here?” And while we are decreasing costs with cloud and other new technology and best practices, the relative volume of data has gone up exponentially, especially this last year during the pandemic.
This type of information being available to the judges has a big impact. I know some of the former magistrates who were on a conference last week talking about this, and I think that for practitioners, just understanding that this is even available is a big step forward.
I know you haven’t finalized the calculator. (We’ll provide the website link so you can check out and monitor their progress.) But if you don’t know about this, this is also something that can be used against you. If you’re unaware of this calculator, you should be taking a look just from a practitioner’s standpoint, lest somebody show up at court and say, “Well, your honor, here’s the calculator, and look at what it says. It’s totally reasonable. I asked for this stuff.”
The idea is to make the calculator publicly available so that everybody can access it, and we’re all starting on the same footing in making these arguments. We typically find the biggest problems happen when matters move too far down the road without a discussion of the critical facts. We get brought in all the time, and then people like Adam and Pete are called in on a project that should have been finished three months ago, where the discovery deadline and trial date are coming up and we need to take another look.
The vision that we’re working on right now is to come up with a two pager that we’ll be sending to all the courts and judges that has this heat map and this worksheet right behind it. You just put in the results of the other worksheet which takes the numbers that you come up with in this cost calculator, the bottom line figures, for each data source and for each custodian. Then you plot it on the heat map which will make it convenient.
And then we’ll have as a reference, if your cost is not typical or average, the suggested per gigabyte cost. That means the lawyer will have to explain to the court why it’s very different from the suggested number. At least that’s the thought behind it. If they’re asking for something that’s five times as much, there may be very good reasons for it. There may be very good reasons for why something costs much more to access, but it can be challenged by the other side, and at least it gives something more for the judge and the other party to chew on.
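The benchmark check John describes, flagging costs that diverge sharply from the suggested per-gigabyte number so the lawyer has to explain why, can be sketched simply. The five-times threshold is taken from his example; treating deviation as symmetric (far above or far below the reference) is an assumption.

```python
# Sketch of the reference-number check: flag a claimed per-GB cost that
# diverges from the suggested benchmark by more than a chosen factor.

def needs_explanation(claimed_per_gb, suggested_per_gb, factor=5.0):
    """True when the claimed cost is `factor`x above or below the reference."""
    ratio = claimed_per_gb / suggested_per_gb
    return ratio >= factor or ratio <= 1.0 / factor

print(needs_explanation(35000, 7000))   # True: five times the suggested email cost
print(needs_explanation(8000, 7000))    # False: close to the benchmark
```

There may be very good reasons a flagged number is right; the flag only tells the court and the other side where to direct their questions.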
When Archives Move to the Cloud . . .
My last question, to all the panelists: Name one thing you think will change as a result of archives going to cloud.
Well, I think that the proportionality analysis Framework will come right to the forefront, because that analysis requires you to apply the six factors to distinguish the archive sources from the active ones.
There will be a closer and harder look at trying to distinguish between something that is cumulative and non-cumulative, or reasonably cumulative. Is it marginally important? Is it unique? The courts are struggling with what kind of standard to apply right now, and they seem to be using the language “materially unique” to distinguish information that is unreasonably cumulative.
I think that the message here is pretty clear. Costs will be easier to capture with data going to the cloud, and even more so with next-generation technology. For customers with their data still sitting on legacy hardware and legacy systems, it’s still going to be difficult and still going to be a cost burden. But as the data landscape changes and companies continue moving their systems of record to the cloud, it gives Relativity a huge opportunity to continue to meet teams where their real needs are.
Following on that, I think the nature of discovery requests is going to start including these online systems. It used to be, “Hey, it’s on a backup tape, it’s inaccessible, we’re not going to use it”; now it is accessible to the technician. Then it’s going to be on the attorneys to argue whether we need to collect it, or whether it’s overly burdensome. But from a technician’s standpoint, if the data is in the cloud, it is accessible.
Pete, do you think that these changes are going to include improvements for some of the issues that you encountered over time with data coming out of archive systems?
Definitely. If everything moves to the cloud, for us, it means processing much more reliable data. It’s going to be less time-consuming, less costly, and everything else.
This is going to really accelerate the need for technology expertise in the legal world. I think we’ve already seen… I was a litigator. I’ve tried cases for 20 years now. I don’t work in a law firm, I work at a technology company, and I love it, by the way.
I think there’s going to be more and more need for technology expertise to make operational decisions on how we’re handling our cases. And the other aspect of it is that Lindsey, Pete, and Adam each have their own areas of technology expertise.
I think you’ll see technology expertise become more and more sectioned off into specialized areas, and I think that’s going to only increase the need for technology expertise and the awareness of who you need to ask and what you need to ask for when advising clients to make decisions in a case.
Final Thought: Cybersecurity and the Cloud
One last question: What are the cybersecurity implications of the cloud versus in-house storage solutions? Are they much greater, the same or different?
I like to put it to people this way sometimes. How many cybersecurity consultants work at your company? How many work at Microsoft or Amazon? If those numbers are not equal, then I think you need to be having conversations about the cloud. Storage costs are much lower, and the speed, reliability, and ease of access are far better than on so many on-premises systems. I think we’ve moved past the point, certainly in the last year, of relying on security through obscurity; you have to rely on a robust security platform. But Adam, Lindsey, you guys are the experts, go ahead.
Definitely, Chris. When I think of that backup tape that’s stored in Iron Mountain, no one’s going to be accessing that, that’s for sure. When you have an online system, there’s always a chance that someone can get in.
Yeah, for sure. Relativity is a security-first company. We could probably do an entire webinar on all the industry best practices in place to lock down our platform. We take it very seriously.