The need to run complex and constantly evolving search queries is a reality in some eDiscovery matters. The challenge is to determine the right search terms that will find relevant documents as effectively and accurately as possible. When dealing with large amounts of data, small changes in search terms can lead to substantial changes in the size of the review population. Inaccurate search terms can also result in reviewing the incorrect documents or missing relevant data. While there is no perfect process and the parties may need to go beyond using search terms to find relevant documents, implementing a few tips and tricks when using search terms can make a big difference in the time and money spent on review.
Parsing search terms. Parsing long search strings into more reasonable terms provides a couple of benefits. With individual terms or short strings, it is much easier to see what parts of a search term are driving the hit count. This helps determine what terms can be adjusted to reduce the population of data to be reviewed, and what terms may be overbroad. It may also be necessary to parse terms to fit under Relativity’s 450-character limit for each search term.
For example, a long string of terms combined with “OR” operators can be broken up into individual terms. The term ((apple OR orange OR grape) w/3 juice) AND breakfast can be parsed into:
- (apple w/3 juice) AND breakfast
- (orange w/3 juice) AND breakfast
- (grape w/3 juice) AND breakfast
As the terms get longer, each part of the term would be parsed according to the parentheses and operators involved.
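For a simple case like the one above, the parsing can be sketched in a few lines of Python. This is only an illustration: the function name `expand_or_group` and the `{term}` placeholder are made up for this example, and it handles a single OR group rather than arbitrarily nested terms. It also checks each parsed term against the 450-character limit mentioned earlier.

```python
MAX_TERM_LENGTH = 450  # per-term character limit cited in this article


def expand_or_group(alternatives, template):
    """Return one search term per alternative by filling in the template.

    `template` must contain a `{term}` placeholder where each alternative
    from the original OR group should be substituted.
    """
    return [template.format(term=alt) for alt in alternatives]


# Parse ((apple OR orange OR grape) w/3 juice) AND breakfast
terms = expand_or_group(
    ["apple", "orange", "grape"],
    "({term} w/3 juice) AND breakfast",
)

# Collect any parsed terms that are still over the limit
too_long = [t for t in terms if len(t) > MAX_TERM_LENGTH]

for t in terms:
    print(t)
```

Running each parsed term separately makes it possible to see which fruit is driving the hit count.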
Noise words. Relativity has standard noise words in the dtSearch index, which are words that are not indexed by default. It is extremely important to check search terms for anything on the noise word list. It may be necessary to adjust the existing index or create a new index in order to achieve accurate results.
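A quick pre-check along these lines can catch noise words before the terms are run. The sketch below is a rough illustration, not Relativity functionality: `SAMPLE_NOISE_WORDS` is a small made-up sample rather than the actual list, so substitute the noise word list from your own index. It also assumes that uppercase AND, OR and NOT and tokens like w/3 or pre/3 are operators rather than searchable words.

```python
import re

# Illustrative sample only; replace with the noise word list from your index.
SAMPLE_NOISE_WORDS = {"a", "an", "and", "of", "the", "to", "will"}

# Assumption: connectors are written in uppercase in the search terms.
CONNECTORS = {"AND", "OR", "NOT"}

# Assumption: proximity operators look like w/3 or pre/3.
PROXIMITY = re.compile(r"^(w|pre)/\d+$", re.IGNORECASE)


def find_noise_words(term, noise_words=SAMPLE_NOISE_WORDS):
    """Return the noise words found in a search term, in order of appearance."""
    tokens = re.findall(r"[A-Za-z0-9]+(?:/\d+)?", term)
    flagged = []
    for token in tokens:
        if token in CONNECTORS or PROXIMITY.match(token):
            continue  # skip operators, they are not searched as words
        word = token.lower()
        if word in noise_words and word not in flagged:
            flagged.append(word)
    return flagged


print(find_noise_words('"letter of intent" AND (will w/3 sign)'))
```

Any term that comes back with flagged words deserves a closer look before it is run against the index.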
Search term logic. Verifying that terms are running correctly is a critical part of the work! Be sure to sample complex terms and ensure proximities (w/3), regular expressions or other searching tools are working as expected. When running variations on terms with small changes, it can be helpful to compare the results to ensure the changes are adding or removing the documents that you would expect.
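One simple way to compare two versions of a term is to export the document identifiers that each version hits and look at the difference. The sketch below assumes you already have those identifiers as lists; the function name `compare_hits` and the sample control numbers are hypothetical.

```python
def compare_hits(before_ids, after_ids):
    """Compare the documents hit by two versions of a search term."""
    before, after = set(before_ids), set(after_ids)
    return {
        "added": sorted(after - before),      # hit only by the revised term
        "removed": sorted(before - after),    # hit only by the original term
        "unchanged": len(before & after),     # hit by both versions
    }


# Hypothetical control numbers for illustration
original = ["DOC001", "DOC002", "DOC003"]
revised = ["DOC002", "DOC003", "DOC004"]

result = compare_hits(original, revised)
print(result)
```

Sampling a few documents from the "added" and "removed" groups is a quick way to confirm the change did what you expected.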
Quality control. When the search terms report is completed, the work is not finished! QCing the results is an important part of the process. Terms resulting in errors are obvious issues, but search terms with very high or zero hit counts are another red flag. If a term is very broad, a high hit count may be expected, but it could also be the result of a noise word issue. Terms with zero hits may require a second look to ensure the logic accurately captures what the term is intended to find.
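This kind of QC pass can be partly automated once the report is exported. The sketch below assumes the report has been reduced to a mapping of term to hit count, with `None` standing in for a term that errored. The function name, the threshold and the sample numbers are all made up for illustration; the right threshold depends on the size of your data set.

```python
def flag_terms(report, high_hit_threshold):
    """Return a dict of terms that need a second look and the reason why."""
    flags = {}
    for term, hits in report.items():
        if hits is None:
            flags[term] = "error"
        elif hits == 0:
            flags[term] = "zero hits"
        elif hits > high_hit_threshold:
            flags[term] = "high hit count"
    return flags


# Hypothetical report data for illustration
sample_report = {
    "(apple w/3 juice) AND breakfast": 120,
    "(orange w/3 juice) AND breakfast": 0,
    "juice": 48000,
    "(grape w/3 juice AND breakfast": None,  # unbalanced parenthesis, errored
}

flags = flag_terms(sample_report, high_hit_threshold=10000)
for term, reason in flags.items():
    print(f"{reason}: {term}")
```

A flag is only a prompt to review the term. A broad term may legitimately have a high hit count, and a zero may simply mean the data is not there.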
Effective use of search queries can significantly reduce review time. As a result, best practice is to consult an experienced eDiscovery service provider who can offer guidance and technology tools to maximize your searches.