Our Insights

Thought Leadership and Industry Trends

Tips for Improving Your eDiscovery Search Queries

Nov 20, 2019

Some eDiscovery matters require complex, constantly evolving search queries. The challenge is to identify search terms that find relevant documents as effectively and accurately as possible. With large data volumes, small changes in search terms can substantially change the size of the review population. Inaccurate search terms can also lead to reviewing the wrong documents or missing relevant data. There is no perfect process, and the parties may need to go beyond search terms to find relevant documents, but a few tips and tricks for working with search terms can make a big difference in the time and money spent on review.

Parsing search terms. Parsing long search strings into shorter, more manageable terms provides a couple of benefits. With individual terms or short strings, it is much easier to see which parts of a search term are driving the hit count. This helps determine which terms can be adjusted to reduce the population of data to be reviewed, and which terms may be overbroad. It may also be necessary to parse terms to fit under Relativity’s 450-character limit for each search term.

For example, a long string of terms combined with “OR” operators can be broken up into individual terms. Grouping the alternatives explicitly, ((apple OR orange OR grape) w/3 juice) AND breakfast can be parsed into:

(apple w/3 juice) AND breakfast
(orange w/3 juice) AND breakfast
(grape w/3 juice) AND breakfast

As terms get longer, each part of the term is parsed according to the parentheses and operators involved.
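The parsing step in the juice example above can be sketched in a few lines of Python. This is only an illustration, not a Relativity or dtSearch feature: the function names `split_or_group`, `parse_or_string` and `over_limit` are hypothetical, and the sketch assumes a simple term with one group of “OR” alternatives, one proximity operator and one required word. The 450-character figure is the limit mentioned above.

```python
import re

CHAR_LIMIT = 450  # per-term character limit mentioned in the article


def split_or_group(group: str) -> list[str]:
    """Split a string such as 'apple OR orange OR grape' into its alternatives."""
    parts = re.split(r"\s+OR\s+", group.strip())
    return [part.strip() for part in parts if part.strip()]


def parse_or_string(alternatives: str, proximity: str, anchor: str, required: str) -> list[str]:
    """Build one short term per alternative instead of one long OR string."""
    return [
        f"({alt} {proximity} {anchor}) AND {required}"
        for alt in split_or_group(alternatives)
    ]


def over_limit(term: str, limit: int = CHAR_LIMIT) -> bool:
    """Return True when a term is longer than the character limit."""
    return len(term) > limit


parsed = parse_or_string("apple OR orange OR grape", "w/3", "juice", "breakfast")
for term in parsed:
    print(term, "(too long)" if over_limit(term) else "")
```

Running each parsed term separately makes it possible to see which alternative is driving the hit count. Nested or more complex terms would need to be parsed by hand or with a proper query parser.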

Noise words. Relativity’s dtSearch index has a standard list of noise words, which are words that are not indexed by default. It is extremely important to check search terms for anything on the noise word list, because a term that relies on an unindexed word will not run as intended. It may be necessary to adjust the existing index or create a new index in order to achieve accurate results.
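A check like this can be scripted before terms are run. The sketch below is a minimal example under stated assumptions: `SAMPLE_NOISE_WORDS` is a short illustrative list, not the actual dtSearch default list, so it should be replaced with the list from the index being searched. The function name `find_noise_words` is hypothetical, and the sketch assumes boolean operators are written in uppercase and proximity operators look like w/3 or pre/3.

```python
import re

# Illustrative sample only (assumption); use the noise word list from your own index.
SAMPLE_NOISE_WORDS = {"a", "about", "after", "all", "and", "the", "of", "to", "in", "it"}

# Uppercase boolean operators are treated as syntax, not as search words.
OPERATORS = {"AND", "OR", "NOT"}

# Proximity operators such as w/3 or pre/5 are removed before tokenizing.
PROXIMITY = re.compile(r"\b(?:w|pre)/\d+", re.IGNORECASE)


def find_noise_words(term: str, noise_words: set[str] = SAMPLE_NOISE_WORDS) -> list[str]:
    """Return the noise words found in a search term, in order of first appearance."""
    cleaned = PROXIMITY.sub(" ", term)
    found: list[str] = []
    for token in re.findall(r"[A-Za-z']+", cleaned):
        if token in OPERATORS:
            continue
        word = token.lower()
        if word in noise_words and word not in found:
            found.append(word)
    return found


print(find_noise_words('"terms of the agreement" AND (merger w/5 after)'))
```

Any term that returns a non-empty list deserves a closer look before the search terms report is run.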

Search term logic. Verifying that terms are running correctly is a critical part of the work! Be sure to sample complex terms and ensure proximities (w/3), regular expressions or other searching tools are working as expected. When running variations on terms with small changes, it can be helpful to compare the results to ensure the changes are adding or removing the documents that you would expect.
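Comparing two variations of a term comes down to comparing the sets of documents each one returns. The sketch below assumes the document identifiers for each run have already been exported (for example, from a saved search). The function name `compare_hits` and the sample identifiers are hypothetical.

```python
def compare_hits(before: set[str], after: set[str]) -> dict[str, set[str]]:
    """Show which documents a change to a term added, removed, or left alone."""
    return {
        "added": after - before,
        "removed": before - after,
        "unchanged": before & after,
    }


# Hypothetical document IDs returned by the original term and by a tightened version.
original_run = {"DOC-001", "DOC-002", "DOC-003", "DOC-004"}
revised_run = {"DOC-002", "DOC-003", "DOC-005"}

diff = compare_hits(original_run, revised_run)
for label, docs in diff.items():
    print(label, sorted(docs))
```

If a change that was meant only to narrow a term shows anything under "added", the logic of the revised term is worth sampling.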

Quality control. When the search terms report is completed, the work is not finished! QCing the results is an important part of the process. Terms that result in errors are obvious issues, but terms with very high or zero hit counts are also red flags. If a term is very broad, a high hit count may be expected, but it could also be the result of a noise word issue. Terms with zero hits may require a second look to ensure the logic accurately captures what the search is intended to find.
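The red flags described above can be checked mechanically once the report is exported. This is a rough sketch, not a Relativity feature: the `TermResult` structure, the `qc_flags` function and the sample figures are hypothetical, and the 25% threshold for a "very high" hit count is an arbitrary assumption that should be tuned to the matter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TermResult:
    term: str
    hits: int
    error: Optional[str] = None  # error message if the term failed to run


def qc_flags(results: list[TermResult], total_docs: int, high_share: float = 0.25) -> dict[str, str]:
    """Flag terms that errored, returned nothing, or hit a large share of the population."""
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    flags: dict[str, str] = {}
    for result in results:
        if result.error:
            flags[result.term] = "error"
        elif result.hits == 0:
            flags[result.term] = "zero hits"
        elif result.hits / total_docs >= high_share:
            flags[result.term] = "high hit count"
    return flags


# Hypothetical report rows for a population of 10,000 documents.
report = [
    TermResult("(apple w/3 juice) AND breakfast", 150),
    TermResult("juice", 4200),
    TermResult("(grape w/3 juice) AND breakfast", 0),
    TermResult("(orange w/3 juice AND breakfast", 0, error="unbalanced parentheses"),
]

flags = qc_flags(report, total_docs=10_000)
for term, reason in flags.items():
    print(f"{reason}: {term}")
```

A flag is only a prompt for review. A broad term may legitimately have a high hit count, and a zero-hit term may simply have no responsive documents.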

Effective use of search queries can significantly reduce review time. As a result, best practice is to consult an experienced eDiscovery service provider who can offer guidance and technology tools to maximize your searches.

Contact the CDS Advisory Services team to discuss how we can improve the results of your next eDiscovery project.

For more tips to help cull your data, see Best Practices for Culling Your Data to Save Time and Money.

About the Author

Brian Zimmermann, Esq.

As a Client Director, Brian Zimmermann manages workflows and provides high-level expertise to his clients. He has over 7 years of eDiscovery and legal experience and has consulted on multiple large matters for a variety of clients, including Am Law 100 firms, global financial services organizations and multi-national pharmaceutical companies. In his time with CDS, he has managed challenging eDiscovery matters including second requests, significant multi-district litigation and large class action cases. Brian is a licensed attorney in the state of Illinois.
