In earlier posts, we examined how Microsoft 365’s Purview eDiscovery platform handles hyperlinked documents. In this article, we turn to its primary competitor, Google Workspace, to explore its capabilities and limitations in exporting hyperlinked files for eDiscovery and investigation response. Both platforms support locating and exporting linked content, but with hyperlinks, the story is always more complex. Here, we’ll highlight the unique challenges and key considerations for managing Google hyperlinked files in eDiscovery.
How Hyperlinks Work in Google Workspace
To start, it’s useful to outline how hyperlinks are handled in Google Workspace. Google’s data storage framework is considerably different from Microsoft’s, so there are significant differences in where and how file data is stored. Google’s object-oriented and distributed content storage framework allows for easy sharing of links between users. While the content of a given file lives in storage blocks distributed within Google’s storage network, each file has a unique ID that is used to create a shareable hyperlink. These hyperlinks are our primary focus when looking at preservation and collection methods.
However, unlike M365 content, hyperlinked files are typically associated only with the Google Drive of the file’s owner and are not stored in separate group or channel locations, as Microsoft Teams attachments are. This is true regardless of how the hyperlinked file originates, e.g., Gmail, Google Chat, or embedded within another Google document. Below is a brief look at examples of how hyperlinked attachments can originate in Google Workspace:
- Gmail: hyperlinks to Google Drive documents and Drive folders embedded within custodian and group email messages.
- Google Chat: hyperlinks sent via chat messages in Google Chat.
- Google Docs/Sheets/Slides: hyperlinks to Google Drive documents embedded in other documents as text or objects.
- Google Sites: hyperlinks to Google Drive documents embedded in Google Site pages and content.
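Whatever the source, each of these links carries the target's unique ID inside the URL, which is what makes later matching possible. The sketch below is a minimal illustration of pulling that ID out of a link and telling file links apart from folder links. It assumes a handful of commonly seen Drive URL shapes; the function name parse_drive_link and the pattern list are our own illustrations, not part of any Google tool, and the list is not exhaustive.

```python
import re
from typing import Optional, Tuple

# Commonly seen Drive URL shapes (an assumption; not an exhaustive list).
_PATTERNS = [
    ("folder", re.compile(r"drive\.google\.com/drive/(?:u/\d+/)?folders/([A-Za-z0-9_-]+)")),
    ("file", re.compile(r"drive\.google\.com/file/d/([A-Za-z0-9_-]+)")),
    ("file", re.compile(r"docs\.google\.com/(?:document|spreadsheets|presentation)/d/([A-Za-z0-9_-]+)")),
    ("file", re.compile(r"drive\.google\.com/open\?id=([A-Za-z0-9_-]+)")),
]


def parse_drive_link(url: str) -> Optional[Tuple[str, str]]:
    """Return (kind, drive_id) for a recognised Drive URL, else None.

    kind is "file" or "folder"; drive_id is the unique ID embedded in the URL.
    """
    for kind, pattern in _PATTERNS:
        match = pattern.search(url)
        if match:
            return kind, match.group(1)
    return None
```

Separating folder links from file links early is a deliberate choice here, since a single folder link can stand in for a large number of documents.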
Collection Processes for Exporting Google Hyperlinked Attachments
How can compliance administrators and eDiscovery professionals manage the export and collection of identified Google content while also preserving these hyperlinks and retrieving their linked content? The short answer is that there are several methods available to capture Google data for eDiscovery purposes, both native to Google admins and via third-party tools designed specifically for the same purpose. In this blog, we focus primarily on using Google’s compliance platform, Google Vault, to search and export data while ensuring hyperlinked files are collected as well.
Assuming an organization has purchased Vault licensing for its user base, Vault’s administrative console allows for the creation of retention policies and legal holds, as well as the creation of case-specific matters that target the organization’s various data sources: Gmail, Drive, Groups, Chat, Calendar, and its most recent addition, Gemini. While its features as an information governance tool are not nearly as robust or extensive as Microsoft’s Purview platform, Vault offers much more streamlined, purpose-built functionality, designed specifically for the targeted export of an organization’s data.
For hyperlinked attachments, the simplified process starts with identifying data and data sources through searches and search/filter parameters. Once data has been identified and searched, Vault assumes by default that your exports will need to include what it calls “linked Drive files.” This single switch is toggled on by default, so be aware that turning this feature off will eliminate hyperlinked Drive files from your entire collection. Aside from that, hit the export button and you are off and running. And that’s it, right? Everything will be connected and packaged up nice and neat for discovery review once your search is complete? Not exactly.
Exports with Google Vault
Contextual Integrity
When a Vault export is complete, the data does indeed contain the elusive hyperlinked files we are so worried about. However, this is not the end of the story when it comes to making this data useful in an eDiscovery workflow. As with eDiscovery exports from M365 Purview, exported material from Google Workspace requires substantial processing before it is contextually intact, useful, and ready for review. In the Vault export, hyperlinked files are not embedded as traditional attachments but reside in a folder structure that is separated from their originating parent communications and files. The linking mechanisms (those “unique IDs” we referenced earlier) exist within the exported reference files that accompany the stored messages and file content. In Vault’s case, those files are typically exported as “metadata” and “drive-links” CSV files.
Using these files, compliance admins and legal service providers can re-establish the connections and family relationships required to present exported data sources in full context. This is a required step in any ESI workflow that processes and presents source data for upload to hosted applications designed for attorney document review and production.
This all may sound straightforward, but the complexities and critical processes involved in this step combine to make this the most important consideration when managing hyperlinked files in discovery response. It is why many organizations turn to outside service providers experienced in dealing with M365 and Google data. Without the ability to interrogate data exports and rebuild families, the export remains just a set of disconnected messages and files.
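To make the family-rebuilding step more concrete, here is a minimal sketch of joining a links file to a Drive metadata file on the shared document ID. The column names used here (parent_message_id, document_id, file_name) are hypothetical placeholders, not the actual headers in a Vault export, and would need to be mapped to whatever the exported “metadata” and “drive-links” CSV files contain. The function name build_families is likewise our own.

```python
import csv
import io
from collections import defaultdict


def build_families(links_csv, drive_metadata_csv):
    """Map each parent message ID to the exported Drive files it links to.

    Both arguments are open text file objects. Column names are HYPOTHETICAL
    placeholders; replace them with the headers in the real export files.
    """
    # Index the exported Drive files by their unique document ID.
    files_by_id = {
        row["document_id"]: row["file_name"]
        for row in csv.DictReader(drive_metadata_csv)
    }

    families = defaultdict(list)
    missing = []
    for row in csv.DictReader(links_csv):
        parent, doc_id = row["parent_message_id"], row["document_id"]
        if doc_id in files_by_id:
            families[parent].append(files_by_id[doc_id])
        else:
            # The link references a file that is not present in the export.
            missing.append((parent, doc_id))
    return dict(families), missing
```

Returning the unmatched links alongside the families is intentional: links that cannot be resolved to an exported file are usually worth documenting rather than silently dropping.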
Versions
Hyperlinked files, by design, can be edited after being shared as attachments. Compliance administrators must account for this risk of ‘version drift.’ As with M365 Purview, Google continues to evolve its own ability to export contemporaneous or ‘version as shared’ hyperlinked files, but at the moment, Vault has no way to provide that functionality to its user base. There are workarounds to isolate the ‘last version’ of a Drive document before it was shared as a link. However, these typically require more involved, separate workflows that leverage third-party collection utilities. For the moment, Vault users exporting Gmail and Chat links must work with exports that include and provide linking details for the latest version of a hyperlinked attachment.
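One simple way to surface potential version drift after an export is to compare when the parent message was sent with when the linked file was last modified. The sketch below assumes those two timestamps have already been pulled from the export metadata into dictionaries; the keys (parent_id, document_id, sent_at, modified_at) and the function name find_drifted_links are hypothetical. It only flags candidates for closer review and does not recover the version as shared.

```python
from datetime import datetime


def find_drifted_links(links):
    """Return links whose target was modified after the parent message was sent.

    Each item is a dict with HYPOTHETICAL keys: 'parent_id', 'document_id',
    'sent_at' and 'modified_at' (ISO 8601 strings with a UTC offset).
    """
    drifted = []
    for link in links:
        sent = datetime.fromisoformat(link["sent_at"])
        modified = datetime.fromisoformat(link["modified_at"])
        if modified > sent:
            # Keep the original fields and record how far the file has drifted.
            drifted.append({**link, "drift": modified - sent})
    return drifted
```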
Data Volume
It may seem obvious, but adding hyperlinked files as a collection target can significantly increase the size of your data collection, simply based on how often an organization’s users are sharing links. Unfortunately, there are no advanced analysis features available in Google Vault (or Microsoft Purview) that can provide compliance admins with detailed reporting on the volume impact of including hyperlinked files in a source collection. This considerable limitation prevents us from accurately estimating and understanding collection sizes early in the workflow, before running an export.
In addition, there are some collection tools that can be configured to automatically capture all versions of any Google hyperlinked document. Since hyperlinked documents can have widespread user permissions and many simultaneous editors, the number of available versions of the same stored document can be surprising, and ultimately present volume issues with any Google collection. Further, as Google is able to embed not just links to files, but links to entire Drive folders as well, make sure you understand the impact of using collection tools that include features designed to capture all data in linked folders. Users can store thousands of documents in a single Drive folder, and when they share folder links, the resulting collection can dramatically increase total data volume.
Key Takeaways
Our eDiscovery industry is clearly moving past the debate about whether there is even an option to identify and include hyperlinked attachments in a discovery response, at least as it relates to our main productivity apps. Whether it is from third-party tools or native app capabilities, hyperlink export functionality is readily available, and hyperlinked data can be readily pulled from both Microsoft 365 and Google Workspace. That does not mean it is easy, or that this is all there is to it. As with Microsoft Purview, exporting hyperlinked files from Google Workspace can be a multi-step process requiring careful workflows and detailed procedures, particularly when addressing the problem of keeping document families intact. There are still other considerations to address as well, beyond just the more popular questions about versions and version control.
However, with adequate preparation and planning and the support of service providers and expert partners with a thorough understanding of the capabilities (and limitations) of applications like Google Workspace, organizations and their eDiscovery teams can effectively collect and manage hyperlinked files in a manner that is defensible, review-ready and compliant with legal and discovery obligations.
For more information on how best to handle hyperlinked files in Google Workspace, contact CDS Advisory Services and CDS Digital Forensics at .

