Three Ways Technology Relieves the Pressure from Expanding Whistleblower Regulations

September 5, 2024

Leveraging AI and Analytics to Streamline Compliance Investigations

On August 1, 2024, the US Department of Justice (DOJ) launched the Corporate Whistleblower Awards Pilot Program to incentivize whistleblowers to report corporate misconduct and, in turn, push organizations to strengthen their internal compliance programs and controls. The DOJ also announced an amendment to the Corporate Enforcement and Voluntary Self-Disclosure (VSD) Policy: a company that receives a whistleblower report must complete its investigation and self-report to the DOJ within 120 days or lose eligibility for the protections of the VSD Program.

This development follows a global trend of increased whistleblower regulations, including the EU Whistleblower Directive and similar regimes in Japan and Australia. The EU Whistleblower Directive creates particularly difficult burdens for compliance professionals, as a whistleblower complaint must be investigated and reported back to the complainant within just three months.

Completing a compliance investigation in a few months is challenging in the best of circumstances. In the "big data era," when corporate data is not only voluminous but also spread across a diverse ecosystem of cloud platforms, mobile devices, computers, and traditional servers, these deadlines can seem impossible. This is all the more true in the anti-bribery and anti-extortion context, where those large data volumes almost always include foreign language documents. Indeed, for many compliance professionals, the only way to meet these deadlines might be to outsource the investigation to a law firm, assuming there is the budget to do so.

Fortunately, advances in investigative technologies can relieve much of the big data pressure. This article discusses three AI and analytics tools that in-house legal and compliance professionals can leverage to materially streamline whistleblower and analogous investigations.

  1. Use Analytics in an Early Case Assessment (ECA) Platform to Isolate Relevant Documents

When conducting an investigation, search terms typically serve as the initial filtering tool. While search terms are helpful, they are almost always both underinclusive and overinclusive. Further, running search terms across a multi-language dataset requires careful consideration of local dialects, colloquialisms, and even verb conjugation. The more effective way to reduce the “haystack” of data into a more manageable subset during an investigation is to leverage analytics through a user-friendly ECA platform. Simple examples of employing analytics to focus your search are discussed and depicted below.
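The limits of search terms are easy to reproduce. The short Python sketch below uses four made-up Spanish sentences (the documents and both patterns are hypothetical, chosen only for illustration) to show how an exact term can miss conjugated forms while a broader stem pattern pulls in an unrelated word.

```python
import re

# Hypothetical documents; none of these come from a real matter.
docs = [
    "Vamos a pagar la comisión mañana",   # infinitive "pagar" (to pay)
    "Ya pagamos al intermediario",        # conjugated form "pagamos" (we paid)
    "El pagaré vence en marzo",           # "pagaré" used as a noun: a promissory note
    "Reunión con el proveedor",           # unrelated
]

# Exact term: matches only the infinitive, so conjugated forms are missed.
exact = re.compile(r"\bpagar\b", re.IGNORECASE)

# Stem pattern: catches conjugations, but also the promissory note.
stem = re.compile(r"\bpag\w*", re.IGNORECASE)

exact_hits = [d for d in docs if exact.search(d)]
stem_hits = [d for d in docs if stem.search(d)]

print("exact term hits:", len(exact_hits))
print("stem pattern hits:", len(stem_hits))
```

Neither pattern is wrong on its own terms; the point is that each trades missed documents against false hits, which is why search terms are usually paired with the analytics described in this section.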

    A. Communication Grids

Using a communication grid, ECA platforms can automatically quantify the volume of communications between all of your data custodians (as shown below). The grid can be generated across all data in scope or targeted to keyword search results, date ranges, and other filtering criteria. Thus, you can quickly see who was talking about your target subject matter, and in what volumes. You can then access the underlying communications with the click of a button.

Communication Grids
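The counting behind a communication grid can be sketched in a few lines of Python. This is a minimal illustration under simple assumptions, not how any particular ECA platform is implemented; the `messages` records and the `communication_grid` function are hypothetical.

```python
from collections import Counter

# Hypothetical message records: (sender, list of recipients).
messages = [
    ("alice", ["bob", "carol"]),
    ("bob", ["alice"]),
    ("alice", ["bob"]),
    ("carol", ["dave"]),
]

def communication_grid(messages):
    """Count messages exchanged between each unordered pair of custodians."""
    grid = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if recipient != sender:  # ignore self-addressed copies
                grid[frozenset((sender, recipient))] += 1
    return grid

grid = communication_grid(messages)

# List the busiest pairs first.
for pair, count in sorted(grid.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(" <-> ".join(sorted(pair)), count)
```

Filtering `messages` by keyword or date range before calling the function would give the targeted version of the grid described above.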

 

    B. Date Histograms

An ECA platform will also present volumes of data over time, enabling you to track and analyze the accumulation and distribution of data during specific periods (as shown below). Date histograms can be generated against the full universe of your in-scope data or targeted to specific keyword search results to quickly spot trends and periods of heightened activity regarding your subject matter of interest.

Simply click on any bar to drill down to daily volumes and to quickly access the underlying files.

Date Histograms
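As a rough sketch of the idea, a date histogram is a count of documents per time bucket, with a drill-down that re-counts a single bucket at finer granularity. The sample dates and the function names `monthly_histogram` and `daily_histogram` below are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical sent dates for documents that matched a keyword search.
sent_dates = [
    date(2024, 1, 15),
    date(2024, 1, 20),
    date(2024, 3, 2),
    date(2024, 3, 5),
    date(2024, 3, 28),
]

def monthly_histogram(dates):
    """Count documents per calendar month, keyed as YYYY-MM."""
    return Counter(d.strftime("%Y-%m") for d in dates)

def daily_histogram(dates, year, month):
    """Drill down: count documents per day within one month."""
    return Counter(
        d.isoformat() for d in dates if (d.year, d.month) == (year, month)
    )

monthly = monthly_histogram(sent_dates)
for month in sorted(monthly):
    print(month, "#" * monthly[month])  # simple text bar per month

march_by_day = daily_histogram(sent_dates, 2024, 3)
```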

 

    C. File Type Breakdowns

The file type breakdown within an ECA platform (as shown below) likewise enables you to quickly filter your target data universe by file type (e.g., to isolate emails, Excel files, or multimedia). When used in combination with search terms, communication grids, and date filters, ECA platforms for investigations thereby accelerate your access to relevant information.

File Type Breakdowns
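A file type breakdown can be approximated by grouping file extensions into categories. The sketch below assumes that extensions are a reliable signal, which is a simplification; the `CATEGORIES` mapping and the file names are hypothetical.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical mapping from extension to a reviewer-friendly category.
CATEGORIES = {
    ".msg": "email",
    ".eml": "email",
    ".xlsx": "spreadsheet",
    ".xls": "spreadsheet",
    ".mp4": "multimedia",
    ".wav": "multimedia",
}

def file_type_breakdown(names):
    """Count files per category; unknown or missing extensions become 'other'."""
    return Counter(
        CATEGORIES.get(PurePath(name).suffix.lower(), "other") for name in names
    )

files = ["RE budget.msg", "Q3 forecast.XLSX", "call.wav", "notes", "update.eml"]
breakdown = file_type_breakdown(files)

# Isolate a single category, e.g. spreadsheets only.
spreadsheets = [
    f for f in files if CATEGORIES.get(PurePath(f).suffix.lower()) == "spreadsheet"
]
```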

 

  2. Deploying Traditional AI to Further Streamline Data Volumes

Beyond analytics, ECA platforms for investigations also streamline compliance matters through “traditional” artificial intelligence tools (i.e., machine learning algorithms that pre-date GPT and have been field-tested for over a decade). Types of traditional AI tools that can supercharge your investigation are detailed below.

    A. Concept Clustering

Concept clustering is an unsupervised machine learning tool that automatically groups similar documents based on shared characteristics (typically focusing on patterns of words or phrases used in the documents), as shown below. In the context of legal and compliance investigations, clustering provides significant advantages by enabling users to rapidly identify and concentrate on relevant sets of documents based on the subject matter discussed, thereby minimizing the time and effort required to sift through extensive volumes of data.

By highlighting key themes or recurring patterns, clustering makes it easier to uncover critical information, establish connections between related items, and streamline the review process. Oftentimes, the investigator will see clusters of documents related to unusual or suspicious activity, such as references to “Project Red,” “university tuition,” or even “Girl Scout cookies,” leading to the discovery of code words or phrases that bad actors use to obscure their activities. Concept clustering can be run in conjunction with search terms, communication grids, and other analytics filters to quickly home in on relevant documents.

Concept Clustering
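To show the general shape of unsupervised grouping, here is a deliberately simple stand-in: a greedy, single-pass clustering that assigns each document to the existing cluster with the highest word overlap (Jaccard similarity) or starts a new one. Commercial platforms use far more capable models; the stopword list, the 0.3 threshold, and the sample documents are all assumptions made for this sketch.

```python
import re

# Small, hypothetical stopword list; real tools use much larger ones.
STOPWORDS = {"the", "a", "for", "to", "of", "and", "is", "on", "in", "please"}

def tokens(text):
    """Lowercased set of words with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold=0.3):
    """Greedy clustering: join the best-matching cluster or start a new one."""
    clusters = []  # each entry: {"terms": set of words, "members": doc indices}
    for i, text in enumerate(docs):
        doc_terms = tokens(text)
        best = max(
            clusters, key=lambda c: jaccard(doc_terms, c["terms"]), default=None
        )
        if best is not None and jaccard(doc_terms, best["terms"]) >= threshold:
            best["members"].append(i)
            best["terms"] |= doc_terms
        else:
            clusters.append({"terms": set(doc_terms), "members": [i]})
    return [c["members"] for c in clusters]

docs = [
    "Invoice for consulting services is attached",
    "Please send the Girl Scout cookies to the minister",
    "Attached is the revised invoice for consulting services",
    "More Girl Scout cookies for the minister next week",
]
groups = cluster(docs)
print(groups)
```

No labels are supplied at any point, which is what makes the approach unsupervised: the odd "cookies" theme surfaces as its own group simply because those documents share vocabulary.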

 

    B. Find More Like This

A “Find More Like This” tool, also referred to as near-duplicate detection or similarity searching, allows users to identify the documents most similar to a selected one. Thus, once a single “hot document” or “smoking gun” is discovered, with the click of a button, all documents with similar content will be isolated and scored based on their level of similarity (with a score of 100 indicating an exact duplicate).

Oftentimes, in an investigation, the user might not know exactly how to search for documents evidencing the alleged misconduct. In such circumstances, creating a “synthetic” document reflecting the information the user hopes to find, and then running a similarity search against it, can be a highly effective strategy. In the context of a whistleblower complaint, the key allegations in the complaint can form the basis of the synthetic document to quickly assess whether the complaint has evidentiary support.
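The scoring step can be illustrated with Python's standard `difflib` module. This is only a sketch of similarity ranking on a 0 to 100 scale, not the algorithm any specific platform uses; the synthetic seed text, the corpus, and the `find_more_like_this` function are hypothetical.

```python
import re
from difflib import SequenceMatcher

def words(text):
    """Lowercased word sequence, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def find_more_like_this(seed, corpus):
    """Score each document 0-100 against the seed and rank the most similar first.

    A score of 100 here means the word sequences are identical after
    lowercasing and stripping punctuation.
    """
    seed_words = words(seed)
    scored = [
        (int(100 * SequenceMatcher(None, seed_words, words(doc)).ratio()), doc)
        for doc in corpus
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)

# Synthetic document drafted from the (hypothetical) key allegation.
synthetic = "Wire the consulting fee to the agent before the tender closes"

corpus = [
    "Lunch menu, Friday",
    "Wire the consulting fee to the agent after the tender closes.",
    "wire the consulting fee to the agent before the tender closes",
]

results = find_more_like_this(synthetic, corpus)
for score, doc in results:
    print(score, doc)
```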

    C. Automated Language Identification

Finally, because compliance investigations, particularly in the anti-bribery and anti-extortion context, often involve large volumes of foreign language documents, an ECA platform that automatically identifies the language breakdown within the target data source is extremely helpful. Indeed, search terms are only useful if the language(s) of the underlying documents are known.

Automated Language Identification
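One classic heuristic for language identification is to count how many common function words from each language appear in a document. The sketch below applies that heuristic with tiny, hand-picked word lists; it is an assumption-laden illustration (three languages, short lists, hypothetical sample documents), and production tools rely on much richer statistical models.

```python
import re
from collections import Counter

# Tiny, hypothetical function-word lists; chosen so they do not overlap.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "for", "with"},
    "es": {"el", "los", "que", "y", "para", "con", "es"},
    "fr": {"le", "les", "et", "pour", "avec", "est", "dans"},
}

def detect_language(text):
    """Return the language whose function words appear most often, else 'unknown'."""
    found = re.findall(r"[^\W\d_]+", text.lower())  # Unicode-aware letters only
    hits = {
        lang: sum(word in vocab for word in found)
        for lang, vocab in STOPWORDS.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "unknown"

def language_breakdown(docs):
    """Count documents per detected language."""
    return Counter(detect_language(doc) for doc in docs)

docs = [
    "The payment is for the consultant and the agent",
    "El pago es para el consultor y los agentes",
    "Le paiement est pour le consultant et les agents",
    "12345",
]
breakdown = language_breakdown(docs)
print(breakdown)
```

A breakdown like this tells the investigator which languages the search terms need to cover before any terms are drafted.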

 

  3. The Future with GenAI

Generative artificial intelligence (GenAI) is poised to be a true game-changer in streamlining investigations. Without question, the ECA investigation platform of the future will also incorporate GenAI tools and workflows, such as document Q&A, issue spotting, document summarization, and automated chronology creation.

Of course, as with all technological evolutions, GenAI also presents areas for concern that compliance and legal professionals should be aware of. The early days of ChatGPT, for example, were known for “hallucinations” (i.e., generated statements of supposed fact that simply were not true). Likewise, the ethical duties of competence, confidentiality, and candor are all implicated by the use of GenAI in the legal sector. Thus, education, training, and thoughtful workflows will be necessary to successfully implement GenAI in compliance investigations.

By Maria Fernanda Gallo, Martin H Audet, Daniel Meyers