Am Law 200 Firm Sets a New Standard for AI Performance in High-Stakes Insurance Litigation
by Justin Smith
Key Takeaways
Using generative AI in a review that began with nearly 600,000 documents in an insurance matter, this Am Law 200 law firm reported impacts including:
Precision: 93% precision rate in the validated responsive sample.
Recall: 93% recall rate for the Responsive No category and 60% recall rate for the Responsive Soft No category.
Time Savings: The review timeline went from six weeks to less than two weeks.
In the modern legal landscape, the most formidable opponent often isn’t opposing counsel. It’s the sheer volume of data. As discovery evolves from a manageable stream into a digital deluge, the traditional linear review models that once served as the industry standard are being pushed to their breaking point.
For today’s litigation teams, the ability to harness generative AI to navigate this sea of information with precision is no longer just an advantage; it is the new benchmark for excellence.
When faced with a complex insurance coverage dispute, an Am Law 200 firm saw an opportunity to save their client both time and money by using generative AI, primarily Everlaw’s Coding Suggestions tool, in the document review process.
According to their team, Coding Suggestions achieved 93% precision in a validated sample of the documents it flagged as responsive and 93% recall for the Responsive No category, in a matter that began with nearly 600,000 documents. This gave them the statistical confidence to move through the review process without the constant burden of sorting through irrelevant documents. The high level of accuracy meant attorneys weren't wasting valuable hours on non-responsive data, allowing them to complete the entire review in less than two weeks instead of six. The firm describes that as over 66% faster than traditional managed review models, at just one-fifth of the cost.
“We’ve found that AI not only saves time, but also sharpens our responses,” a Director of Litigation Technology and Support at the firm said. “It's going to save clients thousands of dollars, and in larger cases, hundreds of thousands of dollars. It also improves our confidence as attorneys because we’re capturing everything that is going to be important to the case.”
The Challenge: A Whirlwind of Data
The matter at the center of this innovation began with a staggering initial pool of roughly 600,000 documents. To make the task manageable, the team employed a rigorous culling process, utilizing search terms and Early Case Assessment tools to narrow the field down to about 70,000 documents.
Buried in that set were documents such as technical engineering reports, contractor communications, and site access logs, and the team needed to find them fast to meet their obligations and build their defense.
"We were initially behind the eight ball in the sense that we weren't sure if we were going to settle the case," the Director of Litigation Technology and Support explained. "We didn’t want to dedicate a bunch of resources if we were just going to settle, so we pushed it down the road as long as we could until we finally said, 'We have to actually review these documents and we have to do it fast.'"
Building a Defensible AI Workflow
Rather than delegating the project to an external vendor, the firm chose to review the documents themselves in an effort to ensure defensibility and avoid costly contract review hours. This approach allowed the team to provide a clear audit trail regarding how documents were produced, the precision and recall of the tools, and the logic behind every prompt iteration.
“We ingested everything into ECA, and then from there we ran search terms and parsed it down,” an Ediscovery and Litigation Support Coordinator at the firm said. “That's how we got our review population down to about 70,000 documents. Then we utilized the AI and started prompting and running different iterations from there.”
The team ran a multi-stage pilot program with Everlaw’s Coding Suggestions tool to refine their approach before implementing it at a wider scale.
Coding Suggestions expedites the coding phase of ediscovery by analyzing thousands of documents against instructions and criteria that users provide in natural language. For each document, it offers a suggested code rated “Yes,” “Soft Yes,” “No,” or “Soft No,” ranked from high to low confidence, with a written explanation for each coding recommendation.
The firm implemented a measured approach to using Coding Suggestions during the review process, starting with a smaller set of documents to test the tool’s performance.
Iterative Refinement. The team started out by running multiple iterations of responsiveness and privilege prompts on a 1,000-document test set. Early mismatches between Coding Suggestions and attorney coding were used to sharpen the prompts.
Validation Sampling. Before full deployment, the firm ran revised prompts on 100-document subsets to ensure alignment with attorney logic.
Expert Prompting. The team worked together to capture the nuances of insurance language and technical engineering reports to ensure Coding Suggestions was able to identify and comprehend these complex terms.
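The comparison step in the pilot described above can be sketched in a few lines of Python. This is only an illustration, not Everlaw's API or the firm's actual tooling: the function name, the document IDs, the sample data, and the choice to count both "Yes" and "Soft Yes" as a responsive suggestion are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: "Yes" and "Soft Yes" both count as a responsive suggestion.
POSITIVE_RATINGS = {"Yes", "Soft Yes"}


def compare_to_attorneys(
    attorney_calls: Dict[str, bool],
    suggested_ratings: Dict[str, str],
) -> Tuple[float, List[str]]:
    """Return the agreement rate and the IDs where the two calls differ."""
    shared_ids = sorted(attorney_calls.keys() & suggested_ratings.keys())
    if not shared_ids:
        raise ValueError("no documents in common")
    mismatches = [
        doc_id
        for doc_id in shared_ids
        if attorney_calls[doc_id] != (suggested_ratings[doc_id] in POSITIVE_RATINGS)
    ]
    agreement = 1 - len(mismatches) / len(shared_ids)
    return agreement, mismatches


# Hypothetical test set (True means the attorney coded the document responsive).
attorney_calls = {"DOC-001": True, "DOC-002": False, "DOC-003": True, "DOC-004": False}
suggested_ratings = {"DOC-001": "Yes", "DOC-002": "Soft Yes", "DOC-003": "Soft No", "DOC-004": "No"}

agreement, mismatches = compare_to_attorneys(attorney_calls, suggested_ratings)
print(f"Agreement with attorney coding: {agreement:.0%}")
print("Documents to examine before revising the prompt:", mismatches)
```

Reading the written explanation for each mismatched document is what would point to the prompt language that needs sharpening; the agreement rate only shows whether a revision helped.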
"We did this ourselves so that we were completely defensible," the Director of Litigation Technology and Support at the firm said. "If anyone came back and said, 'How did we get to these figures? What's the precision and recall?' we had to be able to provide that to them."
Quantifiable Results: 93% Precision at Lightning Speed
The impact of the Coding Suggestions experiment was immediate and measurable.
Once the AI’s performance was validated, the team deployed their refined prompts at scale across the approximately 43,000 non-privileged documents. Of those, the AI surfaced roughly 6,000 documents that were identified as “Responsive Yes.”
To verify the integrity of this set, the team conducted a validation sample of 456 documents, which confirmed a 93% precision rate.
In the context of Coding Suggestions, precision measures how often the tool’s positive predictions are correct. It is calculated as the proportion of documents the tool classifies as relevant that are actually relevant. A 93% precision rate means that roughly 93 of every 100 sampled documents suggested as responsive were confirmed as responsive on review.
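To make the precision figure concrete, here is a minimal sketch of the calculation together with a Wilson score interval, a standard way to express the uncertainty in a proportion estimated from a sample. The sample size of 456 comes from the article. The count of 424 confirmed documents is an assumption chosen only because it rounds to the reported 93%; the firm did not publish the underlying count or say whether it computed an interval.

```python
import math
from typing import Tuple


def precision(confirmed_relevant: int, predicted_relevant: int) -> float:
    """Share of documents predicted relevant that are actually relevant."""
    if predicted_relevant <= 0:
        raise ValueError("predicted_relevant must be positive")
    return confirmed_relevant / predicted_relevant


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


SAMPLE_SIZE = 456  # validation sample size reported in the article
CONFIRMED = 424    # ASSUMPTION: illustrative count consistent with ~93%

low, high = wilson_interval(CONFIRMED, SAMPLE_SIZE)
print(f"Precision: {precision(CONFIRMED, SAMPLE_SIZE):.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
```

An interval like this is one way to document, for defensibility purposes, how much confidence a sample of that size supports.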
The team then applied additional controls to refine results, including secondary prompts to remove unrelated properties, date range restrictions, and targeted searches to capture missed responsive documents. This process reduced the final production set to approximately 1,000 documents, all of which were reviewed by attorneys at a secondary level.
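The narrowing described across this case study can be laid out as a simple funnel. The sketch below uses only the approximate counts reported in the article; the stage labels and the function name are this example's own, and the shares are rough because the source figures are rounded.

```python
from typing import List, Tuple

# Approximate document counts as reported in the article.
STAGES: List[Tuple[str, int]] = [
    ("Initial collection", 600_000),
    ("After search terms and ECA", 70_000),
    ("Non-privileged, run through prompts", 43_000),
    ("Suggested Responsive Yes", 6_000),
    ("Final production set", 1_000),
]


def funnel(stages: List[Tuple[str, int]]) -> List[Tuple[str, int, float]]:
    """Attach each stage's share of the initial collection."""
    start = stages[0][1]
    return [(name, count, count / start) for name, count in stages]


for name, count, share in funnel(STAGES):
    print(f"{name:<38}{count:>9,}{share:>8.1%}")
```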
The firm also conducted recall sampling to understand the completeness of the review.
Responsive No: Achieved 93% recall.
Responsive Soft No: Achieved 60% recall. Because of this lower rate, the team ran targeted follow-up searches and applied date-range restrictions to help ensure no critical evidence was missed.
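The article does not describe how the firm computed its recall figures. As general background, recall is the share of truly responsive documents that a review actually finds, and one common way to estimate it in ediscovery is to sample the set of documents that were not selected and count how many responsive documents slipped through. The sketch below shows that generic approach; the function name and every number in it are hypothetical and are not the firm's data.

```python
def estimate_recall(
    found_responsive: int,
    unselected_size: int,
    sample_size: int,
    sample_responsive: int,
) -> float:
    """Estimate recall from a random sample of the unselected documents."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    miss_rate = sample_responsive / sample_size   # share of the sample that was missed
    estimated_missed = miss_rate * unselected_size
    return found_responsive / (found_responsive + estimated_missed)


# HYPOTHETICAL numbers for illustration only.
recall = estimate_recall(
    found_responsive=900,   # responsive documents the review identified
    unselected_size=10_000, # documents left in the not-selected set
    sample_size=500,        # random sample drawn from that set
    sample_responsive=5,    # responsive documents found in the sample
)
print(f"Estimated recall: {recall:.0%}")
```

A low estimate for one category is the kind of signal that would prompt follow-up searches like the ones the team ran.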
The speed of the review was perhaps the most transformative result for the team. It allowed attorneys’ time to be redirected to high-value second-level validation instead of first-pass document review. Being able to meet aggressive deadlines without compromising quality and defensibility proved to be a game-changer.
Since the review process was faster and didn’t require a full contract review team, the costs the firm reported were much lower than they would have been with a traditional review. The firm completed the review at just one-fifth of what it would have cost using a full contract review team.
Beyond Responsiveness: Privilege and Issue Triage
Coding Suggestions helped identify the case-specific themes that were central to the dispute through the coding criteria the team uploaded to the tool. Coding criteria include descriptions of the case, the code categories, and the individual codes. These are written in natural language rather than as complex Boolean searches.
This provides the model with the necessary context and guidance for what the codes are meant to capture and how to evaluate whether they apply to a document.
Rather than just looking for keywords, Coding Suggestions understood the context behind site access delays, inadequate mitigation efforts, and conflicting contractor statements. It effectively parsed technical engineering reports and was able to comprehend nuanced insurance terms like "failure to mitigate" and "faulty workmanship."
The firm applied this workflow to triage approximately 24,000 documents in the potentially privileged bucket. This privilege preview allowed attorneys to identify protected material more efficiently, though the firm maintained an eyes-on policy for final privilege determinations.
“We had about 24,000 documents that were in privileged buckets,” the Director of Litigation Technology and Support at the firm said. “And while we had our team review each one of those, we also wanted to see how effective Coding Suggestions would be at reviewing for privilege, and it did a great job.”
By using Coding Suggestions to flag key issues and review for privilege, the team ensured that their focus was directed toward the most critical evidence first. This shifted the attorneys' role from fact-finders to strategists, allowing them to spend more time building their case.
The Future: Standardizing Innovation
With 93% precision in the validated responsive sample, 93% recall for the Responsive No category, and project completion in a fraction of the time at one-fifth of the cost, the success has solidified Coding Suggestions and the EverlawAI feature set as a cornerstone of the firm’s litigation strategy. Attorneys can now spend more time focusing on high-value legal work, and the firm can save money by keeping more work in-house and completing document review faster.
The firm’s general counsel has already issued a memo approving Everlaw as one of only two AI tools authorized for firm-wide use.
“Hopefully we can implement this same type of workflow for all our cases,” the Director of Litigation Technology and Support at the firm said.
The goal now is universal adoption across the firm. The team sees AI as providing a critical competitive advantage that allows them to deliver a superior work product while saving clients money.
"We’re very satisfied with EverlawAI's capabilities," the Ediscovery and Litigation Support Coordinator at the firm concluded. "The platform is easy to understand and use for our attorneys, and the AI workflow is user-friendly, intuitive, and easy to navigate."
By combining deep legal expertise with a rigorous approach to AI, the firm has proven that traditional review processes are no longer the only path to victory. With Everlaw, they are not just keeping pace with the changing legal landscape—they are leading it.
Justin Smith is a Senior Content Marketing Manager at Everlaw. He focuses on the ways AI is transforming the practice of law, the future of ediscovery, and how legal teams are adapting to a rapidly changing industry.