A large health care provider in California faced an inquiry from a regulatory body. They had a large volume of documents to produce, and an extremely short response window. Further, the regulatory agency required documents to be produced in a very specific format.
The client initially faced a data set of 970K documents, far too many for traditional review within the strict timeframes of a regulatory inquiry. With the production deadline swiftly approaching, we recommended using continuous active learning to defensibly reduce the data set and prioritize the documents most likely to be responsive for review.
Our eDiscovery team trained the continuous active learning system to identify relevant documents in the massive data set, and then queued those that were highly likely to be responsive for review by human reviewers. The responsiveness rate was extremely high. Even so, we were able to reduce the total body of documents for manual review to fewer than 35K.
The final elusion rate for the full data set was 0%, meaning the elusion test found no responsive documents among those excluded from review. Once the active learning process was complete, we applied search terms, deduplication, culling, and other data reduction strategies to the remaining data set, reducing the final review set further to 16.5K documents.
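For readers unfamiliar with the metric, an elusion rate is typically estimated by sampling documents that were excluded from review and counting how many turn out to be responsive. The sketch below illustrates that calculation. The sample size of 1,500 is a hypothetical value chosen for illustration, not a figure from this matter, and the function names are our own. It also shows the standard one-sided upper bound that applies when a sample contains zero responsive documents.

```python
def elusion_rate(responsive_in_sample: int, sample_size: int) -> float:
    """Fraction of sampled excluded documents that were responsive."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= responsive_in_sample <= sample_size:
        raise ValueError("responsive_in_sample must be between 0 and sample_size")
    return responsive_in_sample / sample_size


def upper_bound_when_zero_found(sample_size: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on the true rate when the sample had zero hits.

    Solves (1 - p) ** sample_size = 1 - confidence for p.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)


# Hypothetical sample size, for illustration only.
sample_size = 1500
rate = elusion_rate(responsive_in_sample=0, sample_size=sample_size)
bound = upper_bound_when_zero_found(sample_size)

print(f"Observed elusion rate: {rate:.2%}")
print(f"95% upper bound on true rate: {bound:.3%}")
```

Under these assumptions, a 0% observed rate supports a very low bound on missed responsive documents; the larger the sample, the tighter that bound.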
By a conservative estimate, our client was facing more than $1M in review costs and an impossible deadline. Our team helped them hit the deadline and reduce their case cost by more than 80%.
By the numbers:
- 970k documents needed to be reviewed on a compressed timeline
- 0% final elusion rate for the full data set
- 80%+ of total potential matter costs saved
- 950k+ total reduction of document population for review (from 970k to 16.5k)
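The reduction at each stage can be checked with a few lines of arithmetic. This is a rough sketch using the rounded figures quoted in this case study, so the results are approximate. The 35,000 value is an upper bound, since the post-active-learning set is described only as smaller than 35K, and the variable and function names are our own.

```python
# Rounded figures quoted in the case study.
initial_docs = 970_000
after_active_learning = 35_000  # upper bound: the text says "less than 35K"
final_review_set = 16_500


def reduction(before: int, after: int) -> tuple[int, float]:
    """Return (documents removed, fraction of the starting set removed)."""
    removed = before - after
    return removed, removed / before


removed_cal, pct_cal = reduction(initial_docs, after_active_learning)
removed_total, pct_total = reduction(initial_docs, final_review_set)

print(f"After active learning: at least {removed_cal:,} removed ({pct_cal:.1%})")
print(f"After all reduction steps: {removed_total:,} removed ({pct_total:.1%})")
```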