Vero Beach, Florida civil litigator John Stewart was appointed president of the Florida Bar last month, as reported in Legaltech News, and one of his stated priorities is to improve legal technology knowledge among lawyers in Florida. Stewart was heavily involved in making Florida the first state in the nation to mandate three hours of technology continuing legal education (CLE). His election as president of the Bar puts added emphasis on the importance of technology in legal practice.
We’ve compiled descriptions of twelve commonly used terms for lawyers who would like to improve their technology understanding:
- Machine learning is a subset of artificial intelligence (AI) that gives computers the ability to learn without being explicitly programmed. Machine learning algorithms perform sophisticated tasks by detecting patterns in human input and making inferences from them, rather than by following explicit instructions. Input can be provided either by passive training, in which the software learns from the user’s ordinary interactions with it, or by active training, in which users give the software specific types of feedback.
- Machine learning goes by many names in ediscovery: suggested coding, predictive coding, technology-assisted review (TAR), computer-assisted coding, dynamic review, predictive intelligence, automatic classification, automated predictive review, and others. Fundamentally, these are all the same. Predictive coding or TAR may be used to identify documents related to any dimension of a case, not only responsiveness. It may also be used in quality control settings to ensure that nothing is missed. Finally, it may be used to streamline review. Predictive coding takes two forms:
- Simple passive learning is also known as TAR 1.0 in some contexts. This system requires the review team to first review documents, then train the computer model, and then deploy the model across the document corpus. Feedback or corrections to the model require beginning this process over again, which can be time-consuming.
- Continuous active learning, also known as TAR 2.0, doesn’t require experts to do the initial training. The system can learn from subsequent decisions, such as course corrections or a redefinition of the review coding structure. The system typically can handle rolling productions (new information) without having to start over. The system still works well when the proportion of relevant documents is low. As a result, this type of process can be more efficient, as it produces fewer documents for the subject matter expert to review.
- A yield curve refers to a graphical comparison of a random sampling of the results (e.g. every 100th document) to the prevalence of the dataset to assess the effectiveness of the predictive coding model.
- Richness and prevalence both refer to the percentage of documents in the dataset that are relevant.
- Recall refers to the percentage of all relevant documents in the dataset that the predictive coding algorithm has correctly identified as relevant. It is used as a measure of completeness.
- Precision refers to the percentage of documents the algorithm predicts to be responsive that actually are responsive. It is used as a measure of the accuracy of the predictive coding model. Improving precision can be as simple as reviewing more documents in specific portions of that model’s coverage graph (see the coverage graph entry below).
- There are robust statistical ways to test the results of the system’s reported precision and recall via a holdout set. A holdout set is an unbiased way to measure the performance of the system, based on what reviewers are saying about the nature of the documents. This helps determine how likely the results of the system are to be accurate, based on real documents that we know are rated along key dimensions.
- A coverage graph shows where representative training data has been provided to ensure the predictive coding model knows what it’s doing. Robust use of a coverage graph increases the model’s ability to recognize all of the documents in a set. This can increase the breadth and diversity of the training data set. Ultimately, this can result in improving trust in the system.
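To make the definitions of prevalence, recall, and precision concrete, here is a minimal Python sketch that computes all three from a small holdout set. The `ReviewedDoc` class, the `holdout_metrics` function, and the ten sample documents are hypothetical and invented for illustration; they do not reflect how any particular review platform stores or scores documents.

```python
from dataclasses import dataclass


@dataclass
class ReviewedDoc:
    """One holdout document (hypothetical structure for illustration)."""
    doc_id: str
    human_relevant: bool      # the reviewer's decision
    predicted_relevant: bool  # the model's prediction


def holdout_metrics(docs):
    """Return prevalence, recall, and precision for a reviewed holdout set."""
    # Relevant documents the model also flagged as relevant.
    true_pos = sum(d.human_relevant and d.predicted_relevant for d in docs)
    # Documents the model flagged that reviewers marked not relevant.
    false_pos = sum((not d.human_relevant) and d.predicted_relevant for d in docs)
    # Relevant documents the model missed.
    false_neg = sum(d.human_relevant and (not d.predicted_relevant) for d in docs)

    relevant = true_pos + false_neg    # everything reviewers marked relevant
    predicted = true_pos + false_pos   # everything the model marked relevant
    return {
        "prevalence": relevant / len(docs) if docs else 0.0,
        "recall": true_pos / relevant if relevant else 0.0,
        "precision": true_pos / predicted if predicted else 0.0,
    }


# Made-up holdout set: 4 of 10 documents are relevant; the model flags 5,
# and 3 of those 5 are actually relevant.
holdout = [
    ReviewedDoc("D1", True, True),
    ReviewedDoc("D2", True, True),
    ReviewedDoc("D3", True, True),
    ReviewedDoc("D4", True, False),
    ReviewedDoc("D5", False, True),
    ReviewedDoc("D6", False, True),
    ReviewedDoc("D7", False, False),
    ReviewedDoc("D8", False, False),
    ReviewedDoc("D9", False, False),
    ReviewedDoc("D10", False, False),
]

metrics = holdout_metrics(holdout)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

In this made-up sample, prevalence is 4 of 10 documents (40%), recall is 3 of the 4 relevant documents (75%), and precision is 3 of the 5 documents the model flagged (60%). A real holdout set would be far larger, since estimates from ten documents carry very wide margins of error.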
For more on understanding predictive coding and technology in this category, catch Everlaw’s upcoming webinar with ACEDS: Improve Review Outcomes with Predictive Coding on Everlaw.