Many people secretly (or not-so-secretly) hate the term “machine learning.” That’s because it can be overly broad and opaque, as if you’re being asked not to look at the man behind the curtain. However, machine intelligence doesn’t have to be a big black box that we simply accept or reject. Even the less technically minded among us can make sense of it – from its rationale to its implementation.
Why We Use Machine Learning
Let’s say you need to get from a meeting in New York’s Battery Park City to your office in Midtown East. You have the choice of getting into one of two cabs/Ubers:
- One driven by an experienced driver or
- One driven by an experienced driver who also has a GPS with real-time traffic data
Which of these two cars do you get into?
This is not a trick question. It just seems like one because most of us have experience both with this situation and with a GPS. We already know that an experienced cabbie uses his knowledge of the city to consider the fastest route, avoiding potholes and roads with many stop signs. At the same time, a GPS aggregates data from millions of cars to uncover current obstacles — like parades or accidents. So, if you want to save time getting to the office, you would look for both a skilled human and a well-configured machine: option two, above.
Machine intelligence is a tool that we use to augment our training — not to replace it. You need a licensed, experienced driver in either of the two scenarios. Neither a GPS nor an encyclopedic knowledge of Manhattan replaces the need for that, just like an ediscovery platform’s machine learning tool doesn’t replace a litigator. However, both the GPS and the machine learning tool can make their users much more efficient at their respective jobs. So think of your TAR tool as a cabbie might her GPS: it won’t actually get you from point A to point B, but it’ll help navigate the route there.
How Machine Learning Works
In action, machine intelligence is all about scale. If a GPS were using data from only a dozen cars to figure out where the traffic jams were, it wouldn’t be very accurate. After all, what if all of those cars were in the same part of town at once, or if several of the drivers were rushing their pregnant wives to the hospital on the same day? As you get more and more sources, you get a more complete map of a town’s traffic.
The same is true with document review algorithms. Many require a minimum number of document ratings or training sets: this is to give the tool a more complete view of the case. And just like you need data from both the hospital-rushing cars and the non-rushing cars, you should identify both hot and cold documents for the machine to learn from. Once the algorithm has hundreds or thousands of pages to look at, it can more reliably predict what is an all-day lane closure, and what is just one person too friendly with his brake pedal.
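To make this concrete, here is a minimal sketch of the idea behind training on rated documents. The documents, ratings, and scoring method are all hypothetical and far simpler than what a real TAR platform does; the point is only that the tool learns from examples of both hot and cold material.

```python
from collections import Counter

# Hypothetical training set: documents a reviewer has already rated.
# Real tools require hundreds or thousands of rated documents;
# two of each is only for illustration.
hot_docs = ["urgent merger terms confidential", "confidential settlement terms"]
cold_docs = ["lunch menu for friday", "office printer is broken"]

def word_counts(docs):
    """Tally how often each word appears across a set of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts

hot_counts = word_counts(hot_docs)
cold_counts = word_counts(cold_docs)

def score(doc):
    """Positive score: the document looks 'hot'; negative: looks 'cold'."""
    total = 0
    for word in doc.lower().split():
        total += hot_counts[word] - cold_counts[word]
    return total

print(score("confidential merger update"))  # positive: leans hot
print(score("printer broken again"))        # negative: leans cold
```

With only a handful of training documents, a single unusual word can swing the score — which is exactly why the tools insist on a minimum number of ratings before making predictions.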
This raises the question: how does scale become “intelligent”? After all, even all of the mileage and speed data in the world doesn’t tell you whether a tanker truck or a lane closure caused a slowdown. The answer is that a GPS doesn’t need to “know” if it’s construction or an accident that makes a stretch of road undesirable to drive on. All it needs to do to be useful is to re-route you to a different road.
But, good machine learning tools can actually “figure out” causes too. Think about how a human would do this. If she couldn’t actually see the stretch of highway in question, she could check whether there had been other slowdowns there recently. The reasoning would be that it’s highly unlikely for an accident to occur in exactly the same stretch of highway every single day for weeks. So, if it’s an ongoing issue, it’s likely construction.
A machine algorithm can get a similar result, but without using reasoning. It can look at social media posts or reports associated with incidents. It doesn’t need to know what “accident” means to tell you that it’s often associated with a partial-day slowdown, or what “construction” means to tell you it’s correlated with week- or month-long recurring delays. As Harry Surden points out in his paper Machine Learning and Law,
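A toy version of that correlation, with invented incident reports and durations, might look like the following. The algorithm never “understands” the words; it only tallies which words co-occur with short or long slowdowns.

```python
from collections import defaultdict

# Hypothetical incident reports paired with how long the slowdown lasted (hours).
reports = [
    ("accident on fdr drive", 3),
    ("two-car accident cleared", 2),
    ("construction on 2nd ave", 120),
    ("ongoing construction closure", 200),
]

# For each word, collect the durations of every slowdown it appeared with.
durations = defaultdict(list)
for text, hours in reports:
    for word in text.split():
        durations[word].append(hours)

# Average duration associated with each word -- pure correlation, no semantics.
avg_hours = {word: sum(hrs) / len(hrs) for word, hrs in durations.items()}

print(avg_hours["accident"])      # short-lived slowdowns
print(avg_hours["construction"])  # long-running slowdowns
```

The word “accident” ends up linked to short delays and “construction” to long ones, without the program ever defining either term.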
There are certain tasks that appear to require intelligence because when humans perform them, they implicate higher-order cognitive skills such as reasoning, comprehension, meta-cognition, or contextual perception of abstract concepts.
But just because we use intelligence to figure something out doesn’t mean that a machine must. Nor is the inverse true: that a machine can’t reach the same conclusion because it doesn’t have a human-like brain.
Using machine learning doesn’t require you to believe in robots that become human. It’s about knowing that it’s a tool that can save you time and that its method doesn’t need to match yours to be useful. That’s machine intelligence we can all get behind.
Surden, Harry. “Machine Learning and Law.” Washington Law Review, March 24, 2014.