Jason Atchley : eDiscovery : How to Find Litigation Knowledge in a Sea of Noise

jason atchley

Vendor Voice: From Socrates to Augmented Intelligence

How to find litigation knowledge in a sea of noise.

Marc Jenkins, Law Technology News

February 21, 2014

Editor’s Note: This article was chosen in a blind competition for the Arizona State University-Arkfeld E-Discovery and Digital Evidence Conference. The three winners have been invited to present their papers during the conference, which will be held March 12-14 at ASU’s Sandra Day O’Connor College of Law, in Tempe, Ariz. Read also, “Vendor Voice: Yes, Counselor, There Will Be Math,” by Maureen O’Neill and Joel Henry’s “Predictive Coding is So Yesterday.”
Socrates was tried and ultimately sentenced to death for failing to respect the gods and corrupting the youth of ancient Athens. This paper is Socratic in its approach, as it challenges current prevailing practices and methods used by attorneys for obtaining information in litigation and advocates for change.
The volume and variety of data are increasing exponentially, putting extreme pressure on lawyers’ ability to meet the goal of Rule 1 of the Federal Rules of Civil Procedure and to represent clients effectively. To meet these challenges, our methods for gathering information are in serious need of a reboot. To find the droplets of actionable knowledge in data sets that are otherwise a sea of noise, we must first admit what we do not know.
Socrates’ wisdom was based on his admission that he knew nothing, and his belief that the unexamined life is not worth living. Admitting that he knew nothing, he questioned the elites and leaders of ancient Greece to find answers to important questions. This led him to discover that those perceived to have the answers actually knew little. It was his questioning and willingness to learn that led to the discovery of real knowledge. The Socratic method is a method of hypothesis elimination: better hypotheses are found through constant questioning and testing of assumptions and conclusions. In modern times, we see this approach in Edison’s famous remark, “I have successfully discovered 1,000 ways to not make a light bulb,” and the Google mantra of “fail fast.”
Generally speaking, we humans are overconfident in our cognitive abilities or unaware of our cognitive biases. Daniel Kahneman (who received the Nobel Prize in economics) wrote in “Thinking, Fast and Slow” that overconfidence is explained by a concept labeled “WYSIATI,” which stands for “What You See Is All There Is.” The WYSIATI theory states that the mind deals primarily with “known knowns,” or phenomena already observed, when it makes decisions.
Only on rare occasions does the mind consider “known unknowns,” phenomena that it knows to be relevant but about which it has no information. The mind is generally oblivious to the possibility of “unknown unknowns,” or unknown phenomena of unknown relevance. The secrets to the case that change everything typically exist in the realm of unknown unknowns.
Whether they are called hot documents or key documents, these are where the risk lies and the outcome is determined. In the digital world, discovering these pieces of data is extremely challenging. The typical legal search workflow is overreliant on keywords selected according to perceptions of what the case is about, at the stage when those selecting the keywords know the least about the case. Therefore, the typical litigation information retrieval process will help you find more known knowns and perhaps some known unknowns, but will be of little practical assistance in finding unknown unknowns.
In “Clinical v. Statistical Prediction: A Theoretical Analysis and a Review of the Evidence” (1954), Paul Meehl, a leading American philosopher of science and psychologist who had faculty appointments in the University of Minnesota’s psychology, law, psychiatry, neurology and philosophy departments, reviewed 20 studies examining whether statistical predictions made by formulas (algorithms) were more accurate than clinical predictions based on the subjective impressions of trained professionals. He found the algorithm to be more accurate than 11 of 14 trained professionals. Clinical psychologists found the conclusions shocking and were highly skeptical. Even so, Meehl’s work sparked a wealth of research that is still under way and highly relevant in today’s data-driven society.
Since Meehl’s initial research, there have been approximately 200 studies comparing statistical predictions with clinical predictions. The findings have been consistently in favor of algorithms over humans, with approximately 60 percent of the studies showing significantly better accuracy for algorithms. The remaining comparisons found a draw, which still counts as a win for the algorithm because a statistical prediction regime is less expensive than expert judgment.
The limited studies available in litigation are consistent with the findings of Meehl and the follow-up studies. Statistical formulas outdo humans in noisy environments, such as modern-day litigation with millions of documents, because they are more likely than humans to detect weak but valid cues and much more likely to maintain a modest level of accuracy by using cues consistently.
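To make the point about weak cues concrete, here is a minimal sketch in Python. It is not drawn from Meehl or from any litigation study. Everything in it is an assumption made for illustration: the documents are synthetic, each of five hypothetical cues agrees with the truth only 60 percent of the time, and the cues are independent of one another. The sketch compares a guess based on one cue with a simple formula that weighs all five equally and applies the same rule to every document.

```python
import random


def simulate(n_docs=20000, n_cues=5, cue_reliability=0.6, seed=0):
    """Compare one weak cue with a formula that weighs all cues equally.

    All inputs are synthetic; the parameter values are illustrative assumptions.
    n_cues is kept odd so the majority rule never produces a tie.
    """
    rng = random.Random(seed)
    single_hits = 0
    formula_hits = 0
    for _ in range(n_docs):
        relevant = rng.random() < 0.5  # synthetic ground truth for this document
        # Each cue agrees with the truth only slightly more often than chance.
        cues = [
            relevant if rng.random() < cue_reliability else not relevant
            for _ in range(n_cues)
        ]
        single_guess = cues[0]  # rely on a single cue
        formula_guess = sum(cues) > n_cues / 2  # same rule for every document
        single_hits += single_guess == relevant
        formula_hits += formula_guess == relevant
    return single_hits / n_docs, formula_hits / n_docs


single_acc, formula_acc = simulate()
print(f"one cue: {single_acc:.3f}  all cues, equal weights: {formula_acc:.3f}")
```

Under these assumptions, the single cue should be right about 60 percent of the time and the equal-weight formula about 68 percent of the time. That is still a modest level of accuracy, but the formula reaches it without fatigue or inconsistency across thousands of documents. Real cues in litigation data are rarely independent, so the size of the gain here should not be read as a prediction.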
Importantly, these studies are not evidence of true artificial intelligence. Rather, the algorithms are more accurate predictors. It would be wrong to conclude that algorithms generally, and more specifically the Bayesian classifiers and others used in predictive coding or technology-assisted review software, are very accurate. Predictive coding has been referred to as search terms on steroids. However, a search term or keyword regime is not designed to yield, and is unlikely to yield, the important who, what, when, where, how and why information needed to evaluate risk, enter settlements and strategize for motion practice and trial. Attorneys who rely solely on these technologies for information retrieval are in danger of finding only known knowns and of incurring a “black swan” event during the course of the litigation.

Read more: http://www.lawtechnologynews.com/id=1202643988509/Vendor-Voice%3A-From-Socrates-to-Augmented-Intelligence#ixzz2tyvoGReF
