Building Our Maps: Legal Infrastructure in the Age of Data

New technologies, like artificial intelligence and document automation, could dramatically alter the landscape of the legal profession.

Jason Boehmig, Tim Hwang, Paul Sawaya, Law Technology News

June 24, 2014

When Google set out to build a self-driving car, it first had to map the physical world. In short, Google had to figure out how to turn a red light hanging from a pole, or a person crossing the street, into something a computer could understand and act on. To do this, myriad analog data points had to be gathered by radar, GPS, and other sensors before they could be turned into digital signals that can be compared and analyzed. It’s this key infrastructure, akin to building a digital map of an analog world, that enables truly revolutionary technology.

In fact, this process has repeated itself countless times in the digital age. Things that were once analog—phone books, bulletin boards, even photos—have undergone a transformation into new digital forms, which are capable of being understood and manipulated by computers. Once data is in a format that a computer is able to understand, the possibilities are endless—books can be copied and distributed in nanoseconds, bulletin boards become places for global commerce, and photos can be copied and edited in ways that were never possible with a purely physical medium.

Law has not escaped this trend. Projects such as LegalXML and Akoma Ntoso have realized the value of having legal documents in machine-readable format and have set forth standards by which lawyers or administrators might mark up the documents they produce. Once a document is in the specified format, machines can read and manipulate it.

The problem thus far in law has not been the lack of purported standards for data entry, but rather that many documents have not made it into any machine-readable standard at all. The reasons for this are perhaps obvious: formatting data in a machine-readable way takes additional effort, and many lawyers simply don’t have time. Additionally, many lawyers are averse to technology to begin with, and even if they do see the benefits on the horizon, it’s hard to move off the starting block when doing so won’t make a difference in handling the case on their desk at the moment.

We propose that this problem is simply too important to be left up to lawyers alone. Lawyers need to reach out to technologists and get their help in figuring out ways to bring machine-readable documents into everyday practice. There are people doing great work in this field already, like the State Decoded and the Madison Project from OpenGov. They are making great progress for legislation and statutes.

It’s time for the legal profession as a whole to catch up.

How can we do better? One way forward might be to implement tools that make the process of gathering and formatting legal documents automatic. Some of these projects are now putting legal documents up on the web as text, and we’re even seeing governments like the City of San Francisco join this trend. It might therefore make sense to have a tool that takes the raw text of documents and uses the work that others have done as a jumping-off point for adding more useful context and formatting automatically.

The Restatement project is an effort to move in that direction (disclosure: the authors all participate). It is a free, open source effort to create a standard—and more importantly, a method for getting documents into that standard—for legal text on the web. It works by taking existing text (even in Microsoft Word document form) and using a parser to extract additional information from the raw text, enabling more functionality.
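To make the idea of parsing raw text concrete, here is a minimal sketch in Python. It is not Restatement's parser or output format. It assumes two hypothetical conventions, numbered headings such as "1. Parties" and bracketed blanks such as "[Company Name]", and the sample text is invented for illustration.

```python
import json
import re

# Hypothetical conventions (assumptions, not Restatement's actual format):
# numbered headings like "1. Parties" and blanks like "[Company Name]".
HEADING = re.compile(r"^(\d+)\.\s+(.+)$")
BLANK = re.compile(r"\[([^\[\]]+)\]")


def parse_document(raw_text):
    """Split raw text into sections and record the blanks found in each."""
    sections = []
    current = None
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            # A numbered heading starts a new section.
            current = {
                "number": int(match.group(1)),
                "title": match.group(2),
                "text": [],
                "blanks": [],
            }
            sections.append(current)
        elif current is not None:
            # Body lines attach to the most recent heading;
            # text before the first heading is ignored in this sketch.
            current["text"].append(line)
            current["blanks"].extend(BLANK.findall(line))
    return sections


# Invented sample text, not taken from any real agreement.
SAMPLE = """\
1. Parties
This agreement is between [Company Name] and [Investor Name].

2. Purchase Price
The purchase price is [Price Per Share] per share.
"""

if __name__ == "__main__":
    print(json.dumps(parse_document(SAMPLE), indent=2))
```

The point of the structured output is that each section and each blank becomes an addressable piece of data, which other tools can search, compare, or build forms on.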

One example of how Restatement works, in the transactional law context, is the Series Seed documents. These have been up on GitHub for quite some time. They are a set of open source financing documents for use by startups in raising seed capital, and they have been widely adopted by the startup and venture capital community. Parsing the Series Seed documents with Restatement unlocks additional layers of functionality. For instance, instead of merely being available to read, the documents can be manipulated and filled out online once they have passed through this project.
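As a rough illustration of what filling out a parsed document could look like, the sketch below replaces bracketed blanks with supplied answers and reports any that remain. The bracket convention, the function name, and the sample clause are all assumptions made for this example. None of it is taken from the Series Seed documents or from Restatement's code.

```python
import re

# Assumed convention: blanks are written as "[Field Name]".
BLANK = re.compile(r"\[([^\[\]]+)\]")


def fill_blanks(text, answers):
    """Replace each [Blank] with its answer; leave unanswered blanks in place."""
    missing = []

    def substitute(match):
        name = match.group(1)
        if name in answers:
            return answers[name]
        if name not in missing:
            missing.append(name)
        return match.group(0)  # keep the original "[Blank]" text

    return BLANK.sub(substitute, text), missing


# Invented clause for illustration only.
CLAUSE = "[Company Name] will sell shares to [Investor Name] at [Price Per Share] per share."

if __name__ == "__main__":
    filled, missing = fill_blanks(
        CLAUSE, {"Company Name": "Example Co.", "Investor Name": "Jane Doe"}
    )
    print(filled)
    print("Still blank:", missing)
```

Returning the list of unanswered blanks alongside the text lets a web form show the user what is left to complete rather than silently producing a half-finished document.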

In a way, what’s currently missing from the legal technology landscape is an equivalent of the Google Street View cars that drive around America, sucking in data and pictures from every conceivable point and capturing the physical world in digital form. Everyone seems to agree that the potential for a sea change in the legal profession is enormous. A number of technologies, from artificial intelligence like IBM Watson, to new search algorithms for case law, to advanced document automation programs, could dramatically alter the landscape of the profession itself.


