Transparency in Legal Publishing

At the end of the day on Friday, Westlaw disclosed that it had found over 600 errors in the text of print reporters it had published. The email alerting customers to the discovery included a list of all the affected cases. The mistake was “due to the introduction of an upgrade to our PDF conversion process in November 2014.”

Wait a minute…West is getting case law text by OCRing PDFs?  That seems weird.

  • Thought one:  OCR is not a perfect science, and in law, where the exact language used is so important, I would have thought “the big guys” were doing something like sending the text overseas to be double keyed (i.e., retyped twice so that discrepancies can be found) and then processing it into online and print publications.  I guess not.  It would be nice if West took their pledge of transparency all the way and completely explained their process for obtaining case law.
  • Thought two:  Why are they OCRing…are the courts sending them PDFs, or are they relying on scraping the court websites for case law because another publisher is responsible for that state’s legal publishing needs?  My hunch was that the latter was the real reason.  And if so, that’s really bothersome, because states just post slip opinions without any indication of later changes.  So I decided to see if I could prove that, but ended up finding something much more interesting.

I took the list of cases that had errors and started looking them up.  I stuck to just the regional reporters because that’s where the state material is published and that’s my focus this year.  The Northeastern Reporter mistakes were in Illinois and Ohio cases.  I had previously contacted the Ohio Reporter of Decisions to see how they distributed cases and was told that the publishers get the material from the website.  But, in the name of due diligence, I decided to check who publishes the official Ohio case reporter.  It turns out that it’s West, so there goes my theory about West relying on scraping because another publisher has the contract to publish case law.


The Ohio Reporter of Decisions posted their contract with West on the website.  (If for some reason this disappears from the web, I’ve downloaded a copy and will post it.)  The legal publication process from states to publisher is sort of murky, and here we have a bright shining light into the process.  It’s very exciting!  Some interesting bits:

Contrary to what I was told, it turns out that the court does send West copies of its decisions.  Presumably the final copy, not the possibly error-ridden slip opinions that the rest of us get on the web.  It is unclear how they do this…whether they just email PDFs or use some special uploading queue.



Why do they do this?  Well, of course someone has to publish the decisions.  But perhaps they chose West because it turns out that the Supreme Court gets a kickback in free books and greatly reduced Westlaw fees.

[Images: “print” and “electronic” contract excerpts.] (Red line under “specially reduced rate” added by me. Redaction of monetary amounts and numbers of volumes done in original.)

So, as a free law advocate, I see now that my ability to advocate to courts that they should make their cases freely available to all is greatly diminished by the fact that I can’t throw in free goods and services as part of the deal.  All I have on my side is the idea that the law should be free to access by all citizens.


  4 comments for “Transparency in Legal Publishing”

  1. April 21, 2016 at 6:40 pm

    Wow, that’s appalling. And probably illegal.

    I do like how even though West is showering the Ohio Supreme Court with goodies, they still can’t get an accurate, error-free version of the opinions out of them.

    Great sleuthing, Sarah. I’ll be sharing this post as far and as wide as I can.

  2. April 22, 2016 at 3:36 pm

    Doesn’t it seem strange that the Ohio Supreme Court still has to pay for Westlaw access under this agreement? I mean, the court is giving Westlaw the right to profit from its cases and Westlaw can’t even throw in a free subscription?

  3. May 2, 2016 at 7:58 pm

    You’re going to want to read California’s RFP on this:

    The Reporter of Decisions has the bulk of his staff paid for by the vendor that wins this contract. Our States are wholly dependent upon these contracts and wouldn’t have the staff to publish the law any other way without States prioritizing it and funding staff to do it.

  4. Peter W. Martin
    May 6, 2016 at 3:07 pm

    For a broader sample of state contracts, see:
    As Brian reports, California reaps substantial benefits from its “no cost” contract. So does New York.

    For analysis of the West data glitch suggesting that PDF not OCR was what led Thomson to stumble and drawing out some implications of the episode for the federal judiciary, see:
