And then someone mentions PACER and all bets are off.
I start talking about PACER and before I know it I turn into a wild-eyed ranting and raving hillbilly threatening to “break my foot off in the ass of whatever government contractor made that site and keeps it running.”
No, seriously, I accidentally went off on a PACER tangent once during a presentation and that’s pretty much an exact quote. To an audience of Canadians, no less.
PACER was down most of last Friday, although it’s unclear at this writing if it was a DDoS or an internal glitch. Unfortunately it came back, so I thought this might be a good time to write about it and get it out of my system. Because Good God, y’all, I really hate PACER.
PACER, by the way, is the Public Access to Court Electronic Records system, which manages the electronic filings for the United States Federal Court System (but not the US Supreme Court). The PACER acronym is funny not just because it is shared with one of the most ridiculous cars to come out of Detroit, but because the words it represents have ended up being pretty much the exact opposite of what it actually is.
Oh, wait, that’s not funny. That’s fucking tragic.
Lemme break it down for you…
PUBLIC. I always assumed that the Public in PACER was the same as the Public in Public Libraries and that all the information contained within it was free to use. That’s my librarian bias showing, I guess, but I still think it’s a valid interpretation. However, it’s actually more like the Public in Public Records, which means everyone is free to look at them.
Having many civil servants in my family, I know government agencies aren’t rolling in cash and there’s an argument to be made that charging for the cost of copying and accessing files keeps individuals from abusing the system and wasting resources. There’s a counter argument to be made, of course, that the agencies should budget for these requests and that punishing everyone who wants to access records created with their tax dollars, instead of creating a finite system to weed out waste, is inherently unfair.
And then there’s PACER, which charges a $.10/page fee for merely looking at a document on a computer screen. Your computer screen. Whether or not you decide to print it out on your paper on your copier using your toner. To the tune of $150 million in PROFIT a year. The Administrative Office of the U.S. Courts, the federal agency that runs PACER and the federal court system, tries to ameliorate this by not charging you if you stay under $15 of charges a quarter and limiting per-document costs to $3. (I’ll have a little more to say about this in the next section – ACCESS.) Actual court decision viewing is free, and there’s CourtWeb, which has some cases that some judges decide to put on there without any clear indication of the holdings and level content. There’s also apparently a way to get your PACER fees waived entirely (appearing in forma pauperis doesn’t guarantee it, interestingly) but I gave up on trying to find the procedure for that after about a half hour of looking.
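If you want to see how those numbers play out, here’s a minimal sketch in Python. It only encodes the three rules as I described them above (ten cents a page, a $3 cap per document, nothing billed if the quarter stays under $15). The function names are mine, not anything official, and it ignores the free stuff like court decisions.

```python
# Fee rules as described in this post, in cents to avoid floating-point fuzz.
# The names here are made up for illustration; this is not PACER's billing code.
PER_PAGE_CENTS = 10            # $.10 per page viewed
DOCUMENT_CAP_CENTS = 300       # no single document costs more than $3
QUARTERLY_WAIVER_CENTS = 1500  # a quarter that stays under $15 is not billed


def document_charge(pages):
    """Charge in cents for viewing one document of the given page count."""
    return min(pages * PER_PAGE_CENTS, DOCUMENT_CAP_CENTS)


def quarterly_bill(page_counts):
    """Total billed in cents for all documents viewed in one quarter."""
    total = sum(document_charge(pages) for pages in page_counts)
    return 0 if total < QUARTERLY_WAIVER_CENTS else total


# Five 12-page motions: $1.20 each, $6.00 total, under the threshold.
print(quarterly_bill([12] * 5))             # 0
# Add four 45-page briefs, each capped at $3.00: $6.00 + $12.00 = $18.00.
print(quarterly_bill([12] * 5 + [45] * 4))  # 1800
```

Notice that you pay for every one of those documents whether or not it turned out to be the one you needed.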
A few years ago there was a pilot project to make PACER free to use in government depository libraries, but then Aaron Swartz and Carl Malamud took them up on the offer and it was quickly closed down. Around this same time RECAP was developed. RECAP is a browser add-on that harvests the materials you look at on PACER and deposits them in the Internet Archive for all to use freely. (Of course, some courts make using RECAP a violation of their Terms of Service….) I applaud the efforts of the RECAP developers and contributors, although we must be realistic: it will take a huge buy-in from the public to create a usable database, and all it does is remove the cost aspect of PACER. Which brings me to….
ACCESS. Despite my government software induced outbursts, I am a well educated, comfortably middle class person. I am very comfortable using computers. One of my graduate degrees is actually IN managing information. I’m also not under the stress of litigation in the federal court system. Even still, with all that in my favor, I found the process of registering for a PACER account incredibly hard.
But there’s a larger problem. Access to information isn’t really access if it’s not meaningful access. The search functions on PACER are horrible. Incredibly useless. I mean, look at the ADVANCED search options…
There is a box for “nature of the suit,” but all that does is limit your results to case types carrying a particular taxonomy tag, and there’s no obvious definition of what those tags mean.
Print research is different than electronic research. In print research, you are looking FOR something. You go to an index or table of contents and are directed to areas of information in the info container (probably a book or series of books). A human being somewhere along the way has determined that these are the areas that have information actually about that subject. Not every single mention of it, but actually about it. Even if the specific word that you are using for the subject isn’t used. In a discipline like law which hinges on the use of a language that has changed significantly in the 200 years of its existence, this human editorial assistance is hugely important.
In electronic research, you are not trying to find something. You are trying to eliminate possibilities through search limiters. Back in my legal research professor days, I would have students try out Lexis or Westlaw for the first time and come to me excited because they found so many hits for their search. I only felt slightly bad about crushing their hopes and dreams. In electronic research, less is more. Especially when you have to pay a fee to look at each document to determine if it’s actually relevant. Which is why it’s incredibly unfair to have such a terrible and useless search interface for researchers to use.
(On a tangent, this is why I think an open legal taxonomy is needed and why full text searching of case law is currently not sufficient, although my colleague Elmer Masters has shown me some impressive “predictive cataloging” of case law using SOLR. One day it may work. Not now, though.)
But here’s the thing. PACER isn’t a research database. It’s not America’s CanLII, as much as we would wish it to be. It’s a PDF dumping ground for papers filed in federal courts. Every motion, brief, filing and ruling of every case in the federal courts, 99% of which is useless to the public at large. And at that, it excels. Which brings me to…
COURT ELECTRONIC RECORDS. PDFs aren’t electronic records, y’all.
Say it with me: PDFs aren’t electronic records.
PDFs are a print artifact. They are pieces of paper that live in your computer. Why does this matter? Because of a little thing called <drumroll> METADATA. You know it, you love it, you may not realize it, but it makes the world go ’round. Or not, in the case of PACER. When you are searching a commercial database and want to limit your search to the syllabus of the decisions, it’s not magic that it does that or that the computer “just knows” to do it. We haven’t reached the singularity….yet. It knows what part of the decision is the syllabus because someone attached some metadata to it saying “HEY. HEY YOU THERE. THIS PART? THIS IS THE SYLLABUS OF THE DECISION.”
That’s not what the code actually looks like, by the way. You know. Just in case you were wondering…
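For the curious, here’s a rough sketch in Python of the general idea. To be clear, this is my own toy illustration and not how any commercial database actually stores things: the section labels, the case names, and the text are all invented. The point is only that a search can be limited to the syllabus when somebody has labeled which part is the syllabus.

```python
# Each decision is stored as labeled sections instead of one undifferentiated blob.
# The labels ("syllabus", "opinion") and the sample text are hypothetical.
decisions = [
    {
        "case": "Example v. Sample",
        "sections": {
            "syllabus": "Held: the fee schedule was properly challenged.",
            "opinion": "The parties dispute many things, including copyright.",
        },
    },
    {
        "case": "Placeholder v. Illustration",
        "sections": {
            "syllabus": "Held: the copyright claim fails.",
            "opinion": "We address the fee schedule only in passing.",
        },
    },
]


def search(records, term, section=None):
    """Return case names mentioning term, optionally within one labeled section."""
    hits = []
    for record in records:
        if section is None:
            # No limiter: search every section, like a plain full-text search.
            text = " ".join(record["sections"].values())
        else:
            # Limiter: only look at the part someone labeled as this section.
            text = record["sections"].get(section, "")
        if term.lower() in text.lower():
            hits.append(record["case"])
    return hits


print(search(decisions, "copyright"))                      # both cases
print(search(decisions, "copyright", section="syllabus"))  # only the second case
```

A PDF gives you the first kind of search at best. The second kind only exists if the labels do.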
There are some hardy souls out there who are taking these PDFs and using something called Optical Character Recognition (or OCR) to put them in a truly electronic format. This will allow for metadata to be added to improve the searching capabilities of these materials. It’s a band-aid fix for something that needs welding. Which is not to say that they should stop, but we need to also recognize that OCRing can lead to inaccurate replication of text and again, in a discipline as language-specific as law, this is ultimately unacceptable.
I know legal information and research isn’t sexy. Currently most of the oxygen in the legal technology space is being consumed by people that want to “disrupt” it. I would love to disrupt the legal information part of legal technology, but before we can even begin to get to that point, we need to fix it. The inadequacies of PACER make it clear that the government is unable or unwilling to create usable legal information. Our only hope is that the courts realize this fact and make the information freely available, through projects like Court Cloud, so that others can take a stab at it.