Reports of the Death of Keyword Searching Are Greatly Exaggerated

This week on the Advanced Discovery blog I noted that several websites are recirculating an article, first released in January just before LegalTech New York, entitled “And the Judges Say: It’s Time to Adopt New Legal Technologies.” It’s an excellent piece that served as an introduction to a great LegalTech session with Judges Andrew J. Peck (S.D.N.Y.), James C. Francis (also S.D.N.Y.), Elizabeth D. Laporte (N.D. Cal.), and Pamela Meade Sargent (W.D. Va.), in which the judges discussed what they are currently seeing in their courts regarding big data, analytics, e-discovery and other technologies.

However, part of the article is a quote from an interview with Judge Peck in which he says, “I think there’s just too much data to try and do it the old fashioned way. That’s whether you’re talking really old fashioned with eyes-on-everything for review, or the still-old fashioned in my view use of keywords.”

With all due deference to Judge Peck, whom I respect and admire both professionally and personally, I’ve disagreed with that position in the past and I still disagree with it. Here is my reasoning.

Much of the lack of confidence in keyword searches is laid at the feet (or pen) of Judge John Facciola, in the case of United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008), with his famous quote about going where angels fear to tread. But that’s not exactly what Judge Facciola opined in that case. He actually dismissed a defendant’s objection to the adequacy of keywords used by the prosecution and ruled that a party challenging the efficacy of an opposing party’s search terms must do so through expert testimony.

His hesitancy was about the Court itself undertaking the complexity of search in the identification and production of electronically stored information. To that specific point, he stated that “[w]hether search terms or ‘keywords’ will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics. … Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.”

To be fair, in an earlier decision, Disability Rights Council of Greater Washington v. Washington Metro Transit Authority, 242 F.R.D. 139 (D.D.C. 2007), Judge Facciola did, in fact, state that concept searching is more likely to produce comprehensive results and is more efficient than keyword searches. However, there, as in O’Keefe, he questioned the litigants’ ability to demonstrate to him that the results were defensible.

I’ve noted numerous times in the past six months that despite the strong scientific evidence in support of technology-assisted review (TAR), numerous polls from organizations such as eDJournal and Kroll, as well as client surveys from firms such as Gibson Dunn and Norton Rose, show that the majority of people are NOT using TAR. And if they are not, it seems clear they must be using keyword searches.

This position was supported publicly last year at the Today General Counsel conference in New York by Gene Eames, Pfizer Inc.’s director of Search and Analytics in their Legal Division. Gene made it clear that he is strongly in favor of keyword searches IF the keywords and the results can be tested and validated. His point was that you use keywords to propagate a seed set for the eventual computer search, so why not use them on all the documents as a first pass?
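For readers who want to see what “tested and validated” can look like in practice, here is a minimal sketch in Python. It is only an illustration, not the method Gene or Pfizer uses: it assumes a small sample of documents that a human reviewer has already coded as relevant or not, and it measures how a keyword list performs against that coding. The function names (`keyword_hits`, `validate_keywords`) and the sample documents are hypothetical.

```python
import re

def keyword_hits(text, keywords):
    """Return True if any keyword appears in the text as a whole word, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(k) + r"\b", text, re.IGNORECASE)
        for k in keywords
    )

def validate_keywords(sample, keywords):
    """Score keywords against a sample of (text, is_relevant) pairs coded by a reviewer."""
    tp = fp = fn = 0
    for text, is_relevant in sample:
        hit = keyword_hits(text, keywords)
        if hit and is_relevant:
            tp += 1          # keyword found a relevant document
        elif hit and not is_relevant:
            fp += 1          # keyword pulled in an irrelevant document
        elif not hit and is_relevant:
            fn += 1          # keyword missed a relevant document
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}

# Hypothetical reviewer-coded sample, for illustration only.
sample = [
    ("Please review the merger agreement draft", True),
    ("Lunch on Friday?", False),
    ("The acquisition timeline slipped", True),
    ("Merger of the softball leagues", False),
]

first_pass = validate_keywords(sample, ["merger"])
second_pass = validate_keywords(sample, ["merger", "acquisition"])
```

In this toy sample the single term “merger” misses one relevant document and pulls in one irrelevant one, and adding “acquisition” recovers the missed document. Measuring, revising the terms, and measuring again on a properly drawn sample is the kind of loop that makes a keyword search defensible.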

Several people noted that pointing any search tool at all the data may be costly and inefficient. Co-chair David Kessler of Norton Rose Fulbright stated, “If I’m playing hide-and-seek with my kids and it’s my turn to seek, I’m not looking in the breadbox. They won’t be there because they don’t fit there.” That comment prompted Gene to recall a discussion he had with a federal judge about the best way to proceed in a search, in which he said that if he returned from a meeting at the courthouse to find he had lost his keys, he wouldn’t begin the search in Penn Station; he’d start in the lobby of the courthouse.

Everyone in the conversation did agree that the best practice is to bring some common sense to your search process. Technology is great, but it’s not an “Easy Button”, and the best technology for your project depends on a number of variables, including budget, time constraints and search needs. As Maura put it, “TAR is a process, not a product.”

The point about keywords was brought home again several weeks ago at the ASU Arkfeld EDiscovery Conference in Tempe, when I spoke on a panel about keyword searches to a standing-room-only crowd. As I said in my recap of that session, “…keyword search is far from dead and is probably still the most common search method used by eDiscovery staff. TAR may be up and coming, but keyword search is still king.”
