
Search in the Electronic Discovery Reference Model (EDRM)

when, where and why may matter

George Socha, Esq.
Gene Eames, Consultant

Context of Search Discussion


Electronic Discovery Reference Model (EDRM)

Context of Search Discussion: Assumptions

Textual search of electronically stored information ("ESI") is a necessary part of many business and legal endeavors.
Not all native computer applications offer adequate text search functionality.
Textual search of ESI requires special indexing of content for effective and efficient results (a minimal illustration follows below).
Not all text search tools or environments are the same.
There are other technologies used for analyzing ESI textual content, which may or may not be directly related to search.
The where, when and why of search requires consideration of the tools, environment, consequences and practicality of a search endeavor.
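As an illustration of the indexing point above (a minimal, hypothetical sketch; the documents, terms and function names are invented and not part of the presentation), an inverted index turns a keyword query into a dictionary lookup instead of a rescan of every file:

    from collections import defaultdict

    # Minimal inverted-index sketch: each term maps to the set of document IDs
    # containing it, so a keyword query becomes a lookup rather than a full scan.
    def build_index(documents):
        index = defaultdict(set)
        for doc_id, text in documents.items():
            for term in text.lower().split():
                index[term].add(doc_id)
        return index

    def search(index, *terms):
        """Return IDs of documents containing all of the given terms (AND query)."""
        sets = [index.get(t.lower(), set()) for t in terms]
        return set.intersection(*sets) if sets else set()

    # Hypothetical ESI fragments purely for illustration
    docs = {
        "doc1": "board approved the merger agreement",
        "doc2": "draft merger term sheet attached",
        "doc3": "lunch schedule for the week",
    }
    print(search(build_index(docs), "merger", "agreement"))   # {'doc1'}

Production e-discovery indexing does considerably more (tokenization rules, metadata fields, exception handling), which is one reason native application search is often inadequate for these purposes.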

Where

When
Post-Triggering Event
Pre-Review, Concurrent w/ Review
Post-Review
Normal Course of Business

Why

Knowledge Management

Assessment / Scoping

Review Determination / Review Efficiency

Fact Finding / Witness Prep

These distinctions matter


The why of your search should dictate the measure of your success.
The when and where may play a role in determining that measure of success.
The why may introduce legal implications, and the necessity for workflow and process enabling assessment of "reasonable" yet "proportional" results.
The when and where may or may not support a practical assessment of a "reasonable" yet "proportional" response.

How legal implications arise in search


Low Legal Consequence: Search used for fact finding or specific document identification will likely not require proof of the comprehensiveness of the search attempt, or documentation of a reasonable search process, if the document or specific fact is actually returned by the search.

High Legal Consequence: Search used to determine the ESI that will be reviewed and produced in an e-discovery event, and conversely that which will be left out of review, requires a documented process to enable improved accuracy and comprehensiveness of the search attempt.

Examples of low and high legal consequence search in the EDRM


Low legal consequence search
Knowledge management search: the measure of success is finding something helpful and of interest.
Case assessment search: the measure of success is accumulating information to make determinations on scope or approach in handling a matter; likely requires less scrutiny than operationalizing a process resulting from such assessment.
Review efficiency search: the measure of success is adding efficiency to the review of an already determined review population, where inaccuracy will not necessarily result in missing documents in production.

High legal consequence search
Review determination search: the measure of success is having comfort in the assessment of the comprehensiveness of the search result to withstand a defensibility challenge, balanced with an avoidance of over-inclusiveness.
Privilege search: as a form of review efficiency search, this has a special legal impact, as the potential for waiver of privilege may depend upon the reasonableness of the search process and results, so the measure of success is similar to review determination.

How the when and where can adversely affect search with legal implications
Search "in-place" can introduce issues related to dynamic search indices and data sets, affecting the comprehensiveness of the search attempt:
Are the search indices complete at the time of search?
Is the source data stable, or are there possibly changing file locations, pointers or links in the data or search index?
Is the scope of the search transparent?

The available tools may not account for all items sufficiently for search having legal implications.

Exceptions (a simple tally of these is sketched below):
Password protected
Corrupt
Processing/Extraction failures
Non-text file types
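As a rough sketch of how such exceptions might be accounted for rather than silently dropped (the processing log and reason labels below are hypothetical), the non-searchable portion of the universe can be tallied alongside the indexed portion:

    from collections import Counter

    # Hypothetical processing log: each item records whether its text was indexed
    # and, if not, why (password protection, corruption, unsupported file type, ...).
    processed_items = [
        {"id": "f001", "status": "indexed"},
        {"id": "f002", "status": "exception", "reason": "password protected"},
        {"id": "f003", "status": "exception", "reason": "corrupt"},
        {"id": "f004", "status": "exception", "reason": "non-text file type"},
        {"id": "f005", "status": "indexed"},
    ]

    indexed = sum(1 for item in processed_items if item["status"] == "indexed")
    exceptions = Counter(item["reason"] for item in processed_items
                         if item["status"] == "exception")

    print(f"Indexed and searchable: {indexed}")
    for reason, count in exceptions.items():
        print(f"Exception - {reason}: {count}")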

How the when and where can adversely affect search with legal implications
Available tools may lend themselves to one search objective more than another:
Ad hoc search (suitable for fact finding and assessment searching)
Process-driven search, allowing measurement of results at the macro and micro level (suitable for search with legal implications)
Process-driven search also allows for practical iteration of search attempts, offering opportunity for, and documentation of, criteria modification and validation of results (suitable for an effective search process involving legal implications)

Issues with Recall and Precision when using textual search on unstructured data:
Recall: search results are inclusive enough (defensibility)
Precision: search results are not over-inclusive (proportionality)

Ad hoc and process-driven search

Ad hoc search is a largely manual and interactive way of searching a data set in which a user constructs a single query with varying degrees of complexity and retrieves the results of that one query for immediate perusal.
Suitable for:
Research
Testing
Fact finding
Document location

Process-driven search allows scripting of a series of concurrent queries with varying degrees of complexity, recording the results to allow measurement of the interaction of individual queries and the overall impact of the set (a rough sketch follows below).
Suitable for:
Large numbers of queries
Iteration of entire sets of queries
Measure impact of sets of queries
Auditing, tracking and documentation of iterative search attempts
Practical sampling and validation opportunities
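As a rough sketch of the process-driven approach just described (the query terms, documents, matching logic and log format are hypothetical; a real tool would use its own index and query engine rather than substring matching), a scripted run records per-query hit counts and the combined impact of the set so iterations can be compared and documented:

    import csv

    def run_query_set(documents, queries, log_path="search_log.csv"):
        """Run every query, log per-query hit counts, and return the combined hits."""
        overall_hits = set()
        rows = []
        for query in queries:
            hits = {doc_id for doc_id, text in documents.items()
                    if query.lower() in text.lower()}   # stand-in for a real query engine
            overall_hits |= hits
            rows.append({"query": query, "hits": len(hits)})
        with open(log_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["query", "hits"])
            writer.writeheader()
            writer.writerows(rows)
        return rows, overall_hits

    # Hypothetical data purely for illustration
    docs = {"d1": "merger agreement draft", "d2": "quarterly earnings call", "d3": "merger press release"}
    per_query, combined = run_query_set(docs, ["merger", "earnings", "indemnification"])
    print(per_query)
    print(len(combined), "of", len(docs), "documents hit by the query set")

Running the same script against a revised query set produces a comparable log, which is the kind of auditing and iteration documentation listed above.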

Example of documentation from process-driven search

Considerations for search having legal implications


Assess comprehensiveness of search attempt:
Indices are stable
Search universe is defined and complete
Text searchable and non-text searchable files are accounted for

Assess effectiveness of search criteria:
Good Precision = High Responsive Rate
Good Recall = Fewer Missed Items in Review
A balance between Precision and Recall will provide more responsive documents with fewer responsive items missed (a worked example follows below).
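As a worked example of the two measures (the counts below are hypothetical): precision is the fraction of retrieved documents that are actually responsive, and recall is the fraction of all responsive documents that the search retrieved.

    # Hypothetical counts for a keyword search against a reviewed collection
    retrieved = 10_000             # documents hit by the search terms
    responsive_retrieved = 6_000   # of those, documents actually responsive
    responsive_total = 8_000       # responsive documents in the whole collection

    precision = responsive_retrieved / retrieved        # 0.60 -> proportionality
    recall = responsive_retrieved / responsive_total    # 0.75 -> defensibility

    print(f"Precision: {precision:.0%} of what is reviewed is responsive")
    print(f"Recall: {recall:.0%} of what is responsive was found")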

Assuming all docs in collection reviewed
(Each of the following scenarios was illustrated with a Venn diagram showing: Collection, Actual Responsive, Actual Privileged, and Search Term Results.)

Good Precision / Poor Recall
Under-inclusive search. Good candidate for a defensibility challenge. Not an unduly expensive, but nonetheless incomplete, review scenario.

Good Recall / Poor Precision
Over-inclusive search. Less likely candidate for a defensibility challenge. Unduly expensive review scenario.

Poor Recall / Poor Precision
Under-inclusive and over-inclusive search. Good candidate for a defensibility challenge. Unduly expensive and incomplete review scenario.

Good Recall / Good Precision
Targeted search. Unlikely candidate for a defensibility challenge. Right-sized review scenario as to cost and efficiency.

Case Law
Victor Stanley Inc. v. Creative Pipe Inc.
250 F.R.D. 251 (D. Md. 2008)

Magistrate Judge Paul Grimm


NOTE: This case specifically relates to a privilege screen search effort, but the principles will apply to all ESI search efforts

"all keyword searches are not created equal; and there is a growing body of literature that highlights the risks associated with conducting an unreliable or inadequate keyword search ... [parties] do not assert that any sampling was done of the text searchable ESI files to see if the search results were reliable ... Common sense suggests that even a properly designed and executed keyword search may prove to be overinclusive or under-inclusive ... [t]he only prudent way to test the reliability of the keyword search is to perform some appropriate sampling of the documents in order to arrive at a comfort level that the categories are neither overinclusive nor under-inclusive." [excerpted and highlighted for emphasis]

Judge Grimm's Victor Stanley Case

"the only prudent way to test the reliability of the keyword search is to perform some appropriate sampling of the documents" - Judge Grimm

[Venn diagram: Collection, Actual Responsive, Actual Privileged, Search Term Results]
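A minimal sketch of the kind of sampling Judge Grimm describes (the population sizes, sample size and the stand-in review function below are all hypothetical): draw a random sample of the search hits to estimate precision, and a sample of the non-hits to estimate how much responsive material the terms missed.

    import random

    random.seed(42)

    # Hypothetical populations: document IDs hit and not hit by the search terms.
    hits = [f"hit-{i}" for i in range(5_000)]
    non_hits = [f"miss-{i}" for i in range(20_000)]

    def attorney_review(doc_id):
        """Placeholder for a human responsiveness call on one sampled document."""
        return random.random() < 0.3

    def sampled_responsive_rate(population, sample_size=400):
        sample = random.sample(population, sample_size)
        return sum(attorney_review(d) for d in sample) / sample_size

    precision_estimate = sampled_responsive_rate(hits)       # responsive rate among hits
    elusion_estimate = sampled_responsive_rate(non_hits)     # responsive rate among non-hits

    print(f"Estimated precision of the search hits: {precision_estimate:.0%}")
    print(f"Estimated responsive rate among non-hits (elusion): {elusion_estimate:.0%}")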

Case Law
In Re Fannie Mae Securities Litigation,
552 F.3d 814 (D.C. Cir. 2009)

Circuit Judge Tatel


Government agency (OFHEO) spends $6M on review and is subsequently held in contempt for not meeting production deadlines. After the fact, a dispute arises over a stipulated order stating:

OFHEO will work with the Individual Defendants to provide the necessary information (without individual document review) to develop appropriate search terms.
OFHEO did not participate in the determination of appropriate search terms, but rather allowed the opposing party to use untested search terms, presumably assuming it was the defendants' responsibility to determine appropriateness. This resulted in a large volume of documents to review, which proved very costly, and OFHEO in fact did not complete the task. The court disagreed with the assertion that the defendants were limited by the stipulated order's language related to appropriateness.

"direct[s] OFHEO and the individual defendants to work together, but only to facilitate OFHEO's provision of information to assist in developing search terms"

"requires OFHEO to provide more: it must furnish that information necessary to formulate search terms that are not just minimally sufficient, but actually appropriate to the task of retrieving relevant documents."

In Re: Fannie Mae

Over-inclusive search. This caused OFHEO to spend large sums on review. OFHEO had an opportunity, per the stipulated order, to provide information that might have resulted in a more targeted review.

[Venn diagram: Collection, Actual Responsive, Actual Privileged, Search Term Results]

Case Law
William A. Gross Construction Associates, Inc. v. American Manufacturers Mutual Insurance Co.
256 F.R.D. 134 (S.D.N.Y. 2009)
Judge Andrew Peck

"where counsel are using keyword searches for retrieval of ESI, they at a minimum must carefully craft the appropriate keywords, with input from the ESI's custodians as to the words and abbreviations they use, and the proposed methodology must be quality control tested to assure accuracy in retrieval and elimination of false positives. It is time that the Bar - even those lawyers who did not come of age in the computer era - understand this."

Case Law
United States v. O'Keefe,
537 F. Supp. 2d 14 (D.D.C. 2008)

Judge John Facciola

"for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread ... if defendants are going to contend that the search terms used by the government were insufficient, they will have to specifically so contend in a motion to compel and their contention must be based on evidence that meets the requirements of Rule 702 of the Federal Rules of Evidence"

To liberally paraphrase Judge Facciola, one might say "prove it." If parties want to contend that particular search terms are sufficient or insufficient, a formalized search process capable of documentation, measurement, and validation should be available, accompanied by individuals well-versed in such information retrieval efforts.
- Daegis EDAC

What Industry Thought Leaders Say


Text Retrieval Conference (TREC) Legal Track
TREC Legal Track, sponsored by NIST, is a litigation-focused research project that explores the limitations of search technology in litigation. The ultimate goal is to develop objective criteria for comparing methods of searching large collections of documents in civil litigation. TREC Topic Authorities have made the following five observations:
1. Information retrieval in the legal context is a difficult task that requires expertise.
2. In order to do well, information retrieval requires the combined efforts of a team possessing varied skill sets.
3. Relevance determinations reflect highly subjective judgment calls, made by a particular lawyer or legal team, in light of the demands of a particular information request, at a particular point in time.
4. Information retrieval in the legal context is not a mere search-engine problem, and technology, alone, will not solve it.
5. It is necessary to spend time with the information set in order to understand it.

What Industry Thought Leaders Say


Sedona Conference
[iterative feedback opportunities] allow integration of what a case team learns after each exercise of the process in order to calibrate and maximize the technology's capability to identify relevant information. It is through this feedback that case teams will acquire sound information to use in making both strategic and tactical decisions.

[sampling techniques] can help the review team to rank the precision and recall of various terms or concepts.

defensibility in court will very likely depend on the implementation of, and adherence to, processes developed for use with a search and retrieval technology.
Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery (08/2007)

What Industry Thought Leaders Say


EDRM Search Guide (January 20, 2009)
ways to improve retrieval effectiveness are to iterate multiple times with newer search queries

There are several validation methodologies that should be used throughout the development of the search criteria to be used for selection of documents for attorney review. [Many involve the case team] reviewing samples of documents to determine litigation relevance to classify documents as Responsive or Not Responsive to the issues of the case and therefore increasing the precision of the search results.

Questions

Thank you for your time.
