
1 Technology Assisted Review: A Kissing Cousin to Autocategorization?
February 27, 2013
Bennett B. Borden, Conor R. Crowley, Wendy Butler Curtis

2 Agenda
- How is TAR similar to auto-classification?
- How does TAR differ from auto-classification?
- Why do lawyers disagree over the use of TAR?

3 Why?
Managing information tells us WHAT we have and WHERE it is, so we can:
- Be better prepared to respond to litigation
- Reduce production costs during discovery
- Comply with legislative and regulatory mandates
- Increase staff productivity
- Improve client service
- Reduce records storage costs
- Reduce liability insurance premiums

4 Routine Tasks?
- Retention & Disposition
- Legal Holds/Protective Orders
- File Transfers/Lateral Movement of Attorneys
- KM/Precedents
- BI

5 Existing Tools Are Not Enough
- No one content repository meets all needs: DMS, RMS, litigation support, extranets, portals, etc.
- The nightmare of email: attorneys live in Outlook, but Outlook is not designed to be a records management solution
- Archives are not designed to easily address information lifecycle management

6 From Reactive to Proactive
Using what we’ve learned in eDiscovery to Govern Information Better

7 #1 Problem: Too much ungoverned information

8 Data Volumes Continue to Grow
Analyst estimates and research show that:
- 1,200 exabytes of new data will be generated each year
- Enterprise data will grow 650% over the next 5 years
- 80% of this data will be unstructured, generated from a variety of sources such as blogs, web content, email, etc.
- 70% of this data is stale after ninety days

9 Unknown Information Has No Value

10 How Did We Get Here?
- Years of information overload build-up
- Many different systems in use, from email to file shares, from portals to content management systems
- No one ever cleans up when they leave
- Challenges in getting buy-in from firm management

11 Gaps Between Expectations & Practice

12 Humans Aren’t Good at Classification
Manual categorization has an average accuracy of only 60%

13 Using Predictive Technologies
Software “that use[s] sophisticated algorithms to enable the computer to determine relevance, based on interaction with (i.e., training by) a human reviewer.” - Da Silva Moore v. Publicis Groupe, 2012 U.S. Dist. LEXIS (S.D.N.Y. Feb. 24, 2012) (Peck, J.).
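The quoted definition describes ordinary supervised text classification. Below is a minimal sketch of the idea, assuming a TF-IDF plus logistic-regression pipeline from scikit-learn; the library, documents, and labels are illustrative assumptions, not anything the court or a particular TAR vendor prescribes.

```python
# A minimal sketch of "training by a human reviewer": a classifier learns
# relevance from coded examples, then scores the remaining corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical seed set: documents a human reviewer has already coded.
seed_docs = ["quarterly revenue forecast attached", "lunch menu for friday",
             "contract amendment draft for review", "holiday party invite"]
seed_labels = [1, 0, 1, 0]  # 1 = relevant, 0 = not relevant

vectorizer = TfidfVectorizer(stop_words="english")
model = LogisticRegression()
model.fit(vectorizer.fit_transform(seed_docs), seed_labels)

# Score the unreviewed corpus; a higher probability means "more likely relevant".
corpus = ["revised contract terms attached", "parking garage closed tuesday"]
scores = model.predict_proba(vectorizer.transform(corpus))[:, 1]
for doc, score in zip(corpus, scores):
    print(f"{score:.2f}  {doc}")
```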

14 How Does it Work?
1. Index content: file systems, archives, ECM systems, RM repositories, SharePoint
2. Categorize: automatic or manual; a learning system trained by example
3. Apply policies: multi-action, rules-based policies such as copy, move, delete, lock, shortcut (a minimal policy sketch follows)
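A sketch of the final "apply policies" step, assuming a simple table that maps categories to file actions. The category names, actions, and read-only "lock" are hypothetical stand-ins for what a commercial policy engine would do.

```python
# Once content is categorized, rules map each category to one or more actions.
import shutil
from pathlib import Path

# Hypothetical multi-action policy table.
POLICIES = {
    "record":     ["copy"],    # copy into the records repository
    "transitory": ["delete"],  # dispose of short-lived content
    "on_hold":    ["lock"],    # preserve for litigation hold
}

def apply_policy(path: Path, category: str, archive: Path) -> None:
    """Apply every action the policy table assigns to this category."""
    for action in POLICIES.get(category, []):
        if action == "copy":
            shutil.copy2(path, archive / path.name)
        elif action == "delete":
            path.unlink()
        elif action == "lock":
            path.chmod(0o444)  # crude read-only "lock", for illustration only
```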

15 Merges categorization methods
Manual categorization
- Great for small data sets
- Creates the best “training” data sets
Rules-based
- Great for eliminating the obvious stuff
- Powerful when content has good metadata
- Can be used to enhance “training” data
Supervised learning
- Locates errors in rules and manual categorization
- Offers the highest levels of precision and recall
- Is not dependent on metadata
(One way to merge the three is sketched below.)
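A hedged sketch of the merge: rules run first to catch the obvious content, and a supervised model (for example, the one sketched under slide 13) handles whatever the rules miss. The rule patterns and category names are invented for illustration.

```python
# Rules are cheap and transparent; the learned model covers the rest.
import re

RULES = [
    (re.compile(r"\b(invoice|purchase order)\b", re.I), "accounting"),
    (re.compile(r"\bretention schedule\b", re.I), "records"),
]

def categorize(text: str, model, vectorizer) -> str:
    """model/vectorizer: an already-fitted supervised classifier pipeline."""
    # 1. Rules first: great for eliminating the obvious stuff.
    for pattern, category in RULES:
        if pattern.search(text):
            return category
    # 2. Fall back to the model trained on manual (and rule-derived) labels.
    return model.predict(vectorizer.transform([text]))[0]
```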

16 Supervised Learning
1. Human categorization of sample content
2. “Training” algorithm is run against the category
3. Computer suggests additional content for the category
4. Reviewers confirm or correct the “suggested” content
5. Remaining content is auto-categorized
(One way to implement this loop is sketched below.)
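A minimal sketch of those five steps as a simple active-learning loop. The model, vectorizer, human-review callback, round count, and batch size are all hypothetical placeholders; commercial tools decide when training is "stable" with sampling-based metrics rather than a fixed round count.

```python
def supervised_learning_loop(model, vectorizer, labeled, unlabeled,
                             review_fn, rounds=5, batch_size=50):
    """labeled: list of (text, label) pairs; review_fn: the human reviewer."""
    for _ in range(rounds):
        texts, labels = zip(*labeled)
        model.fit(vectorizer.fit_transform(texts), labels)  # step 2: train
        # Step 3: the computer "suggests" the content it scores most relevant.
        suggested = sorted(
            unlabeled,
            key=lambda d: -model.predict_proba(vectorizer.transform([d]))[0, 1],
        )[:batch_size]
        # Step 4: a human confirms or corrects each suggestion.
        for doc in suggested:
            labeled.append((doc, review_fn(doc)))
            unlabeled.remove(doc)
    # Step 5: the trained model auto-categorizes whatever remains.
    return model
```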

17 How It Is Being Used
- Data remediation
- Classification in repositories
- Classification upon creation

18 The IGRM (Information Governance Reference Model)

19 The Business Knows the Value
“The line of business has an interest in information proportional to its value, the degree to which it helps drive the profit or purpose of the enterprise itself. Once that value expires, the business quickly loses interest in managing it, cleaning it up, or paying for it to be stored.”

20 Legal and RIM
“…it is the legal department’s responsibility to define what to put on hold and what and when to collect data for discovery. Likewise, it is RIM’s responsibility to ensure that regulatory obligations for information are met including what to retain and archive for how long…”

21 Expectations of RIM
√ Establish the foundation of good RIM policy: many have formal policies, but only 43% are confident that their retention schedule is legally credible.
√ Manage information consistently: 83% are unable even to locate hardcopy records when needed. What about managing them?

22 The Role of IT
“IT stores and secures information under their management. Of course their focus is efficiency and they’re typically under huge pressure to increase efficiency and lower cost… …IT doesn’t know and can’t speak to what information has value or what duties apply to specific information.”

23 TAR Case Law

24 Open the Kimono Wide: Da Silva Moore
Da Silva Moore v. Publicis Groupe et al., 2012 WL (S.D.N.Y. Feb. 24, 2012), adopted, 2012 WL (S.D.N.Y. April 26, 2012)
- Regarded as the first published judicial opinion approving the use of predictive coding as a valid and defensible method of identifying relevant documents for production
- Predicates its approval of predictive coding on the great transparency offered by the producing party as to its relevance determinations, and on the requesting party’s ability to refine those decisions

25 Major Aspects of the Da Silva Moore Protocol
- The requesting party is entitled to suggest key search terms to segregate documents used to create the seed set
- The producing party shall provide “All of the documents that are reviewed as a function of the seed set, whether [they] are ultimately coded as relevant or irrelevant, aside from privilege…”
- The producing party will disclose all relevance coding for its seed set
- The parties shall meet and confer about any documents in the seed set that the requesting party believes were incorrectly coded
- The requesting party shall have similar input at each wave of the iterative “training” process

26 Potential Limiting Factors for Da Silva Moore
- Employment dispute means that Plaintiff ex-employees are familiar with Defendant’s internal “jargon” and information repositories
- Both sides employed e-discovery consultants and were well funded
- Both sides agreed, in general principle, to the validity of predictive coding as a culling tool (but disagreed on its implementation)
- Predictive coding implemented through an extremely detailed protocol

27 Kleen Products, LLC, et al. v. Packaging Corp. of Amer., et al.
Kleen Products, LLC, et al. v. Packaging Corp. of Amer., et al., Case No. 1:10-cv-05711, Document #412 (N.D. Ill. 2012)
- Plaintiffs moved to compel, seeking to require a redo of search and production using predictive coding
- Defendants had used sophisticated iterative Boolean keyword searches with QC sampling
- Plaintiffs argued Defendants’ approach would capture only 25% of responsive documents, while predictive coding would find 75%
- Two days of evidentiary hearings and numerous other conferences

28 Kleen Products, LLC, et al. v. Packaging Corp. of Amer., et al.
Kleen Products, LLC, et al. v. Packaging Corp. of Amer., et al., Case No. 1:10-cv-05711, Document #412 (N.D. Ill. 2012)
- No ruling, but Judge Nolan indicated her preference for Sedona Principle 6: responding parties are best situated to evaluate the procedures, methodologies, and techniques appropriate for preserving and producing their own electronically stored information
- Settled by accepting Defendants’ methods of search until requests served after Oct. 1, 2013

29 Global Aerospace, Inc. v. Landow Aviation, L.P. et al.
Global Aerospace, Inc. v. Landow Aviation, L.P. et al., Case No. CL (Va. Cir. Ct. April 23, 2012)
- Second court to permit the use of predictive coding
- Plaintiffs objected to the use of predictive coding, arguing that it is not as effective as human review
- During the hearing, the judge recognized that the producing party selects the review methodology; the receiving party can then raise issues if it does not get what it thinks it should have in the litigation, as in any other discovery scenario
- Order allowed plaintiffs to later object to “the completeness of the contents of the production or the ongoing use of predictive coding”

30 In Re: Actos (Pioglitazone) Products Liability Litigation (W.D. La.)
In Re: Actos (Pioglitazone) Products Liability Litigation (W.D. La., July 27, 2012)
- The plaintiffs allege that Actos, a prescription drug for the treatment of type 2 diabetes, increases the risk of bladder cancer in patients
- On July 27, 2012, the court entered a Case Management Order outlining the electronically stored information (ESI) protocol the parties must follow during discovery
- The CMO outlines a “Search Methodology Proof of Concept” to examine the performance of the defendant’s e-discovery provider’s predictive coding tool for the review and production of documents in this matter
- The parties agreed to collect documents from four of 29 custodians named by defendant Takeda Pharmaceuticals; these four custodians, added to a set of regulatory documents, will form the “sample collection population”

31 In Re: Actos (Pioglitazone) Products Liability Litigation (W.D. La.)
In Re: Actos (Pioglitazone) Products Liability Litigation (W.D. La., July 27, 2012)
- A 500-document random “control set” will be created from the culled collection (see the sampling sketch after this slide)
- Three “experts” nominated by each side will jointly review the control set; Plaintiffs’ “expert” reviewers are required to sign a non-disclosure agreement
- The CMO demands a high degree of cooperation
- Defendant’s experts are permitted to pre-screen the control document set for privileged material, and will either remove or redact privileged documents before the control set is reviewed by the panel of experts
- The parties’ experts will work collaboratively to determine the relevance of the non-privileged and privilege-redacted documents
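A hedged sketch of what drawing and using such a control set can look like: a reproducible random sample, plus a normal-approximation estimate of collection richness from the experts' relevance calls. The interval method and the counts are illustrative assumptions, not the CMO's prescribed math.

```python
import math
import random

def draw_control_set(collection, size=500, seed=42):
    """Draw a random control set; the fixed seed makes the draw reproducible."""
    rng = random.Random(seed)
    return rng.sample(collection, size)

def prevalence_estimate(n_relevant, n_sampled, z=1.96):
    """Point estimate and ~95% margin of error for corpus richness."""
    p = n_relevant / n_sampled
    moe = z * math.sqrt(p * (1 - p) / n_sampled)
    return p, moe

# Hypothetical outcome: 60 of the 500 control documents coded relevant.
p, moe = prevalence_estimate(60, 500)
print(f"estimated richness: {p:.1%} ± {moe:.1%}")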

32 In Re: Actos (Pioglitazone) Products Liability Litigation (W.D. La.)
In Re: Actos (Pioglitazone) Products Liability Litigation (W.D. La., July 27, 2012)
- Following the review of the control set, the experts will review random-sample training sets of 40 documents each, which the system will select using an active learning approach; this process will continue until a “stable” training status is reached
- The parties will meet and confer about which relevance score will provide a cutoff yielding a proportionate set of documents to be manually reviewed by Takeda for production (see the cutoff sketch after this slide)
- All documents above the agreed-upon relevance score in the sample collection population will be reviewed by Takeda
- The CMO provides for meet-and-confers throughout the process, including a post-predictive-coding sampling meet-and-confer to “finalize the search methodology on a going forward basis”
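A sketch of the arithmetic behind the cutoff negotiation, assuming the parties hold a reviewed random sample with model scores: for each candidate cutoff, estimate the recall achieved and the volume left for manual review. All scores and labels below are invented.

```python
def cutoff_stats(sample, cutoff):
    """sample: list of (score, is_relevant) pairs from a reviewed random sample."""
    above = [s for s in sample if s[0] >= cutoff]
    relevant = [s for s in sample if s[1]]
    caught = [s for s in above if s[1]]
    recall = len(caught) / len(relevant) if relevant else 0.0
    return recall, len(above)

# Hypothetical reviewed sample: (relevance score, expert relevance call).
sample = [(0.91, True), (0.75, True), (0.62, False),
          (0.55, True), (0.30, False), (0.12, False)]
for cutoff in (0.5, 0.7):
    recall, volume = cutoff_stats(sample, cutoff)
    print(f"cutoff {cutoff}: est. recall {recall:.0%}, {volume} docs to review")
```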

33 EORHB, Inc., et al. v. HOA Holdings, LLC, C.A. No. 7409-VCL (Del. Ch.)
EORHB, Inc., et al. v. HOA Holdings, LLC, C.A. No. 7409-VCL (Del. Ch. Oct. 15, 2012)
- Complex multimillion-dollar commercial indemnity dispute involving the sale of Hooters
- “Why don’t you all talk about a scheduling order for the litigation on the counterclaims. This seems to me to be an ideal non-expedited case in which the parties would benefit from using predictive coding. I would like you all, if you do not want to use predictive coding, to show cause why this is not a case where predictive coding is the way to go.”

34 Gabriel Tech. Corp., et al. v. Qualcomm Inc., 2013 WL 410103 (S.D. Cal.)
Gabriel Tech. Corp., et al. v. Qualcomm Inc., 2013 WL 410103 (S.D. Cal. Feb. 1, 2013)
- Complex patent dispute
- Defendants won on summary judgment and sought attorneys’ fees:
  - $10,244,053 for Cooley LLP attorneys
  - $391, for document review by Black Letter Discovery, Inc.
  - $2,829, for fees associated with the document review algorithm generated by H5
- Court awarded all fees sought and specifically found “Cooley’s decision to undertake a more efficient and less time-consuming method of document review to be reasonable under the circumstances”

35 Why Do Lawyers Disagree Over the Use of TAR?
- They don’t understand it
- Concerns about efficacy
- Concerns about cost/changes to the law firm billing model
- Level of disclosure required

36 Potential Disclosures
- Size of document corpus
- Size of seed set
- Seed set selection method
- Contents of seed set
- Expert selected to review seed set
- Methodology for analysis of seed set documents deemed relevant
- Sampling methodology with respect to documents deemed relevant/non-relevant by TAR
- Sample of documents deemed not relevant by TAR
- Levels of precision and recall achieved, and the confidence level for the precision/recall metrics (a computation sketch follows this list)
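For the last bullet, a minimal sketch of how precision and recall are computed from a reviewed validation sample. The counts are hypothetical; an actual disclosure would also state the sampling design behind the quoted confidence level.

```python
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)  # share of TAR-flagged docs that are truly relevant
    recall = tp / (tp + fn)     # share of relevant docs that TAR actually found
    return precision, recall

# Hypothetical validation review: 180 true positives, 20 false positives,
# and 45 relevant documents that TAR missed.
p, r = precision_recall(tp=180, fp=20, fn=45)
print(f"precision {p:.0%}, recall {r:.0%}")  # precision 90%, recall 80%
```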

37 Thank You!
Bennett B. Borden, bborden@williamsmullen.com
Conor R. Crowley
Wendy Butler Curtis

