The Next Frontier in TAR: Choose Your Own Algorithm

1 The Next Frontier in TAR: Choose Your Own Algorithm
LegalTech New York 2017
Presented by:
Dr. David Grossman, Georgetown University
Tara Emory, Esq., PMP, Director of Consulting for Driven, Inc.

2 Introduction
What’s inside TAR?
The Role of the Algorithm
Leveraging Algorithms to Improve TAR
Discovery Process Implications
Q&A

3 What’s inside TAR?
Workflow
Software
Algorithm

4 What’s inside TAR
One software product = one algorithm
Attempts to compare algorithms and products do not isolate workflow vs. software vs. algorithm
Other variables include the document set and the nature of what you need to find

5 Prior work
TAR vs. keywords and manual review
Teams with different workflows and software on same issues with same documents
Industry tests of one software vs. another on same issues and same documents
Different algorithms on same issues with same documents
Next Frontier: Different algorithms on same documents, compared to same algorithms on different documents

6 The Role of the Algorithm
Prior work
  TAR vs. keywords and manual review
  Teams with different workflows and software on same issues with same documents
  Industry tests of one software vs. another on same issues and same documents
  Different algorithms on same issues with same documents
Next Frontier: Which algorithms work best for different types of cases?

7 The Role of the Algorithm
TREC Issue    Winner at 15%                                          Recall at 15%
201           XGBoost CV-Binary                                      92
202           LSI                                                    98
203           Logistic Regression – log TF-IDF and LSI log TF-IDF    97
207                                                                  94
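
The slide does not define "Recall at 15%". One common reading, assumed here, is the share of all relevant documents found after reviewing the top 15% of a ranked collection. The sketch below computes that figure; the function name `recall_at_depth`, the scores, and the labels are hypothetical and are not taken from the TREC results above.

```python
def recall_at_depth(scores, labels, depth=0.15):
    """Recall after reviewing the top `depth` fraction of ranked documents.

    scores: model scores, higher means more likely relevant.
    labels: 1 for relevant, 0 for not relevant (same order as scores).
    """
    if not 0 < depth <= 1:
        raise ValueError("depth must be in (0, 1]")
    total_relevant = sum(labels)
    if total_relevant == 0:
        raise ValueError("no relevant documents in the collection")
    # Rank documents from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    # Number of documents reviewed at this depth (at least one).
    cutoff = max(1, round(len(ranked) * depth))
    found = sum(label for _, label in ranked[:cutoff])
    return found / total_relevant


# Hypothetical collection: 20 documents, 4 of them relevant.
scores = [0.95, 0.90, 0.85, 0.40, 0.35, 0.30, 0.25, 0.20, 0.18, 0.15,
          0.12, 0.10, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(recall_at_depth(scores, labels, depth=0.15))  # top 3 documents reviewed
```

Running the same function over the rankings produced by different algorithms on the same documents is one way to compare them at a fixed review budget.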

8 Leveraging Algorithms to Improve TAR
No “best” algorithm for all cases
Success of different algorithms varies by:
  Size of document set
  Prevalence of responsiveness in set
  Amount of review appropriate for case
  Availability of best examples to train
  Broad vs. narrow topics
  Single vs. many issues

9 Discovery Process Implications
How should attorneys adapt to new understandings of TAR algorithms?
How does an attorney judge what is reasonable?
Should algorithm selection be included in discovery negotiations?
Could this be another point of disagreement between opposing parties?

10 Questions

11 Supplement

12 Workflow
Sampling (in some workflows)
Seed set
Training
Validation (in some workflows)

13 Technologies: LSI/LSA
Latent Semantic Indexing/Analysis
Find relationships between:
  Words – words
  Words – topics
  Topics – documents
Map into semantic space
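
As a rough illustration of mapping documents into a semantic space, the sketch below builds a term-document count matrix and projects each document onto its two strongest latent "topic" directions. LSI is normally computed with a truncated singular value decomposition; to stay within the standard library, this sketch substitutes power iteration with deflation on the document-by-document Gram matrix, which recovers the same leading directions for this small example but can fail on inputs where the fixed starting vector happens to be orthogonal to a leading direction. All function names and the five documents are hypothetical.

```python
import math


def term_document_matrix(docs):
    """Rows are vocabulary terms, columns are documents, cells are counts."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    matrix = [[doc.split().count(term) for doc in docs] for term in vocab]
    return vocab, matrix


def gram_matrix(A):
    """Document-by-document matrix A^T A of shared-term counts."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)]
            for i in range(n)]


def top_eigenpairs(G, k, iters=300):
    """Leading k eigenpairs of a symmetric matrix via power iteration."""
    n = len(G)
    G = [row[:] for row in G]  # work on a copy; deflation modifies it
    pairs = []
    for _component in range(k):
        v = [1.0 / math.sqrt(n)] * n  # fixed start (an assumption, see lead-in)
        for _step in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(G[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before finding the next.
        for i in range(n):
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return pairs


def document_coordinates(docs, k=2):
    """Coordinates of each document in a k-dimensional latent space."""
    _vocab, A = term_document_matrix(docs)
    pairs = top_eigenpairs(gram_matrix(A), k)
    return [[math.sqrt(max(lam, 0.0)) * v[j] for lam, v in pairs]
            for j in range(len(docs))]


def cosine(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


docs = [
    "pricing competitor agreement",
    "pricing competitor meeting",
    "competitor agreement meeting",
    "lunch menu friday",
    "lunch party friday",
]
coords = document_coordinates(docs, k=2)
print(cosine(coords[0], coords[2]))  # same latent topic
print(cosine(coords[0], coords[3]))  # different latent topics
```

In this example the first three documents overlap only partially in wording, yet they land on the same latent direction, while the two lunch-related documents land on the other.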

14 Logistic Regression
Just like linear regression, except we fit an S-shaped curve instead of a line.
The output is the probability of relevance.
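
To make the curve-fitting idea concrete, the sketch below fits a one-feature logistic model by plain gradient descent. The single feature (the count of some keyword in each document), the training data, the learning rate, and the function names are all assumptions chosen for illustration; production TAR tools use many features and more robust optimizers.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and intercept b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def probability_of_relevance(x, w, b):
    return sigmoid(w * x + b)


# Hypothetical training data: keyword count per document, 1 = relevant.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
for count in (0, 3, 7):
    print(count, round(probability_of_relevance(count, w, b), 3))
```

Unlike a straight line, the fitted curve never leaves the 0 to 1 range, so its output can be read directly as a probability and used to rank documents for review.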

15 Technologies: Bayesian Probability
Bayesian Probability/Naïve Bayes
Probabilistic
Identifies the probability that a document matches a category, based on the words in example documents
Each word contributes independently to the likelihood
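
The independence assumption can be sketched as a small multinomial Naïve Bayes classifier: each word in a document adds its own log-probability to the score for a category, regardless of the other words. The add-one (Laplace) smoothing, the function names, and the example documents below are assumptions made for illustration.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Count documents per category and word occurrences per category."""
    vocab = {word for doc in docs for word in doc.split()}
    class_docs = Counter(labels)
    word_counts = {label: Counter() for label in class_docs}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())
    return vocab, class_docs, word_counts


def log_scores(doc, model):
    """Log prior plus one independent log-likelihood term per known word."""
    vocab, class_docs, word_counts = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in doc.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so unseen word/category pairs are not zero.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return scores


def classify(doc, model):
    scores = log_scores(doc, model)
    return max(scores, key=scores.get)


# Hypothetical coded examples.
train_docs = [
    "merger pricing agreement",
    "pricing discussion with competitor",
    "lunch menu friday",
    "office party friday lunch",
]
train_labels = ["responsive", "responsive", "non-responsive", "non-responsive"]

model = train_naive_bayes(train_docs, train_labels)
print(classify("competitor pricing call", model))
print(classify("friday lunch", model))
```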

16 Technologies: SVM
Support Vector Machine
Process for making binary decisions
Documents are mapped based on word counts, expressed as a percentage of the words in the document
As the user identifies responsive and non-responsive examples, a dividing line is determined
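
The sketch below follows the description above: each document is mapped to word frequencies expressed as a fraction of its total words, and a dividing line is learned from responsive (+1) and non-responsive (-1) examples. It trains a linear SVM by subgradient descent on the regularized hinge loss, which is one simple way to do it and not necessarily what any TAR product uses. The vocabulary, documents, hyperparameters, and function names are hypothetical.

```python
def term_fractions(doc, vocab):
    """Each vocabulary word's count as a fraction of all words in the document."""
    words = doc.split()
    if not words:
        return [0.0] * len(vocab)
    return [words.count(term) / len(words) for term in vocab]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss (labels +1 / -1)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # An example is "violated" if it falls inside the margin
            # or on the wrong side of the dividing line.
            violated = yi * (dot(w, xi) + b) < 1
            # Regularization step: shrink the weights slightly.
            w = [wj * (1 - lr * lam) for wj in w]
            if violated:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(doc, vocab, w, b):
    """+1 (responsive) or -1 (non-responsive), by side of the dividing line."""
    return 1 if dot(w, term_fractions(doc, vocab)) + b >= 0 else -1


# Hypothetical vocabulary and coded examples.
vocab = ["pricing", "competitor", "lunch", "party"]
train_docs = [
    "pricing competitor pricing meeting",
    "competitor pricing update",
    "lunch party friday",
    "party lunch lunch menu",
]
train_labels = [1, 1, -1, -1]

X = [term_fractions(doc, vocab) for doc in train_docs]
w, b = train_linear_svm(X, train_labels)

print(predict("pricing pricing competitor call", vocab, w, b))
print(predict("lunch party menu", vocab, w, b))
```

As reviewers code more examples, retraining moves the dividing line, which is why the ranking of unreviewed documents changes over the course of a review.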

17 Other Lexical Techniques
Rely on linguists and dictionaries
Linguists serve as experts and work with attorneys
Deconstruct language into parts of speech
Determine classification rules for responsiveness and non-responsiveness based on key words
May or may not involve machine learning
