The ROI of Analytics Scott W. Lombard, SME Senior Vice President – Litigation Management Rust Consulting Inc.



2 The ROI of Analytics Scott W. Lombard, SME Senior Vice President – Litigation Management Rust Consulting Inc

3 Premise Analytics is the strategic application of technology to a set of data, guided by legal and case-specific knowledge, with the intention of turning raw data into information: clustering documents to reduce volume and identify accurate context, surfacing potential proof of claims, and building confidence that you have produced everything about the data you reasonably can, so that accurate productions are created effectively and efficiently.

4 Data

5 A Little More Data

6 More Data


8 It’s Home

9 Key Issues In 2011 alone, 1.8 zettabytes (1.8 trillion gigabytes) of data will be created, the equivalent of every U.S. citizen writing 3 tweets per minute for 26,976 years. (IDC) An estimated 294B emails are sent per day (90 trillion per year). (Radicati Group) 188B text messages are sent per month (6.3B per day, or 2.3 trillion per year). (CNN)

10 Where Is The Data? 52% of all data is stored on hard disk drives 28% is stored in optical storage 11% is stored on digital tapes 9% is stored in other forms

11 Defining "Analytics" Technology-assisted review, computer-assisted review, predictive coding, predictive prioritization, text analytics, etc.: industry buzzwords coined to describe applications of a mathematical modeling technique by which documents are mapped into a concept space according to their textual content. Within a concept space, similarity between documents can be measured by their proximity to one another.

12 The concept space: The Backbone of analytics

13 Fundamental observation: Documents in close proximity are similar in content


15 Intelligent Review Using a seed set of documents coded by humans, the algorithm projects coding decisions across the entire document collection Segregate documents for priority review Elevate potentially responsive documents Reduce priority of likely non-responsive documents Identify potentially privilege set for review Reduce the universe of documents Remove “noise” and/or privilege documents Statistically evaluate confidence level and margin of error to verify results.
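The last bullet, statistical verification of results, is typically done with a sample-size calculation such as Cochran's formula with a finite-population correction. A minimal sketch, using illustrative parameters (95% confidence, a 2% margin of error) that are not from the deck:

```python
import math

def sample_size(population: int, z: float = 1.96,
                margin: float = 0.02, p: float = 0.5) -> int:
    """Number of documents to sample so the observed accuracy rate
    carries the given margin of error at the confidence level implied
    by z (Cochran's formula with finite-population correction)."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))

# 95% confidence (z = 1.96), +/-2% margin, 1.9M-document collection
print(sample_size(1_900_000))  # 2398
```

Note that the required sample barely grows with collection size, which is why sampling-based verification stays cheap even on very large matters.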

16 [Diagram: a coded seed set feeds intelligent review, separating JUNK from RESP (responsive) documents]

17 [Diagram: JUNK vs. RESP document clusters]


23 Clustering: group documents based on conceptual patterns throughout the document set (e.g., clusters labeled Non-Responsive, Hot, Background, Privilege)

24 Concept searching: find similar documents within a user-defined similarity threshold


26 Other Applications Assisted keyword generation: submit a keyword or list of keywords, and the algorithm will suggest additional terms appearing in similar contexts across the document universe. Near-duplicate detection: locate and remove duplicate documents based on content rather than hash value.

27 Plaintiffs

28 Added Value More efficient review Remove non-responsive documents Promote potentially responsive documents Able to evaluate ROI Discover more, sooner Quickly locate hot documents through concept searching and clustering Form case strategy early on Difficult to place $ value

29 Techniques Used Clustering Concept searching Intelligent review Assisted keyword generation Statistical sampling to define confidence interval and margin of error

30 Cost Typically a one-time fee based on GB volume. E.g., for 100 GB of ESI at $125/GB, enabling analytics would cost $12,500. Hourly fees for an analytics expert ($125-$350/hour) are less common for plaintiffs: defendants aren't concerned with the statistical defensibility of plaintiffs' review, though an expert may still be useful in verifying results.

31 Measuring ROI Ask: how does assisted review compare with human review? To measure this, we need to know: C_hr = cost of human review ($/document) C_ar = cost of assisted review ($/GB) V = volume of data (GB) N_ar = number of documents coded using analytics

32 Case Study Large antitrust matter. Review set of 1.9 million documents. Client used a linear review workflow for the first 14 months, with an estimated 36 more months to complete review. Implemented analytics in Relativity to accelerate the review process.

33 Cost In January 2013, 46,730 documents were reviewed. Manual review: roughly 1,235 coding hours at $25 per hour put the monthly review cost at $30,887. Cost per document = $30,887 / 46,730 = $0.66. Analytics: priced at $125/GB for a volume of 571 GB.

34 Minimum to Add Value Analytics will have a superior ROI over human review if it can be used to code at least 108,144 documents.
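The 108,144-document threshold can be reproduced directly from the case-study numbers. A minimal sketch (the function name and structure are mine, not the deck's):

```python
def breakeven_documents(analytics_per_gb: float, volume_gb: float,
                        human_per_doc: float) -> float:
    """Minimum number of documents analytics must code for its flat
    per-GB fee to beat per-document human review, i.e. the N_ar where
    analytics_per_gb * volume_gb = human_per_doc * N_ar."""
    return analytics_per_gb * volume_gb / human_per_doc

# Case-study inputs: $125/GB analytics fee, 571 GB, $0.66/document review
n_min = breakeven_documents(125, 571, 0.66)
print(round(n_min))  # 108144
```

Any document the engine codes beyond that count is pure savings relative to manual review.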

35 Minimum To Add Value

36 [Chart: 5.5%]

37 Results In first 30 days: Through concept searching and clustering, 250,000 non-responsive files removed from the review set, over twice the minimum required to add value to the case This set of non-responsive documents could not be otherwise identified using date, file type, NIST or other metadata filtration.

38 Results: 30 Days

39 Savings: 30 days Human review cost: 250,000 x $0.66 = $165,000 Analytics-assisted review cost: 571 GB x $125/GB = $71,375 $93,625 SAVINGS
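A quick recomputation of the 30-day figures, as a sketch with variable names of my own:

```python
docs_removed     = 250_000  # non-responsive files culled in the first 30 days
human_per_doc    = 0.66     # $/document, from the January 2013 baseline
volume_gb        = 571      # collection size
analytics_per_gb = 125      # one-time analytics fee

human_cost     = docs_removed * human_per_doc  # cost to review them manually
analytics_cost = volume_gb * analytics_per_gb  # flat analytics fee
savings        = human_cost - analytics_cost
print(round(human_cost), round(analytics_cost), round(savings))
# 165000 71375 93625
```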

40 [Chart: 12.7%, $93,625 savings]

41 Additional Benefits Intelligent review identified an urgent review set of roughly 50,000 files which, after human review, proved to contain 10x more critical documents than unassisted sampling. With 30% of the review complete and depositions fast approaching, clustering and concept searching played a critical role in identifying key communications across the non-reviewed set, doubling the number of documents brought to the first two depositions.

42 Ongoing Analytics has been seamlessly integrated into the linear review workflow. As reviewers locate examples of responsive, non-responsive, and privileged documents, they tag them as potential examples. An analytics expert then submits these examples as "seeds," used by the engine to locate conceptually similar documents across the universe. Likely responsive documents are elevated in priority: when end-reviewers log in, they see these documents first in their list, without additional work. Likely non-responsive documents are demoted in priority or removed from review.

43 Conclusion Analytics has: significantly and measurably accelerated the review; outperformed the ROI of full human review; improved the productivity of human review; and assisted attorneys in developing a more informed case strategy through a better upfront understanding of their discovery universe.

44 Defense

45 Added Value Pre-Process Remove non-responsive and privilege documents before incurring processing and hosting fees Evaluate case merits/weaknesses early on Post-Process (review database) Prioritize review Privilege candidates Responsive candidates Non-responsive candidates

46 Pre-Process vs. Post-Process

47 Cost Pre-Process Per GB “in” or “out” In: the total volume of data before it is reduced by filtration and analytical culling. Lower per-GB fee Larger volume Out: the total volume of data exiting the ECA platform, and entering the hosted review platform. Higher per-GB fee Lower volume Post-Process Same as plaintiffs

48 GB’s “In” vs. “Out”

49 Measuring ROI V = volume of initial collection (GB) C_a = cost of analytics ($/GB) C_p = cost to process ($/GB) C_h = cost to host ($/GB/month) R_a = proportion of documents deemed potentially responsive after analytic filtration (%) T = duration of hosted review (months)

50 Case Study Large IP litigation. Initial collection totaled 3.5 TB: PSTs, desktop PCs, and other media. NIST filtration and de-duplication removed 34% of the collection, leaving 2.31 TB of potentially responsive data. Analytics was implemented after de-NIST/de-duplication but prior to processing. The estimate for the linear privilege/responsiveness review phase was 12 months.

51 Techniques Used Clustering Concept searching Assisted keyword generation Near duplicate removal Statistical sampling to define confidence interval and margin of error

52 Cost Volume of data set: V = 2.3 TB Analytics cost: C_a = $125/GB Processing cost: C_p = $250/GB Hosting cost: C_h = $25/GB/month Proportion of responsive (not filtered) data: R_a = unknown Duration of review: T = 12 months

53 Minimum to Add Value C_a + R_a·C_p + R_a·C_h·T < C_p + C_h·T, which gives R_a < 0.77. Analytics will have a superior ROI if it can be used to cull more than 23% of the initial data set.
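The 0.77 threshold follows from rearranging the slide's inequality for R_a. A sketch (function and parameter names are mine):

```python
def breakeven_responsive_rate(c_a: float, c_p: float,
                              c_h: float, t: float) -> float:
    """Largest responsive proportion R_a at which analytics still wins:
    c_a + R_a*c_p + R_a*c_h*t < c_p + c_h*t
    =>  R_a < (c_p + c_h*t - c_a) / (c_p + c_h*t)."""
    downstream = c_p + c_h * t  # per-GB cost of processing plus hosting
    return (downstream - c_a) / downstream

# $125/GB analytics, $250/GB processing, $25/GB/month hosting, 12 months
r_max = breakeven_responsive_rate(c_a=125, c_p=250, c_h=25, t=12)
print(round(r_max, 2))  # 0.77
```

Intuitively, the cheaper the analytics fee is relative to downstream processing and hosting, the less culling it needs to achieve to pay for itself.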

54 [Chart: 23%]

55 Results Using analytics, we were able to reduce the data set to 750 GB of potentially responsive data. Clustering and concept searching for non-responsive (junk email) and privileged documents proved most effective. R_a, the portion of potentially responsive documents after analytics, was 750/2,300 = 32%, meaning 68% of the data was removed, over twice the 23% minimum needed to add value. Statistical sampling was used to verify the accuracy of each cull criterion before data was removed.

56 Savings Cost of standard model: $250(2,300) + $25(12)(2,300) = $1,265,500 Cost of analytics-assisted model: $125(2,300) + $550(0.32)(2,300) = $692,300 Savings: $573,200
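Recomputing the two models from the slide's inputs (variable names are mine; the recomputed standard-model total lands about $500 under the slide's printed figure, so treat the totals as approximate):

```python
v   = 2_300  # GB in the post-deduplication collection
c_a = 125    # analytics, $/GB
c_p = 250    # processing, $/GB
c_h = 25     # hosting, $/GB/month
t   = 12     # months of hosted review
r_a = 0.32   # proportion of data kept after analytic culling

standard = c_p * v + c_h * t * v                # process and host everything
assisted = c_a * v + (c_p + c_h * t) * r_a * v  # analytics fee + 32% downstream
print(round(standard), round(assisted), round(standard - assisted))
```

Either way, the savings come to roughly $573K, dominated by the processing and hosting fees avoided on the 68% of data that was culled.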

57 [Chart: 68%, $573,200 savings]

58 Additional Benefits Using analytics early on, lead attorneys were able to gain in-depth knowledge of their data set: key "good" and "bad" documents were located, and additional custodians and email addresses were identified. Results were used to define case strategy early on. Promoted cooperation between parties through transparent reporting, a defensible, auditable workflow, and statistically measured accuracy.

59 Is It Right For My Case? Good candidates: emails, PCs, removable media, network shares, high-quality scans. Poor candidates: poor-quality scans, highly repetitive content (timecards), primarily numeric content (financial documents).

60 Recommendations Work with people you can trust; it's easy to get burned. Perform a formal evaluation; a service provider may provide this free of charge. Stay educated: keep up to date on court decisions, read white papers, schedule a demo. Use commercial, off-the-shelf software: what is "under the hood" can vary widely.

61 Thank You! Scott W. Lombard, SME Rust Consulting Inc Senior Vice President – Litigation Management (Office) (Cell)

