Extreme Content Management at LexisNexis. Alfresco Summit 2013. Presenter: Flavio Villanustre, LexisNexis Risk Solutions, Reed Elsevier. November 13th, 2013.

1 Extreme Content Management at LexisNexis
Alfresco Summit 2013
Presenter: Flavio Villanustre, LexisNexis Risk Solutions, Reed Elsevier
November 13th, 2013, Boston, USA

2 Content Management: the traditional view
“The set of processes and technologies that support the collection, managing, and publishing of information in any form or medium” – Wikipedia
More generally: storage, processing, retrieval and disposal of digital content such as text documents, multimedia files, etc., where usually certain types of workflows in the document lifecycle are involved.
As long as volume, content acquisition speed, classification complexity and retrieval process are kept within reasonable limits, we have a good solution…

3 Evolution and new requirements: a wider and semantic World
The world’s information is doubling every two years
We broke the zettabyte barrier already
Sifting through huge volumes of data efficiently creates new challenges
And big stores can be hard to manage and slow
The traditional TF/IDF approach no longer works
Source: IDC/EMC
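
For readers unfamiliar with the TF/IDF approach named on this slide, a minimal Python sketch follows. The function name `tf_idf_scores` and the sample documents are hypothetical and not from the talk; the scoring uses the textbook "term frequency times log inverse document frequency" form, which is only one of several common variants.

```python
import math
from collections import Counter


def tf_idf_scores(query, docs):
    """Score each document against the query with plain TF-IDF.

    Hypothetical helper for illustration only; not from the talk.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term in tf:
                idf = math.log(n_docs / df[term])
                score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores


# Made-up sample corpus.
docs = [
    "insurance claim fraud report",
    "insurance policy renewal",
    "fraud detection in insurance claim data",
]
scores = tf_idf_scores("fraud claim", docs)
```

The ranking here is purely lexical: a term that appears in every document (such as "insurance" in this sample) contributes nothing, and neither meaning nor user behavior enters the score, which is arguably the limitation the slide points at.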

4 Content Management and Big Data Technologies: a dichotomy
Enterprise Content Management: evolved from the necessity to have appropriate control over internal enterprise digital content
Big Data technologies: born from the necessity of integrating large amounts of diverse data, prioritizing functionality over discipline
ECM provides control, search, traceability, workflows, user interfaces
Big Data brings large-scale data integration, semantic analysis, recommendations, contextual search

5 Let’s dream for a moment…
What if your content management system were able to recommend content based on your interests, recent activity or your affinity with related people?
How about ingesting petabytes of data and still being able to integrate it effectively, by resolving and disambiguating entities and creating relationships that can enhance retrieval capabilities?
What if you could express your queries in natural language?
What if the system could present interesting content to you based on your behavior?

6 What we do at LexisNexis Risk Solutions
Data and analytics-based solutions (part of Reed Elsevier, together with LexisNexis, Elsevier, Reed Business Information and Reed Exhibitions)
$1.4B in revenue
Insurance, Financial Services, Law Enforcement, Healthcare, Retail, etc.
Usually hundreds of billions of records and trillions of individual attributes on petabytes of data
Search and retrieval, massive graphs, analytics and statistical modeling
Designed our Big Data system in the late 90’s
Distributed platform for data processing and real-time delivery, with a declarative dataflow programming paradigm (ECL)
Released the HPCC Systems platform and ECL as an Open Source project in 2011

7 ECM in LexisNexis Risk Solutions
Our culture:
  A DIY culture, with a large technology group (850+ people)
  But we understand our core competencies and don’t want to reinvent the wheel
  We love open source, not because it’s cheap but because it’s free (as in freedom of speech, not free beer) and we can fix and enhance it ourselves
Enterprise Content Management:
  Supporting human-based, document-oriented workflows for certain products and enterprise services
  Usually deeply customized with integration and automation (integrated document routing and approvals, automated escalation process, interoperable with our specific products, etc.)
  Originally based on EMC Documentum, migrated to Alfresco several years ago with overhauled functionality

8 A look at the LexisNexis HPCC Systems Big Data Platform

9 Alfresco + HPCC: an integrated solution to semantic information management
Alfresco and the HPCC Systems platform have similar business models: both are Open Source (LGPL and Apache, respectively) and offer commercial licenses with support, maintenance, etc.
Alfresco is a top-notch and flexible Content Management System
HPCC is a robust and proven Big Data platform, currently in use in other semantic stores (for example, the recommendation system for Elsevier’s Science Direct)
We have experience in both…

10 Implementation
[Diagram; surviving labels: Thor, co-download matrix, similarity, attribute ranking, billions of events, millions of documents, Roxie]
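
The diagram labels suggest that a co-download matrix and a similarity measure feed the ranking. The Python sketch below shows one way such a matrix could be built from download events and turned into a ranked list of similar documents. It is an illustration under assumptions, not the production implementation, which runs on Thor and Roxie: the event format, the function names and the choice of cosine similarity are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def co_download_matrix(events):
    """Count, for each pair of documents, how many users downloaded both.

    `events` is an iterable of (user, document) pairs (assumed format).
    """
    docs_by_user = defaultdict(set)
    for user, doc in events:
        docs_by_user[user].add(doc)
    downloads = defaultdict(int)   # doc -> number of distinct users
    co_counts = defaultdict(int)   # (doc_a, doc_b) -> users who got both
    for docs in docs_by_user.values():
        for doc in docs:
            downloads[doc] += 1
        for a, b in combinations(sorted(docs), 2):
            co_counts[(a, b)] += 1
    return dict(downloads), dict(co_counts)


def similarity(a, b, downloads, co_counts):
    """Cosine similarity between two documents' binary user vectors."""
    key = (a, b) if a < b else (b, a)
    return co_counts.get(key, 0) / math.sqrt(downloads[a] * downloads[b])


def rank_similar(doc, downloads, co_counts):
    """Other documents, most similar first (ties broken by name)."""
    scored = [
        (other, similarity(doc, other, downloads, co_counts))
        for other in downloads
        if other != doc
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up events: (user, document) pairs.
events = [
    ("ann", "d1"), ("ann", "d2"),
    ("bob", "d1"), ("bob", "d2"), ("bob", "d3"),
    ("cat", "d2"), ("cat", "d3"),
    ("dan", "d4"),
]
downloads, co_counts = co_download_matrix(events)
ranked = rank_similar("d1", downloads, co_counts)
```

In this sample, `d2` ranks first for `d1` because both users who downloaded `d1` also downloaded `d2`. At the scale named on the slide (billions of events, millions of documents) the pair counting would have to be distributed, which is presumably the role Thor plays.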

11 Recommendation generation process
Export data and metadata from the Alfresco document store
Export usage logs from Alfresco
Export user information from Alfresco
Extract feature vectors from the document data and metadata in Thor
Analyze behavioral patterns and similarities across users
Create distance vectors for users and documents and generate the actual recommendations
Provide real-time ranked recommendations from Roxie, as users search and browse content
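
The steps above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the Thor/Roxie implementation. The data shapes (a `tags` list per document, a usage log of `(user, document)` pairs), the function names and the use of cosine distance are all hypothetical, and only the content-based part of the process is shown; the cross-user behavioral analysis is left out.

```python
import math
from collections import Counter


def feature_vector(metadata):
    """Turn a document's tags into a sparse term-count vector."""
    return Counter(metadata["tags"])


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 when either vector is empty."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def user_profile(user, usage_log, vectors):
    """Sum the feature vectors of every document the user accessed."""
    profile = Counter()
    for log_user, doc_id in usage_log:
        if log_user == user:
            profile.update(vectors[doc_id])
    return profile


def recommend(user, usage_log, documents, top_n=3):
    """Rank unseen documents by distance to the user's profile."""
    vectors = {doc_id: feature_vector(meta) for doc_id, meta in documents.items()}
    seen = {doc_id for log_user, doc_id in usage_log if log_user == user}
    profile = user_profile(user, usage_log, vectors)
    candidates = [
        (doc_id, cosine_distance(profile, vec))
        for doc_id, vec in vectors.items()
        if doc_id not in seen
    ]
    candidates.sort(key=lambda pair: (pair[1], pair[0]))
    return [doc_id for doc_id, _ in candidates[:top_n]]


# Made-up stand-ins for the exported Alfresco metadata and usage logs.
documents = {
    "d1": {"tags": ["fraud", "insurance"]},
    "d2": {"tags": ["fraud", "claims"]},
    "d3": {"tags": ["healthcare", "billing"]},
    "d4": {"tags": ["insurance", "claims"]},
}
usage_log = [("ann", "d1"), ("bob", "d3")]
recommendations = recommend("ann", usage_log, documents)
```

Splitting the work into an offline part (feature vectors and profiles) and a cheap lookup-and-rank part mirrors the batch and real-time division the slide attributes to Thor and Roxie, although here both parts run in a single process.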

12 Business Impact in LexisNexis Risk Solutions
Significant human efficiency gains, moving from multiple cycles of “search, wait and pray” to content proactively pushed and much smarter search abilities
Specific functionality is now better customized to particular groups (content and behavior driven)
Scalability to much larger content repositories without increased retrieval latencies
Streamlined the data ingest process, since diverse sources are all managed and integrated through HPCC
Proper handling of near real time data updates and streaming when needed
Ability to tap into a large library of Natural Language Processing and Machine Learning algorithms on HPCC
Additional visualization and Exploratory Data Analysis capabilities when appropriate

13 Future enhancements
Contextually Relevant Content (semantic search and browse)
Minimal Searching
Minimal Next Steps
Enable all devices
Knows me, my team, my market, my interests
Improved Usability
Increased Effectiveness
Better Alignment
Mobile and Social

14 Questions?
Email: Flavio.Villanustre@LexisNexis.com
http://hpccsystems.com
http://www.lexisnexis.com/risk
Thank you!

