Capture and Storage of Tabular Data Leveraging Ephesoft and Alfresco W. Gary Cox Senior Consultant Blue Fish Development Group.


1 Capture and Storage of Tabular Data Leveraging Ephesoft and Alfresco W. Gary Cox Senior Consultant Blue Fish Development Group

2 Introduction
- This talk will discuss a way to manage tabular data in Alfresco
- We are going to use the Ephesoft scanning solution as the source of our data
- Our use case will be a scanned invoice, but this could apply to many types of tabular data

3 About Blue Fish!
- Alfresco Platinum Partner
- Focused on ECM for 15 years
  - Document Management Solutions
  - Web Content Management Solutions
  - Scanning and Indexing Systems
  - Custom Application Development
  - Content Migrations
- Global 2000 and Fast Growing Mid-Market Companies
- We Believe Companies Deserve to Have it Done Right the First Time
- Unique Business Model
  - Fixed Price Projects
  - Reusable IP Library
  - The Blue Fish Guarantee

4 What is Tabular Data?
- A set of related data fields
- Typically grouped into columns and rows
- Many examples out there, but today we will use an itemized invoice as our example

5 Our Challenge: Take Invoice Data

6 Our Challenge: Import to Alfresco

7 1 Minute Tour of Ephesoft!
- Ephesoft features “Smart Capture™”
- Provides a way to automatically classify and separate documents based on content
- Can extract data to fields using OCR
- Can be fully automated, or documents can be reviewed and validated by human users when needed

8 1 Minute Tour of Ephesoft!
- Data can be exported in a variety of ways; we will discuss CMIS Export
- Uses a workflow engine for managing the state of each batch being processed
- Data about the batch is maintained in XML
- Important for us: each module in the workflow has a plugin for custom Java code!

9 1 Minute Tour of Ephesoft!
- Simple fields can be automatically extracted using OCR data and rules for each field
- They can then be validated, if needed, by a human user

10 1 Minute Tour of Ephesoft!
- Table data can also be extracted, by rules, from the OCR layer
- Again, a human user can validate the data if required

11 Ways to Model Table Data - 1
What are some ways we might model the data?
- “Brute Force” approach: create individual properties for every possible field in the table
- Correlated repeating attributes: use Alfresco’s support for multiple values on properties and correlate them by index position
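As a rough illustration of the correlated repeating attributes approach, the sketch below keeps one multi-valued property per table column and rebuilds rows by index position. It is written in Python purely for brevity; the property names (`inv:qty` and so on) and the helper `rows_from_columns` are hypothetical and do not come from the talk.

```python
# Hypothetical multi-valued properties: one list per table column.
# Row i of the table is the i-th entry of every list.
properties = {
    "inv:qty": ["1", "5"],
    "inv:item": ["A123", "S555"],
    "inv:lineTotal": ["$250.00", "$75.00"],
}


def rows_from_columns(props):
    """Rebuild table rows by correlating column values on index position."""
    lengths = {len(values) for values in props.values()}
    if len(lengths) > 1:
        # A missing value in any column would silently shift later rows.
        raise ValueError("columns have different lengths; rows cannot be correlated")
    names = list(props)
    return [dict(zip(names, row)) for row in zip(*props.values())]


rows = rows_from_columns(properties)
```

The length check hints at the main fragility of this model: every column has to stay aligned, so an empty cell needs an explicit placeholder value rather than being omitted.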

12 Ways to Model Table Data - 2
- Hierarchical data: parent-child relationships
- Custom data structure using JSON or XML, using jqGrid and jQuery in Share for generating the tabular control

13 Our Choice: Structured JSON Data
Pros
- “Extending” existing datatypes
- Greatly reduces server-side customizations
- Removes issues of managing transactions for hierarchical objects
- Can leverage the out-of-the-box features of Alfresco content objects
  - Lifecycle
  - Node Operations
Cons
- Custom API required for extracting data
- Some limitations to search functionality
- Does require custom Share form controls
- Size limits on text fields in the RDBMS (varies by DB)

14 Example d:text Alfresco Property
[
  {
    "qty": "1",
    "item": "A123",
    "description": "Supreme Office Chair",
    "unitPrice": "$250.00",
    "discount": "$0.00",
    "lineTotal": "$250.00"
  },
  {
    "qty": "5",
    "item": "S555",
    "description": "Red Staplers",
    "unitPrice": "$15.00",
    "discount": "$0.00",
    "lineTotal": "$75.00"
  }
]
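To show what working with such a property value might look like, here is a minimal Python sketch that parses the JSON string from the slide and sums the line totals. The helper `parse_money` and the variable names are assumptions made for illustration; the talk does not describe how amounts are parsed.

```python
import json
from decimal import Decimal

# The d:text property value from the slide, held as a single JSON string.
invoice_lines_json = """
[
  {"qty": "1", "item": "A123", "description": "Supreme Office Chair",
   "unitPrice": "$250.00", "discount": "$0.00", "lineTotal": "$250.00"},
  {"qty": "5", "item": "S555", "description": "Red Staplers",
   "unitPrice": "$15.00", "discount": "$0.00", "lineTotal": "$75.00"}
]
"""


def parse_money(text):
    """Convert a string such as "$250.00" to a Decimal amount (hypothetical helper)."""
    return Decimal(text.replace("$", "").replace(",", ""))


lines = json.loads(invoice_lines_json)
invoice_total = sum(parse_money(line["lineTotal"]) for line in lines)
```

Because every value is stored as a string, any arithmetic or sorting has to convert the values first, which is one practical consequence of keeping the table in a text property.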

15 Our Use Case is Easy, Right???
1. Scan the document
2. Import the document into Ephesoft
3. Automatically classify document to a type
4. Extract fields automatically based on document type
5. Validate data
6. Before export, convert table data to JSON in ScriptExport hook
7. CMIS Export to map Ephesoft fields to Alfresco properties and export document
8. Convert raw JSON to a table for viewing
9. Controller for reading and writing of data

16 Not as Bad as It Sounds
- Ephesoft already provides most of what we need
- Just requires custom logic in their “ScriptExport” Java hook to convert their XML table data to our JSON structure
- The JSON string is mapped to a hidden simple field in Ephesoft, which is in turn mapped to an Alfresco property
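The XML-to-JSON conversion described on this slide can be sketched roughly as follows. This is not the ScriptExport hook itself, which is Java; it is a language-neutral illustration in Python. The XML layout (`Table`, `Row`, `Cell` with a `name` attribute) is a made-up stand-in, and Ephesoft's real batch XML schema may look quite different.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical table fragment; the element and attribute names are assumptions.
SAMPLE_TABLE_XML = """
<Table>
  <Row>
    <Cell name="qty">1</Cell>
    <Cell name="item">A123</Cell>
  </Row>
  <Row>
    <Cell name="qty">5</Cell>
    <Cell name="item">S555</Cell>
  </Row>
</Table>
"""


def table_xml_to_json(xml_text):
    """Turn each <Row> into a JSON object keyed by the cell's column name."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for row in root.iter("Row"):
        # Empty cells have text None, so fall back to an empty string.
        rows.append(
            {cell.get("name"): (cell.text or "").strip() for cell in row.iter("Cell")}
        )
    return json.dumps(rows)


table_json = table_xml_to_json(SAMPLE_TABLE_XML)
```

The resulting string is what would be placed in the hidden simple field, so that the export step only has to map one ordinary text value to the Alfresco property.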

17 Not as Bad as It Sounds
- In Alfresco, the JSON string is persisted in a d:text property
- Just need:
  - A controller for reading/writing JSON
  - A view for users to read and edit table data
- jqGrid and jQuery make this easy
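The read/write logic such a controller needs can be sketched in a few lines. The Python below is only an illustration of that logic, not Alfresco's API: the function names, the column whitelist, and the choice to store every value as a string are assumptions based on the sample data shown earlier in the talk.

```python
import json

# Column names taken from the sample invoice table; treated here as a whitelist.
ALLOWED_COLUMNS = {"qty", "item", "description", "unitPrice", "discount", "lineTotal"}


def read_table(property_value):
    """Return the table rows stored in the JSON property (empty list if unset)."""
    if not property_value:
        return []
    return json.loads(property_value)


def write_cell(property_value, row_index, column, new_value):
    """Return a new JSON string with one cell changed; the input string is untouched."""
    if column not in ALLOWED_COLUMNS:
        raise KeyError(f"unknown column: {column}")
    rows = read_table(property_value)
    if not 0 <= row_index < len(rows):
        raise IndexError(f"no row at index {row_index}")
    rows[row_index][column] = str(new_value)  # values are kept as strings
    return json.dumps(rows)


stored = '[{"qty": "5", "item": "S555", "lineTotal": "$75.00"}]'
updated = write_cell(stored, 0, "qty", 6)
```

Since the whole table lives in one property, each edit rewrites the full JSON string, which keeps the update a single property write instead of a multi-node transaction.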

18 Table Architecture in Alfresco

19 The Data In Alfresco
- Standard FTL controls in Alfresco Share are used for the ‘simple’ data fields
- The “d:text” property for the invoice data is backed by a custom FTL that reads an XML schema and calls the custom JavaScript controller

20 Wrapping Up
- This custom solution allows for exporting tables from Ephesoft to Alfresco
- It also provides a generic way to manage tabular data in Alfresco
- Tables can be viewed and edited by users
- It utilizes standard extensions in both Ephesoft and Alfresco

21 Conclusions
- Integration hooks in Ephesoft and Alfresco make this customization straightforward
- Extend functionality and use existing libraries instead of starting from scratch
- Ephesoft and Alfresco can be paired to provide a robust solution for managing document scans

22 Resources
- An in-depth technical discussion of tabular data can be found here: http://devcon.alfresco.com/sanjose/sessions/smooth-butter-representing-tabular-data-alfresco-share
- Contact us! http://www.bluefishgroup.com gcox@bluefishgroup.com info@bluefishgroup.com

