Capture and Storage of Tabular Data Leveraging Ephesoft and Alfresco W. Gary Cox Senior Consultant Blue Fish Development Group.

Introduction
- This talk will discuss a way to manage tabular data in Alfresco
- We are going to use the Ephesoft scanning solution as the source of our data
- Our use case will be a scanned invoice, but this could apply to many types of tabular data

About Blue Fish!
- Alfresco Platinum Partner
- Focused on ECM for 15 years
- Document Management Solutions
- Web Content Management Solutions
- Scanning and Indexing Systems
- Custom Application Development
- Content Migrations
- Global 2000 and Fast Growing Mid-Market Companies
- We Believe Companies Deserve to Have it Done Right the First Time
- Unique Business Model
- Fixed Price Projects
- Reusable IP Library
- The Blue Fish Guarantee

What is Tabular Data?
- A set of related data fields
- Typically grouped into columns and rows
- Many examples exist, but today we will use an itemized invoice as our example

Our Challenge: Take Invoice Data

Our Challenge: Import to Alfresco

1 Minute Tour of Ephesoft!
- Ephesoft features “Smart Capture™”
- Provides a way to automatically classify and separate documents based on content
- Can extract data to fields using OCR
- Can be fully automated, or documents can be reviewed and validated by human users when needed

1 Minute Tour of Ephesoft!
- Data can be exported in a variety of ways; we will discuss CMIS Export
- Uses a workflow engine for managing the state of each batch being processed
- Data about the batch is maintained in XML
- Important for us: each module in the workflow has a plugin for custom Java code!

1 Minute Tour of Ephesoft!
- Simple fields can be automatically extracted using OCR data and rules for each field
- They can then be validated, if needed, by a human user

1 Minute Tour of Ephesoft!
- Table data can also be extracted, by rules, from the OCR layer
- Again, a human user can validate the data if required

Ways to Model Table Data - 1
What are some ways we might model the data?
- “Brute Force” Approach: create individual properties for every possible field in the table
- Correlated Repeating Attributes: use Alfresco’s support for multiple values on properties and correlate them by index position
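The correlated repeating attributes option can be sketched as follows. This is a minimal illustration in Python rather than Alfresco code, and the property names (`inv:qty`, `inv:item`, `inv:lineTotal`) are hypothetical. Each multi-valued property holds one column, and rows are rebuilt by index position.

```python
# Hypothetical multi-valued properties, one list per table column.
properties = {
    "inv:qty": ["1", "5"],
    "inv:item": ["A123", "S555"],
    "inv:lineTotal": ["$250.00", "$75.00"],
}


def rows_from_columns(props):
    """Rebuild table rows by correlating column values on index position."""
    lengths = {len(values) for values in props.values()}
    if len(lengths) > 1:
        # Columns of unequal length can no longer be correlated reliably.
        raise ValueError("column lengths differ; rows cannot be correlated")
    names = list(props)
    return [dict(zip(names, row)) for row in zip(*props.values())]


rows = rows_from_columns(properties)
```

The length check hints at the main weakness of this model: if any one column gains or loses a value, every row after that position is misaligned.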

Ways to Model Table Data - 2
- Hierarchical data: parent-child relationships
- Custom data structure using JSON or XML, using jqGrid and jQuery in Share for generating the tabular control

Our Choice: Structured JSON Data
Pros
- “Extending” existing datatypes
- Greatly reduces server side customizations
- Removes issues of managing transactions for hierarchical objects
- Can leverage the out of the box features of Alfresco content objects
  - Lifecycle
  - Node Operations
Cons
- Custom API required for extracting data
- Some limitations to search functionality
- Does require custom Share form controls
- Size limits on text fields in the RDBMS (varies by DB)

Example d:text Alfresco Property

    [
      {
        "qty": "1",
        "item": "A123",
        "description": "Supreme Office Chair",
        "unitPrice": "$250.00",
        "discount": "$0.00",
        "lineTotal": "$250.00"
      },
      {
        "qty": "5",
        "item": "S555",
        "description": "Red Staplers",
        "unitPrice": "$15.00",
        "discount": "$0.00",
        "lineTotal": "$75.00"
      }
    ]
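Because the whole table is stored as one string, any consumer has to parse it before using the values. The sketch below is a Python illustration (not part of the talk) that loads the example property above and totals the line items. The helper `parse_money` is a hypothetical name, and it assumes amounts are formatted as in the example.

```python
import json
from decimal import Decimal

# The example d:text property value, held as a single JSON string.
line_items_json = """
[
  {"qty": "1", "item": "A123", "description": "Supreme Office Chair",
   "unitPrice": "$250.00", "discount": "$0.00", "lineTotal": "$250.00"},
  {"qty": "5", "item": "S555", "description": "Red Staplers",
   "unitPrice": "$15.00", "discount": "$0.00", "lineTotal": "$75.00"}
]
"""


def parse_money(text):
    """Convert a string such as "$250.00" to a Decimal (assumed format)."""
    return Decimal(text.replace("$", "").replace(",", ""))


line_items = json.loads(line_items_json)
invoice_total = sum(
    (parse_money(item["lineTotal"]) for item in line_items), Decimal("0")
)
```

Every value in the example is a string, including quantities and prices, so numeric work such as totals needs an explicit conversion step like this one.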

Our Use Case is Easy, Right???
1. Scan the document
2. Import the document into Ephesoft
3. Automatically classify document to a type
4. Extract fields automatically based on document type
5. Validate data
6. Before export, convert table data to JSON in ScriptExport hook
7. CMIS Export to map Ephesoft fields to Alfresco properties and export document
8. Convert raw JSON to a table for viewing
9. Controller for reading and writing of data

Not as Bad as It Sounds
- Ephesoft already provides most of what we need
- Just requires custom logic in their “ScriptExport” Java hook to convert their XML table data to our JSON structure
- The JSON string is mapped to a hidden simple field in Ephesoft, which is in turn mapped to an Alfresco property
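The XML-to-JSON conversion can be sketched as below. The real hook is Java code inside Ephesoft; this Python version only illustrates the transformation. The XML layout in `SAMPLE_XML` is a simplified, hypothetical stand-in and should not be read as the actual Ephesoft batch schema. The function name `table_xml_to_json` is likewise an assumption.

```python
import json
import xml.etree.ElementTree as ET

# Simplified, hypothetical table XML; not the real Ephesoft batch schema.
SAMPLE_XML = """
<DataTable>
  <Name>LineItems</Name>
  <Rows>
    <Row>
      <Column><Name>qty</Name><Value>1</Value></Column>
      <Column><Name>item</Name><Value>A123</Value></Column>
    </Row>
    <Row>
      <Column><Name>qty</Name><Value>5</Value></Column>
      <Column><Name>item</Name><Value>S555</Value></Column>
    </Row>
  </Rows>
</DataTable>
"""


def table_xml_to_json(xml_text):
    """Turn each Row element into an object keyed by column name."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for row in root.iter("Row"):
        record = {}
        for column in row.findall("Column"):
            name = column.findtext("Name", default="").strip()
            value = column.findtext("Value", default="").strip()
            if name:  # skip columns that carry no name
                record[name] = value
        rows.append(record)
    return json.dumps(rows)


table_json = table_xml_to_json(SAMPLE_XML)
```

The resulting string is what would be placed in the hidden simple field, so the CMIS export can treat it like any other text value.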

Not as Bad as It Sounds
- In Alfresco, the JSON string is persisted in a d:text property
- Just need:
  - A controller for reading/writing JSON
  - A view for users to read and edit table data
- jqGrid and jQuery make this easy
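The read/write responsibilities of such a controller can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the controller from the talk: `read_table`, `write_table`, `EXPECTED_COLUMNS`, and the optional `max_length` guard are all hypothetical names. The guard reflects the size limit on text fields listed among the cons, and the caller would have to supply the actual limit for their database.

```python
import json

# Column names taken from the invoice example; treated here as the allowed set.
EXPECTED_COLUMNS = (
    "qty", "item", "description", "unitPrice", "discount", "lineTotal",
)


def read_table(property_value):
    """Parse the stored JSON string; an empty property is an empty table."""
    if not property_value:
        return []
    rows = json.loads(property_value)
    if not isinstance(rows, list):
        raise ValueError("table data must be a JSON array of row objects")
    return rows


def write_table(rows, max_length=None):
    """Validate rows and serialize them back to a JSON string."""
    for index, row in enumerate(rows):
        unknown = set(row) - set(EXPECTED_COLUMNS)
        if unknown:
            raise ValueError(f"row {index} has unknown columns: {sorted(unknown)}")
    serialized = json.dumps(rows)
    if max_length is not None and len(serialized) > max_length:
        raise ValueError("serialized table exceeds the configured size limit")
    return serialized


# Round trip: read the stored value, edit one cell, write it back.
stored = '[{"qty": "1", "item": "A123"}]'
rows = read_table(stored)
rows[0]["qty"] = "2"
updated = write_table(rows)
```

Validating on write keeps malformed rows out of the property, which matters because the repository itself only sees an opaque string.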

Table Architecture in Alfresco

The Data In Alfresco
- Standard FTL controls in Alfresco Share are used for the ‘simple’ data fields
- The “d:text” property for the invoice data is backed by a custom FTL that reads an XML schema and calls the custom JavaScript controller

Wrapping Up
- This custom solution allows for exporting tables from Ephesoft to Alfresco
- It also provides a generic way to manage tabular data in Alfresco
- Tables can be viewed and edited by users
- It utilizes standard extensions in both Ephesoft and Alfresco

Conclusions
- Integration hooks in Ephesoft and Alfresco make this customization straightforward
- Extend functionality and use existing libraries instead of starting from scratch
- Ephesoft and Alfresco can be paired to provide a robust solution for managing document scans

Resources
- An in-depth technical discussion of tabular data can be found here: sessions/smooth-butter-representing-tabular-data-alfresco-share
- Contact Us!