1 Aspire Latest Developments
Steve Denny, firstname.lastname@example.org
2 Agenda
- What is Aspire
- History
- Version 0.2
- Version 0.3
3 What is Aspire?
You mean you really don’t know?????
- Document processing framework (and more)
- Search Technologies intellectual property
- Currently in use in:
  - Aspermont (hosted service)
  - BNA (hosted service)
  - BBC
  - CPA (document processing & assignee normalization)
  - OLRC
  - WorldCheck (POC)
4 What is Aspire? (cont…)
- Runs in OSGi
  - Service platform for Java
  - Allows applications to (re)use small components
  - SOA allows dynamic loading/restarts etc.
  - We use Felix
- Container runs components
  - Configured via XML
  - Created as Java objects at run time
  - Can interact and communicate
  - Can be loaded/unloaded/upgraded on the fly
- We have a library of approximately 38 components
  - 50 sub-types in total
  - Growing by the day – well, by the project really…
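The idea of a container that creates components as Java objects at run time, and loads or unloads them on the fly, can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use OSGi or Felix, and `Component`, `UpperCaseStage` and `ComponentContainer` are hypothetical names, not Aspire's actual API. The class name passed to `load` stands in for what an XML configuration might supply.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical component contract; not the real Aspire interface. */
interface Component {
    void start();
    void stop();
    String process(String document);
}

/** Example component (assumption): upper-cases the document text. */
class UpperCaseStage implements Component {
    private boolean running;

    public void start() { running = true; }

    public void stop() { running = false; }

    public String process(String document) {
        if (!running) {
            throw new IllegalStateException("component not started");
        }
        return document.toUpperCase();
    }
}

/** Minimal stand-in for a container: creates components by class name at run time. */
class ComponentContainer {
    private final Map<String, Component> components = new LinkedHashMap<>();

    /** Instantiates the named class reflectively, starts it, and registers it. */
    void load(String name, String className) {
        unload(name); // loading over an existing name acts as an upgrade
        try {
            Component c = (Component) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
            c.start();
            components.put(name, c);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    /** Stops and removes the component, if present. */
    void unload(String name) {
        Component old = components.remove(name);
        if (old != null) {
            old.stop();
        }
    }

    boolean isLoaded(String name) {
        return components.containsKey(name);
    }

    String process(String name, String document) {
        Component c = components.get(name);
        if (c == null) {
            throw new IllegalArgumentException("no such component: " + name);
        }
        return c.process(document);
    }
}

class ContainerDemo {
    public static void main(String[] args) {
        ComponentContainer container = new ComponentContainer();
        container.load("upper", UpperCaseStage.class.getName());
        System.out.println(container.process("upper", "hello aspire"));
        container.unload("upper");
        System.out.println(container.isLoaded("upper"));
    }
}
```

In a real OSGi container, the bundle lifecycle and service registry do this work; the map-based registry here only mimics the load/unload behaviour.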
5 History
- Architected by Paul Nelson
- Aspermont gave us the opportunity to develop
- Development started July 2009
- Developers: Paul “framework” Nelson (or should that be “refactor”?), Manuel “arch-file” Alfaro, Sundip “CCD” Pradhan & Steve “feeder” Denny
- System in place by mid-September
- Customer eventually went live in November
- Initially 19 components
  - Framework, Application & Services
  - Hot folder feeder, RSS feeder, Single page feeder (Simple feeder)
  - CCD, Arch-file reader, URL Fetcher, Post2Solr
  - Other utility & testing components
- Known as Aspire 0.1
7 Aspire 0.2 Snapshot
- 0.2 followed almost immediately from 0.1
- Paul cleaned up some outstanding issues and made things a bit more generic
  - Post2Solr became PostXML
- Presented & used at the FY10 kickoff
- Development & bug fixing continued
  - RDB connection stage and Groovy scripting stage
- Paul kept telling me of his desire for feeders (RDB) and other components
- I kept promising to do work towards them and then not having the time
- Not much else happened until February 2010
8 Aspire 0.2 (BNA)
- BNA
  - Another hosted project
  - Needed document boosting (in Solr) based on document content
- More development effort
  - Sundip, Steve & Luis Pintor
- Text tagger
  - Notes occurrences of words or phrases in the document
- Index booster
  - Groovy stage to boost the relevance of the document based on the output of the tagger
- URL Filter
  - Removes documents based on their URL
- Framework bug
  - Framework caught Exception rather than Error, which resulted in lost jobs
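The framework bug is easy to reproduce in isolation. The sketch below is an assumption about the general shape of the problem, not the Aspire code itself: `Job`, `Stage` and `JobRunner` are hypothetical names. It shows why a handler that catches only `Exception` never marks a job as failed when a stage throws an `Error`, and how catching `Throwable` keeps the job tracked.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical job record; names are illustrative, not Aspire's API. */
class Job {
    final String id;
    String status = "PENDING";

    Job(String id) { this.id = id; }
}

/** A pipeline stage that may throw anything, including an Error. */
interface Stage {
    void process(Job job);
}

class JobRunner {
    final List<Job> failed = new ArrayList<>();

    /** Narrow handler: an Error escapes and the job is never marked as failed. */
    void runCatchingException(Stage stage, Job job) {
        try {
            stage.process(job);
            job.status = "DONE";
        } catch (Exception e) {
            fail(job);
        }
    }

    /** Broad handler: any Throwable marks the job as failed, so it stays tracked. */
    void runCatchingThrowable(Stage stage, Job job) {
        try {
            stage.process(job);
            job.status = "DONE";
        } catch (Throwable t) {
            fail(job);
        }
    }

    private void fail(Job job) {
        job.status = "FAILED";
        failed.add(job);
    }
}

class LostJobDemo {
    public static void main(String[] args) {
        // A stage that fails with an Error rather than an Exception.
        Stage broken = job -> { throw new NoClassDefFoundError("missing dependency"); };
        JobRunner runner = new JobRunner();

        Job a = new Job("a");
        try {
            runner.runCatchingException(broken, a);
        } catch (Error e) {
            // In a worker thread this Error would simply end the thread.
        }
        System.out.println(a.id + " " + a.status); // job "a" is still PENDING: lost

        Job b = new Job("b");
        runner.runCatchingThrowable(broken, b);
        System.out.println(b.id + " " + b.status); // job "b" is FAILED: tracked
    }
}
```

Catching `Throwable` is usually discouraged in application code, but at a framework boundary that must account for every job it is a common choice; some errors, such as `OutOfMemoryError`, may still warrant being rethrown after the job is recorded.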
9 Aspire 0.2 (BBC)
- BBC
  - Large project sub-contracted via IBM
  - Content in complex database schema
- More development effort
  - Steve
- RDBFeeder
  - Based on SimpleFeeder; can you guess what it does?
  - RDBSubJobFeeder & RDBRowExtractor
- JMSFeeder
  - Connects to a JMS message queue via JNDI
  - Feeds messages into the pipeline
- JobErrorHandler
  - Can catch failed jobs (or sub jobs) and resubmit
  - Quarantines after a number of attempts
  - Operator can resubmit
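The resubmit-then-quarantine behaviour described for JobErrorHandler can be sketched as follows. This is a minimal illustration, not the component's real implementation: `RetryingErrorHandler`, its fields and the `maxAttempts` parameter are all hypothetical, and jobs are represented by plain string IDs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical error handler; class and method names are illustrative only. */
class RetryingErrorHandler {
    private final int maxAttempts;
    private final Map<String, Integer> attempts = new HashMap<>();
    final Queue<String> resubmitQueue = new ArrayDeque<>();
    final List<String> quarantine = new ArrayList<>();

    RetryingErrorHandler(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Called when a job fails: resubmit it, or quarantine it once attempts run out. */
    void onFailure(String jobId) {
        int count = attempts.merge(jobId, 1, Integer::sum);
        if (count >= maxAttempts) {
            quarantine.add(jobId);
        } else {
            resubmitQueue.add(jobId);
        }
    }

    /** Operator action: release a quarantined job and reset its attempt count. */
    boolean operatorResubmit(String jobId) {
        if (!quarantine.remove(jobId)) {
            return false; // not in quarantine, nothing to do
        }
        attempts.remove(jobId);
        resubmitQueue.add(jobId);
        return true;
    }
}

class ErrorHandlerDemo {
    public static void main(String[] args) {
        RetryingErrorHandler handler = new RetryingErrorHandler(3);
        handler.onFailure("job-1"); // first failure: resubmitted
        handler.onFailure("job-1"); // second failure: resubmitted
        handler.onFailure("job-1"); // third failure: quarantined
        System.out.println("quarantined: " + handler.quarantine);
        handler.operatorResubmit("job-1");
        System.out.println("quarantined after operator resubmit: " + handler.quarantine);
    }
}
```

Resetting the attempt count on an operator resubmit is a design assumption here; it gives a manually released job the same number of retries as a fresh one.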
13 Aspire 0.2 (BBC)
- Framework
  - Changes to accommodate some of the above
  - Various updates and tidying of status pages
  - Ability to stop, reload and start components
    - And component managers
14 Aspire 0.2
- All these changes made to 0.2-SNAPSHOT
- Decision made to release 0.2
  - Paul and I spent some time working on a repeatable release method
  - Already using Maven, so used its release plug-in
- Aspire 0.2 released August 6th 2010
- Development now continues with 0.3-SNAPSHOT
  - Development targeted to CPA & BBC (release B)
15 Aspire 0.3 snapshot
- CPA driving the initial development
  - Luis Luna picking up the baton, with Sahir Pathan at his heels
- 8 components since beginning of August
  - Lucene library component & query components (3)
  - Hash map (substitution)
  - Zip & Tabular file handling
17 Aspire 0.3 snapshot
- BBC work ongoing
  - Who knows what components this will bring? I hope I do soon…
- More Future Stuff!
  - More Lucene functionality: tokenizers, query, index
  - HTML processing: “structural analysis”, removing sidebars, etc.
  - Lucene Connector Framework
  - Connectors to SharePoint, NFS, Documentum & Document Level Security
- Wiki is now a good resource
  - Please continue to update it…
18 Questions?
Steve Denny, email@example.com