Deutscher Wetterdienst DAR Metadata Catalog Markus Heene, DWD



Page 2: Agenda
• Welcome
• Notes
• Performance Test - Infrastructure
• High level architecture
  – Geonetwork
  – terraCatalog
• Performance Tests
  – Requirements
  – Preconditions
  – Results
  – Remarks
• Resources

Page 3: Notes
• The presented results are from May 2009
• Both software solutions have released newer versions
  – Geonetwork 2.6
  – terraCatalog 3.0
• The findings of the Performance Study were made available to both

Page 4: Performance Test - Infrastructure
• Tomcat 5.5
• Oracle 10g
• Test client
• Application Server:
  – CPU: 4 AMD Opteron 1800 MHz
  – RAM: kB

Page 5: Geonetwork: High level architecture
• Geonetwork (version 2.2 and 2.4)
  – Servlet Container
    · Main development for Jetty (migration to other Servlet containers like Tomcat, OC4J possible)
    · Geonetwork consists of 3 different web applications which could interact
    · Different frameworks used for the development: Jeeves, Struts, Spring, …
    · For the next generation of Geonetwork a system architecture redesign is announced: remove Jeeves Framework (“Bringing data and metadata closer together”, FOSS4G Cape Town by Jeroen Ticheler)
  – Metadata handling
    · Metadata XML file is stored as “large object” in the database (support for different vendors)
    · Search is mainly based on a Lucene index outside of the database; limited to varchar2(250) in basic installation
    · Huge time necessary to build the Lucene index
  – Additional remarks
    · Open source software
    · Stable solution so far (migration to other Servlet container needs time)
    · Version 2.2 implements only some queries of CSW
    · Some Z39.50 support is available, currently only limited experience inside DWD
    · Production installations with up to records are running (what we found)

Page 6: terraCatalog 2.3: High level architecture
• terraCatalog 2.3
  – Servlet Container
    · Main development for Tomcat (migration possible but not tried)
    · terraCatalog consists of different web applications which could interact
    · Consistent usage of frameworks through all web applications
  – Metadata handling
    · Metadata XML file is stored in the database and “mapped” into a relational model (database support for PostgreSQL and Oracle)
    · Search is a function of the database (Oracle Spatial and Text)
    · Mapping into the relational model causes conflicts with XML documents (e.g. title is limited to varchar2(255), same for abstract and keywords) → valid ISO-conform XML documents could not be imported into terraCatalog
    · Oracle Spatial datatype could store only half of the world → special treatment necessary for whole globe → we found Oracle errors in certain situations
  – Additional remarks
    · Commercial software with support
    · Much more complete implementation of CSW compared to Geonetwork 2.2
    · No Z39.50 search functionality → additional investment necessary
    · Production installations with up to records are running
    · We found some bugs: SQL injection, Oracle errors, import of valid XML documents not possible, error in exporting metadata as XML document
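The varchar2(255) limit on title, abstract and keywords suggests screening records before an import is attempted. The following is only a minimal sketch of such a pre-check, assuming ISO 19139 records with the standard gmd/gco namespaces and assuming the limit counts characters (an Oracle column may be configured to count bytes instead). The names find_oversized_fields and MAX_LEN are illustrative and not part of terraCatalog.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespaces.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}
MAX_LEN = 255  # limit reported on the slide; assumed here to count characters

# Elements the slide names as limited: title, abstract, keywords.
CHECKED = ("title", "abstract", "keyword")


def find_oversized_fields(xml_text, max_len=MAX_LEN):
    """Return (element name, length) for each checked field longer than max_len."""
    root = ET.fromstring(xml_text)
    oversized = []
    for name in CHECKED:
        for elem in root.iterfind(f".//gmd:{name}/gco:CharacterString", NS):
            text = (elem.text or "").strip()
            if len(text) > max_len:
                oversized.append((name, len(text)))
    return oversized
```

A record for which the function returns a non-empty list would be a candidate for the import failure described above, even though the document itself is valid.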

Page 7: Performance Tests - Requirements
• Requirements based on WMO and INSPIRE
• WMO (see WIS-TechSpec-8, DAR Catalogue Search and Retrieval, Technical Specification 1.1)
  – Response time < 2 sec
  – 40 combined searches (keyword and bounding box) per second
  – Minimum of 20 active sessions
• INSPIRE
  – Response time < 3 sec
  – Minimum of 30 active sessions
• DWD
  – Minimum of metadata records
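The WMO and INSPIRE thresholds above can be written down as data and compared against a measurement. This is a sketch only: the Requirement class and the rate function are hypothetical names, and the mapping to the symbols + / o / - (borrowed from the legend on the results slides) is an assumption, since the slides do not define what "partially" means. Here "o" is taken to mean that some but not all criteria are met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    name: str
    max_response_s: float            # response time must stay below this value
    min_sessions: int                # minimum number of active sessions
    min_searches_per_s: float = 0.0  # 0 means no throughput requirement


# Thresholds as listed on the slide.
WMO = Requirement("WMO", max_response_s=2.0, min_sessions=20, min_searches_per_s=40.0)
INSPIRE = Requirement("INSPIRE", max_response_s=3.0, min_sessions=30)


def rate(req, response_s, sessions, searches_per_s):
    """Return '+', 'o' or '-' for one measurement against one requirement."""
    checks = [response_s < req.max_response_s, sessions >= req.min_sessions]
    if req.min_searches_per_s > 0:
        checks.append(searches_per_s >= req.min_searches_per_s)
    if all(checks):
        return "+"
    # Assumption: "partially" means at least one criterion is met.
    return "o" if any(checks) else "-"
```

For example, rate(WMO, response_s=2.5, sessions=20, searches_per_s=10) yields "o" under this reading, because only the session criterion is satisfied.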

Page 8: Performance Tests - Preconditions
• Importing Metadata
  – Practical package size was metadata records in an archive
  – Import costs a lot of time (5,000 records ~ 45 to 60 minutes)
  – Importing metadata into terraCatalog generates GBs of redo-logs (200 MB per minute)
• Formulate queries in CSW
  – Challenge was to describe a query that both systems understood (limited CSW implementation from Geonetwork 2.2)
  – Parameterize query for different result sets (e.g. search title for “zyx” → 0 hits, search title for “gts” → hits)
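A parameterized query of the kind described above could look roughly like the following sketch, which builds a CSW GetRecords request with a title filter and an optional bounding box. Several points are assumptions rather than details from the slides: the CSW 2.0.2 and Filter 1.1.0 namespaces (Geonetwork 2.2 may implement an older CSW version), the property names dc:title and ows:BoundingBox, the wildcard match, and the longitude/latitude coordinate order. The function name build_get_records is illustrative.

```python
import xml.etree.ElementTree as ET

CSW = "http://www.opengis.net/cat/csw/2.0.2"
OGC = "http://www.opengis.net/ogc"
GML = "http://www.opengis.net/gml"
for prefix, uri in (("csw", CSW), ("ogc", OGC), ("gml", GML)):
    ET.register_namespace(prefix, uri)


def build_get_records(title_term, bbox=None):
    """Build a GetRecords request as a string.

    bbox is (min_lon, min_lat, max_lon, max_lat); the order is an assumption.
    """
    root = ET.Element(f"{{{CSW}}}GetRecords", {
        "service": "CSW", "version": "2.0.2", "resultType": "results",
        # Declare the prefixes that appear only inside property names.
        "xmlns:dc": "http://purl.org/dc/elements/1.1/",
        "xmlns:ows": "http://www.opengis.net/ows",
    })
    query = ET.SubElement(root, f"{{{CSW}}}Query", {"typeNames": "csw:Record"})
    ET.SubElement(query, f"{{{CSW}}}ElementSetName").text = "brief"
    constraint = ET.SubElement(query, f"{{{CSW}}}Constraint", {"version": "1.1.0"})
    flt = ET.SubElement(constraint, f"{{{OGC}}}Filter")
    # A combined search wraps both conditions in a logical And.
    parent = ET.SubElement(flt, f"{{{OGC}}}And") if bbox else flt

    like = ET.SubElement(parent, f"{{{OGC}}}PropertyIsLike",
                         {"wildCard": "%", "singleChar": "_", "escapeChar": "\\"})
    ET.SubElement(like, f"{{{OGC}}}PropertyName").text = "dc:title"
    ET.SubElement(like, f"{{{OGC}}}Literal").text = f"%{title_term}%"

    if bbox:
        min_lon, min_lat, max_lon, max_lat = bbox
        box = ET.SubElement(parent, f"{{{OGC}}}BBOX")
        ET.SubElement(box, f"{{{OGC}}}PropertyName").text = "ows:BoundingBox"
        env = ET.SubElement(box, f"{{{GML}}}Envelope")
        ET.SubElement(env, f"{{{GML}}}lowerCorner").text = f"{min_lon} {min_lat}"
        ET.SubElement(env, f"{{{GML}}}upperCorner").text = f"{max_lon} {max_lat}"

    return ET.tostring(root, encoding="unicode")
```

Calling build_get_records("zyx") and build_get_records("gts") gives the two title searches mentioned on the slide; passing a bbox as well gives a combined keyword and bounding box search. Whether both catalogues accept exactly this form would have to be checked against each implementation.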

Page 9: Performance Tests - Results
Legend: + (fulfilled), - (failed), o (partially)

Page 10: Performance Tests - Results
Columns: INSPIRE, WMO
Legend: + (fulfilled), - (failed), o (partially)

Page 11: Performance Tests - Remarks
• Geonetwork fails to meet the requirement if the result set contains more than hits (→ response time scales with size of the result set)
• Geonetwork installation with metadata records
  – First access of the GUI takes minutes!
• Geonetwork 2.2 deployment of web app with around 3000 metadata records costs hours
• terraCatalog fails to meet the requirement for combined searches
• terraCatalog could not meet the response time requirement for geographical searches
• terraCatalog errors if the search touches the equator
• Fuzzy search for title, abstract, keywords … is a nice feature
• terraCatalog is up to 60 times faster than Geonetwork in simple queries
• Other solutions like geowaySDI.NODE have, however, been tested only with records
• Currently it looks as if neither system is capable of handling metadata records according to the requirements of INSPIRE and WMO

Page 12: Resources
• WMO Wiki: index.php?page=geonetworkdoc
• Geonetwork:
• BlueNet:
• con terra:

Page 13: Q&A