Vegetation Data Management: VegBank Funding: National Science Foundation (DBI-9906838) January 8, 2002 John Harris - NCEAS.

1 Vegetation Data Management: VegBank Funding: National Science Foundation (DBI-9906838) January 8, 2002 John Harris - NCEAS

2 VegBank
Project supported by:
- National Center for Ecological Analysis & Synthesis
- U.S. National Science Foundation
- USGS-BRD Gap Analysis Program
- ABI / The Nature Conservancy
Project organized and conducted by:
- Robert K. Peet, University of North Carolina
- Marilyn Walker, USDA Forest Service & U. Alaska
- Dennis Grossman, The Nature Conservancy / ABI
- Michael Jennings, USGS-BRD & UCSB
- John Harris, NCEAS

3 VegBank Design Goals
- Support the National Vegetation Classification.
- Provide a comprehensive facility to store the most commonly collected vegetation plot data attributes.
- Provide the user with a large number of user-defined attributes to store not-so-commonly collected data.
- Integrate plots with the dynamic plant taxonomy and vegetation community data.

4 Core elements of the National Plots Database
[Diagram labels: Project Plot Observation Taxon Observation Taxon Interpretation Plot Interpretation]

5 VegBank Plots Database

6 Name / Taxon / Usage
A usage represents a unique combination of a taxon and a name. Usages can be used to track nomenclatural synonyms.
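
The usage idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the VegBank schema: the `Usage` class, its field names, and the placeholder names "Name A" to "Name C" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """One name applied to one taxon (hypothetical field names)."""
    name: str
    taxon_id: int


def synonyms(usages, name):
    """Return the other names attached to any taxon that `name` is used for."""
    taxa = {u.taxon_id for u in usages if u.name == name}
    return sorted({u.name for u in usages if u.taxon_id in taxa and u.name != name})


# Placeholder data: two names share taxon 1, so they are synonyms of each other.
usages = [
    Usage("Name A", 1),
    Usage("Name B", 1),
    Usage("Name C", 2),
]
name_a_synonyms = synonyms(usages, "Name A")
```

Because each usage is a separate record, adding a synonym means adding a row rather than changing the name or the taxon.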

9 ‘www.VegBank.org’ beta release March 2002

12 Collection Integration Archival Extraction Analysis Publication

13 Development Cycle
- Preliminary design
- Build prototype #1 interface
- User evaluates interface
- Evaluation studied by designer
- Design modifications are made
- Build prototype #n interface
Supported by 3 other NCEAS developers
Database Design: Aug. 2000 – Jan. 2001
Interface Design: Nov. 2000 – Feb. 2001
Backend Development: Jan. 2001 -
Interface Development: Mar. 2001 –
Backend Version: Prototype 3
Interface Version: Prototype 1
Expected Beta Release: Late Sept. – Mid Oct.

20 Taxonomy Module
- Smithsonian meeting: Peet-Taswell model vs Berendsohn model
- FGDC Biological Nomenclature Working Group
- Update on ABI & HDMS
- Prospects for implementation
- The difficult choice

21 Taxonomy Database Design Goals
- Logical separation of a "taxonomic name" from the "taxonomic concept", so that taxonomic data can be stored at the most 'atomic' level without ambiguity
- The ability to incorporate multiple organizations' 'views' of how a taxonomic name is applied to a taxonomic concept
- The ability to link a taxonomic name used in the Plots database with a 'name - concept' pair in the taxonomic database.
* Although one can store vegetation community data in the same database table-structure as the plant taxonomy database, we have implemented two separate table structures and have created two separate data sets.

22 Development Choices
Representative tools reflect the desire to have the following features:
- High performance
- Robust
- Open architecture
- Platform neutral
- Scalable

23 Features - JAVA
- Java -- Write once, cross platform: Linux, Windows, MacOS*
- Java Servlet -- Dynamic, database-driven web content
- JDBC -- Connect to any database: Oracle, PostgreSQL, SQL Server backend
- Swing -- Classy interface tools
- Beans -- Reusable components
* Not tested yet :-)

24 Features - XML
- XML is the format for structured data on the Web.
- Simple and flexible data conversions, using XSLT
- Straightforward to write generic tools which export parts of a relational database as XML encoded data, or even to write generic code that serializes Java (or other) objects as XML data structures.
- Examples later…
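
To make the "generic export tool" point concrete, here is a minimal sketch in Python rather than the project's Java. It assumes an in-memory SQLite database as a stand-in for the real backend; the function name `table_to_xml`, the `plot` table, and the sample row are invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Serialize every row of `table` as <table><row><column>value</column>...</row></table>."""
    # Table names cannot be bound as SQL parameters, so accept plain identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        row_element = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_element, column).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Stand-in database with one hypothetical table and row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plot (plotCode TEXT, communityName TEXT)")
conn.execute("INSERT INTO plot VALUES (?, ?)", ("P1", "Example community"))
xml_text = table_to_xml(conn, "plot")
```

The function never names a column itself, which is what makes it generic: it reads the column names from the cursor and reuses them as element names.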

25 An Example Workflow Using Wisconsin Plots Data
- What data integration means to us
- Taxonomic / Semantic Integration
- Data formatting for database ingestion
- General Comments about Current Format
- Data Parsing
- Transformation to XML standard
- Legacy Data Loader

26 Integration: What is meant by data integration?
[Diagram labels: Plots Data Integration & DB Ingestion; Reformat by Hand; Research; MS Access; MS Excel; Perl; Shell scripts; ?; …; Plots DB]

27 Integration: Taxonomic Integration
[Diagram labels: Carya ovata (Miller) K. Koch; Carya carolinae-sept. (Ashe) Engler & Graebner; Carya ovata (Miller) K. Koch; sec. Gleason 1952; sec. Radford et al. 1968]
Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies.
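
The ambiguity can be shown with a small Python sketch that pairs each name with the reference it follows ("sec."). The `Concept` class and the lookup helper are hypothetical. Which reference goes with which label is an assumed reading of the flattened diagram: the Gleason 1952 concept is taken as the broad one that Radford et al. 1968 split in two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A name as circumscribed by one reference ("sec." = according to)."""
    name: str
    reference: str


# ASSUMPTION: pairing of labels and references is inferred from the slide text.
GLEASON = Concept("Carya ovata (Miller) K. Koch", "Gleason 1952")
RADFORD = Concept("Carya ovata (Miller) K. Koch", "Radford et al. 1968")
CAROLINAE = Concept("Carya carolinae-sept. (Ashe) Engler & Graebner", "Radford et al. 1968")
ALL_CONCEPTS = [GLEASON, RADFORD, CAROLINAE]


def concepts_for(name, concepts):
    """Return every concept that carries the given name string."""
    return [concept for concept in concepts if concept.name == name]


# The bare name matches two different concepts, so the name alone is ambiguous.
matches = concepts_for("Carya ovata (Miller) K. Koch", ALL_CONCEPTS)
```

Storing the name and reference together removes the ambiguity, because two records with the same name no longer compare equal.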

28 Integration: Semantic Integration of Plot Attributes ('Basic yet Important')
- Cover Scales
- Strata Dimensions
- Environmental Attributes
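
As one example of semantic integration, cover-class codes recorded on different scales have to be translated to a common unit before plots can be compared. The sketch below assumes a hypothetical five-class scale with illustrative percent ranges. Neither the scale nor the function name comes from the presentation, and a real data set's ranges must be taken from its own documentation.

```python
# ASSUMPTION: illustrative cover-class ranges in percent, not a VegBank standard.
EXAMPLE_SCALE = {
    "1": (0.0, 5.0),
    "2": (5.0, 25.0),
    "3": (25.0, 50.0),
    "4": (50.0, 75.0),
    "5": (75.0, 100.0),
}


def cover_midpoint(code, scale):
    """Translate a cover-class code to the midpoint of its percent range."""
    try:
        low, high = scale[code.strip()]
    except KeyError:
        raise ValueError(f"unknown cover class: {code!r}") from None
    return (low + high) / 2


midpoint = cover_midpoint("3", EXAMPLE_SCALE)
```

Keeping each scale as data rather than code lets the loader accept a new scale by supplying a new lookup table.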

29 Integration: Parse data from forms into a table structure to be transformed into XML consistent with the database structure
[Diagram labels: Text Forms; Columnar Tables; XML]

30 Integration: Parsed Data
[Diagram labels: Text Forms; Columnar Tables]

31 Integration: Transform parsed data to XML consistent with the Plots Database
[Diagram labels: Columnar Tables; Plots DB; XML; Legacy Data Loader; Data Definition (XML)]

32 Integration: Data Definition (XML) – Single file
[XML listing; tags lost in extraction. Surviving values: siteData.csv; ','; site data; plotCode; authorPlotCode; 1; communityName; 2; …]
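
The single-file case can be sketched as follows. The original data definition was an XML file whose tags did not survive, so this Python dictionary is a hypothetical stand-in that reuses only the surviving values (a ',' delimiter, `authorPlotCode` in column 1, `communityName` in column 2). The element names, function name, and sample rows are assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ASSUMPTION: a dictionary stands in for the XML data definition; positions are 1-based.
DEFINITION = {
    "delimiter": ",",
    "columns": {"authorPlotCode": 1, "communityName": 2},
}


def rows_to_xml(text, definition):
    """Parse delimited text and emit one <plot> element per non-empty row."""
    root = ET.Element("plots")
    reader = csv.reader(io.StringIO(text), delimiter=definition["delimiter"])
    for row in reader:
        if not row:
            continue  # skip blank lines
        plot = ET.SubElement(root, "plot")
        for attribute, position in definition["columns"].items():
            ET.SubElement(plot, attribute).text = row[position - 1].strip()
    return root


# Hypothetical contents of a site data file.
sample = "P1,Example community A\nP2,Example community B\n"
plots = rows_to_xml(sample, DEFINITION)
```

Because the column positions live in the definition, a file with a different layout needs a new definition but no change to the loader.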

33 Integration: Data Definition (XML) – Multiple files
[XML listing; tags lost in extraction. Surviving values: vegData.csv; siteData.csv; site data; authorPlotCode; '+'; ','; species; plotName; authorPlotCode; 1; scientificName; taxonName; 2]
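
When site and species data arrive in separate files, the loader has to match rows on a shared plot code. The sketch below is one way that might look. It assumes both files are comma-delimited with two columns each and that the plot code is the first column; the function name, element names, and sample rows are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def merge_to_xml(site_text, species_text):
    """Nest each species row under the plot that shares its plot code."""
    root = ET.Element("plots")
    plots = {}
    for code, community in csv.reader(io.StringIO(site_text)):
        plot = ET.SubElement(root, "plot", authorPlotCode=code)
        ET.SubElement(plot, "communityName").text = community
        plots[code] = plot
    unmatched = []
    for code, taxon in csv.reader(io.StringIO(species_text)):
        if code in plots:
            ET.SubElement(plots[code], "taxonName").text = taxon
        else:
            unmatched.append((code, taxon))  # report rather than silently drop
    return root, unmatched


# Hypothetical file contents; plot P3 has species data but no site record.
site_rows = "P1,Community A\nP2,Community B\n"
species_rows = "P1,Taxon X\nP1,Taxon Y\nP3,Taxon Z\n"
merged, unmatched = merge_to_xml(site_rows, species_rows)
```

Returning the unmatched rows makes mismatches between the two files visible before anything reaches the database.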

34 Integration Plots Database XML

35 Existing Prototype Functionality

36 Vegetation Database Client

37 Agenda
- Over-Arching Concepts: Project Overview · Impact · Database Design · System Architecture · Challenges
- Use-Case Example: Wisconsin Data
- Data Management Recommendations
- Future Directions

43 Vegetation Desktop Database Client

48 Extra slides to follow:

51 General Data Management Practices
- General formats
- Weird formats
- Unusable formats
- Modeled the software after the way that people collect plots data -- at least that is what I thought
- At times a tortuous path to the database in terms of reformatting class indices (these are rectified at the plots loading software step)

52 Management Case: Example from Wisconsin
- Baraboo Hills -- Collected Yesterday
- PEL -- Legacy Data

53 Data Transformation of Forms

