D. G. Fautin and R. W. Buddemeier University of Kansas: Department of Ecology and Evolutionary Biology, Natural History Museum, Kansas Geological Survey.


1 D. G. Fautin and R. W. Buddemeier University of Kansas: Department of Ecology and Evolutionary Biology, Natural History Museum, Kansas Geological Survey Biennial gathering – Urbana, IL 23 September 2004

2 ADDING A GEOSPATIAL COMPONENT TO A TAXON-CENTERED DATABASE OR HOW ONE PEET DATABASE GREW LIKE TOPSY

3 from the literature → few museum data to mine → data flows to museums ~ 1000 valid species ~ 1500 species names > 1000 type lots

4 Linking synonyms (through an application developed by Adorian Ardelean) allows all information for a species to be linked, regardless of the name used – but does not display data for homonymous species
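A minimal sketch of the synonym-linking idea, not the application itself: every name is mapped to the valid name it is assigned to, and a query under any name returns the records filed under all of its synonyms. The taxon names, record contents, and the `records_for` helper are invented placeholders for illustration.

```python
# Hypothetical lookup: each published name -> the valid name it is assigned to.
# All names below are invented placeholders, not real taxa.
VALID_NAME = {
    "Aus albus": "Aus albus",
    "Bus albus": "Aus albus",  # treated here as a synonym of "Aus albus"
    "Cus niger": "Cus niger",
}

# Hypothetical records, each stored under the name used in its source.
RECORDS = [
    ("Aus albus", "image 1"),
    ("Bus albus", "locality A"),
    ("Cus niger", "locality B"),
]


def records_for(name):
    """Return every record whose name resolves to the same valid name."""
    valid = VALID_NAME[name]
    return [(used, item) for used, item in RECORDS if VALID_NAME[used] == valid]


print(records_for("Bus albus"))
```

Keeping the name used in each record, rather than overwriting it with the valid name, is what would let a map color occurrences by synonym.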

5 original photomicrographs of type material illustrations from original descriptions original photos of type specimens IMAGES

6 Occurrence records displayed on a map use symbols of a different color for each synonymous name. This function can be used for investigating whether a synonymy is justified.

7

8 Individual, single-purpose databases are great BUT CREATING METADATABASES – BY LINKING THEM TO OTHERS -- INCREASES UTILITY, ACCESSIBILITY, and EFFECT

9 OCE 00-03970 allowed Expansion to all hexacorals (including reef-forming corals, black corals, tube anemones) Interactivity with environmental data

10 Biogeoinformatics of Hexacorals (http://www.kgs.ku.edu/Hexacoral/) An on-line information resource system that consists of two interactive databases: one dealing with taxonomy and biogeography of hexacorals (sea anemones and their allies), and one dealing with environmental information for the marine environment. They are served by a front end that links them and offers user support for searching, analyzing, and downloading the data.

11 Biennial gathering – Urbana, IL 23 September 2004

12 An example of technology transfer and integration – and an illustration of issues and needs: The environmental database, tools, and experience developed in and for the LOICZ Biogeochemistry program have been combined with the taxonomic biogeographic database structure developed with NSF-PEET support to create the NSF-funded OBIS project, “Biogeoinformatics of the Hexacorallia”. Similarity of needs, issues, and users permits rapid progress without wheel reinvention.

13 Data users and stakeholders Disciplinary scientists and environmental managers Are generally acutely interested in the nature and quality of the primary data and the availability of supporting (e.g., environmental/climatic) information AND There is a widespread and critical need for convenient, consistent access to environmental INFORMATION to support interpretation of primary biological or (e.g.) chemical DATA. Commonly have relatively little interest or skill in searching for, processing, and critically interpreting datasets outside of their primary fields of training/interest.

14 The approach evolved by LOICZ --- Standardized tools and information to explain goals and benefits and make diverse existing datasets intercomparable Internet-based access to the background information, tutorials, tools, and results A series of workshops (funded) to enlist and train users and contributors, acquire data, and develop products, and test/refine tools and approaches An environmental database that provides easy visualization, manipulation, acquisition and analysis of multiple relevant variables presented in a consistent format on a global scale The scale (30’) and detailed contents represent compromises in the interests of global coverage, ease of use, and provision of integrated capabilities.

15 Envirodatabase ---- Oracle/Coldfusion A single data table with auxiliary label and management tables. World gridded into 259,200 half-degree cells – Inland, Terrestrial CZ, Coastal (shoreline), Ocean-I CZ, Ocean-II, Ocean-III. 219 variables, of which 92 are “selected” – oceanic, atmospheric, geomorphic, terrestrial, ‘human dimension,’ special applications. Relevant features: Selectable geographic regions, Can accommodate occurrence data directly at the 30’ scale, Responds to occurrence locations, Internet links to and from external applications (OBIS, others), Extensive and growing inventory of data characterization/manipulation tools
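The half-degree gridding can be sketched as follows. The cell count matches the slide (720 columns × 360 rows = 259,200), but the indexing convention (row 0 at the South Pole, column 0 at 180° W, edge values clamped into the last cell) and the `cell_index` function are assumptions for illustration, not the database's documented scheme.

```python
CELL_DEG = 0.5               # 30-minute cells
N_COLS = int(360 / CELL_DEG)  # 720 columns of longitude
N_ROWS = int(180 / CELL_DEG)  # 360 rows of latitude


def cell_index(lat, lon):
    """Map a latitude/longitude in degrees to an assumed (row, col) cell index."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    # Offsets are non-negative, so int() truncation acts as floor.
    row = min(int((lat + 90) / CELL_DEG), N_ROWS - 1)
    col = min(int((lon + 180) / CELL_DEG), N_COLS - 1)
    return row, col


print(N_ROWS * N_COLS)        # total number of cells
print(cell_index(0.25, 0.49))  # a point just north-east of 0°, 0°
```

Any occurrence record with coordinates can be assigned to a cell this way, which is what allows occurrence data to be accommodated directly at the 30’ scale.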

16 Envirodata access: Select region by lat-long values or predefined zone; select cell type Select variables by class, cell type, with access to variable and source metadata – cell-level measures of spatial and temporal variability are included

17 Selected data can be reviewed, filtered, edited, transformed, statistically analyzed, downloaded, or sent to the clustering site. Review and adaptation ---

18 Example For a selected variable (here, SeaWiFS ocean color -- chlorophyll-a band) and geographic region, cell-based summary statistics and a user-controlled histogram display provide non-spatial data visualization, and permit testing the effects of data transforms. Also available – multivariable correlation matrix, scatterplots.
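As a rough illustration of testing the effect of a data transform on a histogram, the sketch below bins the same values before and after a log10 transform. The input values are invented, and the `histogram` helper with equal-width bins is an assumption; it does not reproduce the site's own display.

```python
import math


def histogram(values, n_bins):
    """Count values in n_bins equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    if width == 0:  # all values identical: put everything in the first bin
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value would fall just past the last bin, so clamp it.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts


raw = [1.0, 10.0, 100.0, 1000.0]  # invented, strongly skewed values
logged = [math.log10(v) for v in raw]

print(histogram(raw, 3))     # skewed: most values pile into the first bin
print(histogram(logged, 3))  # more even after the transform
```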

19 Lessons learned from LOICZ and early Hexacoral – needs and priorities 1. The user interface is of paramount importance; the finest data in the world are useless if people cannot readily access them, understand them, and adapt them to their individual needs. User community -- identify and develop User testing and feedback User support with ultimate participation 2. The geographic and environmental context is what people want to know about and use; why else do we georeference? Resolution, precision, and accuracy (reliability) are among the most important information to convey about records – these must be quantified or classified, not edited out of existence. Users will inevitably want to manipulate and apply data in unforeseen ways – complete access combined with convenient tools for visualization and manipulation are the keys to a successful information facility.

20 Search includes all synonyms; Environmental analyses consider only georeferenced entries A data summary over all cells containing a taxon record and the specified data (null values are dropped) is returned Linking biological and environmental data within “Hexacorals” and with external clients
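The data summary described here can be sketched as below: values for one environmental variable are gathered from every cell holding a taxon record, nulls are dropped, and simple statistics are returned. The `summarize` function, the choice of statistics, and the sample values are illustrative assumptions.

```python
import statistics


def summarize(cell_values):
    """Summarize one variable over cells with a taxon record, dropping nulls."""
    present = [v for v in cell_values if v is not None]
    if not present:
        return None  # no cell had a value for this variable
    return {
        "n": len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        # The sample standard deviation needs at least two values.
        "sd": statistics.stdev(present) if len(present) > 1 else 0.0,
    }


# Invented values for three cells; the second cell has no data.
print(summarize([1.0, None, 3.0]))
```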

21 Biogeography: issues of scale and data The example of Macrodactyla doreensis (a tropical sea anemone with fish and algal symbionts) Immediate questions – when and where were they observed? What were conditions (environment, ecosystem) then/there? How well do we know? Extended questions (temporal scale) – what “conditions” are important? (the organisms have decade-century lifespans, so variability and extreme events are as important as averages and point measurements) -- how have conditions changed? (occurrence reports span ~170 years!)

22 Macrodactyla doreensis – sparse distributional data Point occurrence data convey very different impressions and types of information than do generalized range maps – how to draw the polygon?

23 Real Biogeography : Range, distribution controls Geographic circumscription provides a visual clue – but not much more What are the features common to the observed sites? (requires common, consistent database and/or working with incomplete data) What are the biological associations? Connectivity? (requires multiparameter models, visualization – the OBIS interoperability domain) Where else might they be found? (do we want to maximize search success or minimize exclusion – field projects or invasion concerns?) Data verification – (e.g., bathymetry tests locational precision, since photosymbionts limit depth to <~30 m)

24 Geospatial clustering tools -- DISCO and WLV, developed by Prof. Bruce Maxwell and students, are served from Swarthmore College. Closely linked with the Hexacoral environmental database Support a wide range of statistical analyses and visualizations Scale-independent Successfully applied to questions in biogeography, biogeochemistry at continental to global scales, water resource management on a square-mile grid, and social sciences Tools for Biogeographic analysis Dynamic mapping tools – KGSMapper, developed at KGS, is served through OBIS, Hexacorallia, FishBase, and CephBase. Uses online GIS to provide immediate links between occurrence data and environmental database Statistical analyses permit range predictions and biogeographic analyses An advanced prototype is being used to develop additional research tools

25 Geospatial Clustering: Cha et al. (Hydrobiologia, in press) used environmental clustering (below right and at bottom) and shallow-water anemone distribution records to test and modify a proposed biogeographic zonation for Korea (below left). Cluster analysis suggested different affinity groupings in the southern coast and island regions.

26 Simple statistics are used to identify areas of “core” environmental values and outliers

27 Zoom and pan functions permit viewing regions at any level of magnification (see example of Madagascar, below), and also function as data selection tools – only the points in the map view are included in the next analysis update. The prototype mapper permits display and comparison of two datasets (anemones in green, anemonefish in yellow), and also identification of conventionally assigned (red circle) or erroneous (yellow circle) locations.
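Using the map view as a data selection tool amounts to a bounding-box filter, sketched below. The `in_view` function, the view bounds, and the sample points are assumptions for illustration; a view that crosses the 180° meridian is not handled.

```python
def in_view(points, south, north, west, east):
    """Keep only the (lat, lon) points that fall inside the current map view."""
    return [
        (lat, lon)
        for lat, lon in points
        if south <= lat <= north and west <= lon <= east
    ]


# Invented points: one near Madagascar, one near Hawai'i.
points = [(-18.9, 47.5), (21.3, -157.8)]

# An assumed view roughly framing Madagascar.
selected = in_view(points, south=-26, north=-11, west=43, east=51)
print(selected)  # only these points would feed the next analysis update
```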

28 A point-query feature enables the user to identify both specimens and the associated environmental variable values. Individual specimens can be edited out of the dataset, and taxon information can be accessed The environmental data can help identify points outside the known habitat limits of the organisms – for example, the anemones and anemone fish are shallow-water organisms, so any location with a minimum depth > 100 m is highly suspect.
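The depth check mentioned above can be sketched as a simple flagging rule. The 100 m threshold comes from the slide; the record layout, field names, and values are invented for illustration.

```python
def flag_suspect(records, max_depth_m=100.0):
    """Return ids of records whose cell's minimum depth exceeds the limit."""
    return [r["id"] for r in records if r["cell_min_depth_m"] > max_depth_m]


# Invented records: each carries the minimum depth of the cell it falls in.
records = [
    {"id": "rec-1", "cell_min_depth_m": 5.0},
    {"id": "rec-2", "cell_min_depth_m": 2400.0},  # deep ocean: suspect
    {"id": "rec-3", "cell_min_depth_m": 100.0},   # at the limit: not flagged
]

print(flag_suspect(records))
```

Flagging rather than deleting keeps the questionable records available for review, in line with quantifying or classifying reliability instead of editing it out.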

29 Above: Anemone and Anemonefish combined dataset analyzed for ranges without editing. Below: Analysis with data filtered to include only depths <100 m. The 1 std. dev. range interval remains a good predictor even without editing. Data filters on the variable selection tool permit the user to define the scope of the analysis in environmental space as well as in geographic and taxonomic space. The tools permit effective use of datasets with points of mixed quality, and also provide a basis for evaluating or cleaning derivative datasets for specific uses.
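The comparison on this slide can be approximated as follows: compute the mean ± 1 standard deviation interval of a variable for the whole dataset, then again after filtering to depths under 100 m. The helper names, the field names, and all values are invented; with real data the two intervals would need to be compared case by case.

```python
import statistics


def one_sd_range(values):
    """Return the (mean - sd, mean + sd) interval of a list of values."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return (m - s, m + s)


def shallow_values(records, variable, max_depth_m=100.0):
    """Values of one variable for records shallower than max_depth_m."""
    return [r[variable] for r in records if r["depth_m"] < max_depth_m]


# Invented records; the last one sits in deep water and is likely mislocated.
records = [
    {"depth_m": 10.0, "sst_c": 27.0},
    {"depth_m": 20.0, "sst_c": 28.0},
    {"depth_m": 30.0, "sst_c": 29.0},
    {"depth_m": 2500.0, "sst_c": 20.0},
]

unfiltered = one_sd_range([r["sst_c"] for r in records])
filtered = one_sd_range(shallow_values(records, "sst_c"))
print(unfiltered, filtered)
```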

30 Informatics – Lessons (often obvious in the abstract) Learned (usually over and over) Appropriate questions and achievable answers change with scale -- often dramatically Common, accessible resources and tools may meet nobody’s highest standards, but are essential to shared progress in a larger community If you can’t see it, it’s hard to talk about it Uncertainty (accuracy/precision) is an essential consideration, not a dirty little secret The best is the enemy of the good: deferring answers to the next generation is easy and unproductive – useful results with available tools and information is a better challenge

31 University of Kansas: Department of Ecology and Evolutionary Biology, Natural History Museum, Kansas Geological Survey Biennial gathering – Urbana, IL 23 September 2004

32 a distributed information system of systematic, ecological, and environmental data the information component of the the source of marine data for An On-line Atlas of Marine Diversity

33 *** www.iobis.org *** provides online access to: Species distribution records of high taxonomic quality Tools for effective research, management, and education data requests and searches network tools and models research and education center The Portal requires taxonomically and geospatially resolved records (currently 5 million)

34 data served/managed by “owner” issues of credit error uses GBIF standards

35 OBIS Search for Actinia equina

36

37 Click on map over Hawai’i -- 10° pixel

38 Environmental Assessment and Siting of Protected Areas Where are endangered species? What areas are important breeding grounds? Which areas are more diverse?

39 1872-1876 University of Kansas Digital Library Initiative to DGF, R. W. Buddemeier, S. Goodwin Thiel, with collaboration of J. Wood

40 Sea anemones were collected from about 31 of 504 stations This will become a tool for OBIS, and is being extended to other expeditions – so data from multiple databases will interact AND entering station data will not be needed RESEARCH OBJECTIVE: REASSEMBLE NET CONTENTS

41 Stations are searchable by number, date, location, and in two map forms – scanned and hot-linked images of original charts, and ArcIMS to provide data on environmental variables from point samples and other sources Prototype example

42 Data recorded for each station are linked to user-selectable data (>200 variables) on recent environmental conditions gridded in register at 30’ = 24.9 °C

43 To test for quality and consistency of both and provide temporal and spatial environmental connections RESEARCH OBJECTIVE: COMPARE EMPIRICAL WITH MODELED DATA -- Reynolds 2° (1854-2002) and Hadley Centre 1° (1871-2002) reconstructed monthly SST averages include the Challenger years
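One simple way to compare station measurements with reconstructed values is to compute the mean difference (bias) and the root-mean-square error over paired values. This is only a sketch of that general idea; the `compare` function and the temperatures are invented and do not represent the project's method or data.

```python
import math


def compare(observed, modeled):
    """Return (bias, rmse) of modeled minus observed for paired values."""
    diffs = [m - o for o, m in zip(observed, modeled)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


# Invented sea-surface temperatures in °C for two stations.
observed = [20.0, 25.0]
modeled = [21.0, 24.0]

print(compare(observed, modeled))
```

A bias near zero with a large RMSE would suggest scatter rather than a systematic offset between the two sources.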

44 BILATERAL INTERACTIVITY

45 AKA NEMO

46 Interoperation allows users to obtain and interrelate more data; analyze those data using more tools; formulate and address broad-scale questions. It avoids duplication of effort in database entry; provides a double-check on data accuracy (aids in detecting errors, inconsistencies) and thereby improves data quality; increases accessibility and reaches a broader community, bridging bio- and geo-informatics

47 Data in individual databases can be repurposed TAXONOMIC AUTHORITY FILES Integrated Taxonomic Information System (ITIS) Species2000 = CATALOGUE OF LIFE DISCREPANT NAMES HUGE ISSUE

48 DISCARD DATA, USE STRUCTURE

49 Photograph by George Miller Biogeoinformatics of Hexacorals www.kgs.ku.edu/Hexacoral

50 Predicting Climate Change (Biogeoinformatics of Hexacorals) Triage -- rationing resources in crisis response. Focus on the least threatened and/or damaged OBIS-LOICZ

51

52 Note to Daphne slide – Following are set up so I could use them to give a talk that I think fits in with your intention – but edit or revise as you see fit (content, order, format, animation…..). If we wind up with too many I could trim 2-3 out fairly easily (mostly by condensing the text slides and making key points more concise). Your two draft slides immediately following my insertions are redundant with things I put in, but I left them because we may like those images or that organization better. Some fine tuning of text in the text slides (purposes, approaches, lessons learned, etc.) is probably appropriate at the end of the general construction. Do we want transition slides to signal the changes of speakers and review topic/outline progress (i.e., here and at end of the insert section)? On the review and adaptation slide, I could find or generate an image with one of the dropdown filter menus shown.

53 For taxa with georeferenced records, a query of the companion global 30’ environmental database produces summaries of general environmental conditions for individual entries (below) and a summary (above) for the taxon plus statistics

54 A point query function provides access to both specimen information and data on associated environmental variables (at right).

