Long-Term Ecological Research working_groups/controlled_vocabulary Working Group: “Synthesis through data.

1 Long-Term Ecological Research working_groups/controlled_vocabulary Working Group: “Synthesis through data discovery and use: Past, Present, and Future” Wed. 10-12pm

2  Background and Past Activities  Finalizing the list – who approves?  Procedures for managing the list  Next steps  Tool development  Keywording  Searching  Hierarchies/polytaxonomies/thesauri/ontologies

3  For past activities, see the report at: and s/Newsletters/DataBits/06spring/  Summary:  Eclectic keywords make searching difficult – most terms are used only once!  No easy way to group or organize similar datasets to facilitate “browse” searches

4  Assembled list of LTER EML Keywords  Cross-linked that list to:  NBII Thesaurus Words  GCMD Keywords  Metacat Searchers  Edited  Changed words to preferred forms (kept track of synonyms)  Removed specific places and taxonomic names
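The "preferred forms" editing step on this slide can be sketched as follows. This is a minimal illustration, not the working group's actual procedure: the function name `normalize_keywords`, the synonym table, and the sample keywords are all hypothetical.

```python
def normalize_keywords(keywords, synonym_to_preferred):
    """Map raw keywords to preferred forms, dropping duplicates.

    `synonym_to_preferred` is a hypothetical lookup table:
    lower-cased synonym -> preferred term.
    """
    normalized = []
    for keyword in keywords:
        # Collapse case and stray whitespace before the lookup.
        key = " ".join(keyword.lower().split())
        preferred = synonym_to_preferred.get(key, key)
        if preferred not in normalized:
            normalized.append(preferred)
    return normalized


# Illustrative data only; not taken from the LTER list.
synonym_table = {"rainfall": "precipitation"}
raw = ["Rainfall", "precipitation", "  Soil   Moisture "]
print(normalize_keywords(raw, synonym_table))
```

Keeping the synonym table separate from the preferred list means the original wording is never lost, which matches the slide's note that synonyms were tracked rather than discarded.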

5  Selected  Keywords shared with GCMD and NBII, or  Keywords used at more than one LTER site  Reviewed  Removals and additions were suggested  Voting via SurveyMonkey  Edited  Added words voted for  Removed words voted against  When the vote was close – kept the current status
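The selection rule above can be expressed as a short filter. This is a sketch under stated assumptions: "shared with GCMD and NBII" is read here as "appears in either external vocabulary", which is an interpretation rather than something the slide confirms. The function name, site codes, and keywords are illustrative.

```python
def select_keywords(site_usage, gcmd_terms, nbii_terms, min_sites=2):
    """Keep keywords found in an external vocabulary or used at several sites.

    `site_usage` maps keyword -> set of site codes that use it.
    Assumption: membership in EITHER GCMD or NBII is enough to qualify.
    """
    external = gcmd_terms | nbii_terms
    return sorted(
        keyword
        for keyword, sites in site_usage.items()
        if keyword in external or len(sites) >= min_sites
    )


# Illustrative data only.
site_usage = {
    "biomass": {"AND", "VCR"},
    "primary production": {"VCR"},
    "my special plot": {"AND"},
}
gcmd = {"primary production"}
nbii = {"biomass"}
print(select_keywords(site_usage, gcmd, nbii))
```

The voting step is not modeled here; it would adjust this candidate list afterwards.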

6  640 keywords  148 synonyms  201 NBII keywords  21 GCMD keywords

7  Is additional editing required?  Who decides if it is an LTER “official” list?  And what does it mean if it is?  What procedures should be followed for subsequent editing of the list?  Who should manage the list database?  Term  Scope  Definition  Synonyms
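The four fields listed at the end of this slide (Term, Scope, Definition, Synonyms) suggest a simple record per vocabulary entry. The following is a hypothetical sketch of such a record; the class name, the `matches` helper, and the placeholder values are assumptions, not the working group's schema.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyTerm:
    """One entry in a controlled-vocabulary list (hypothetical layout)."""
    term: str
    scope: str = ""
    definition: str = ""
    synonyms: list = field(default_factory=list)

    def matches(self, keyword):
        """True if `keyword` equals the term or any synonym, ignoring case."""
        key = keyword.strip().lower()
        if key == self.term.lower():
            return True
        return any(key == s.lower() for s in self.synonyms)


# Placeholder values for illustration only.
entry = VocabularyTerm(
    term="precipitation",
    scope="(scope note goes here)",
    definition="(definition goes here)",
    synonyms=["rainfall"],
)
print(entry.matches("Rainfall"))
```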

8  Autocomplete search tool  - Duane Costa  Autocomplete keywording tool - Duane Costa  Update-document-keywords tool?  Advanced search tool?
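To make the autocomplete idea concrete, here is a minimal prefix-lookup sketch over a sorted keyword list. It is not the tool named on the slide; the function name, the `limit` parameter, and the sample terms are assumptions for illustration.

```python
import bisect


def autocomplete(sorted_terms, prefix, limit=10):
    """Return up to `limit` terms starting with `prefix`.

    `sorted_terms` must be a sorted list of lower-case terms.
    """
    prefix = prefix.lower()
    # Binary search for the first term that is >= the prefix.
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(term)
    return matches


# Illustrative terms only.
terms = sorted(["biomass", "biodiversity", "biogeochemistry", "birds", "carbon"])
print(autocomplete(terms, "bio"))
```

Because the list is sorted, all terms sharing a prefix sit next to each other, so the lookup is one binary search plus a short scan.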

9  There is general agreement that keywords are most useful when they can be tied to other keywords  How do we create the needed keyword taxonomy(s)?  Barbara Benson has done some work looking at other hierarchies (KNB, GCMD)  Giri Palanisamy has sent us the broader, narrower and related terms for the ~1/3 of the words that are also in the NBII thesaurus

10  The existing KNB browse hierarchy is rather limited (the LTER version that gives the number of hits is a good feature)  A browse hierarchy could help sites develop one of their own  It could be hooked into any tools that are developed to assist in assigning keywords to datasets  It could be used in a tool that enables the creation of a browse hierarchy from a keyword list  It could assist keyword searches by offering an option to go up a level from the keyword to a broader concept and thus yield a higher number of hits
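The "go up a level" search option in the last point could work roughly as sketched below. This assumes a single-parent hierarchy and assumes that searching a broader concept also matches datasets tagged with its narrower terms; neither is stated on the slide. All names and data are hypothetical.

```python
def narrower_closure(term, broader):
    """Return `term` plus every term that sits below it in the hierarchy.

    `broader` maps each term to its single broader term (an assumption).
    """
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in broader.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def search(term, datasets, broader):
    """Datasets tagged with `term` or any of its narrower terms."""
    wanted = narrower_closure(term, broader)
    return sorted(ds for ds, keywords in datasets.items() if keywords & wanted)


def search_one_level_up(term, datasets, broader):
    """Repeat the search on the broader concept, if one exists."""
    parent = broader.get(term)
    return search(parent if parent else term, datasets, broader)


# Illustrative hierarchy and dataset tags only.
broader = {"biomass": "primary production",
           "primary production": "ecosystem processes"}
datasets = {"ds1": {"biomass"}, "ds2": {"primary production"},
            "ds3": {"ecosystem processes"}, "ds4": {"birds"}}
print(search("biomass", datasets, broader))
print(search_one_level_up("biomass", datasets, broader))
```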

11  Taxonomic and place keywords were excluded from the science keywords  Do we need a gazetteer for places?  Do we need taxonomic lists & tools for taxonomic information?  Are there other types of lists that are needed?

12  Feedback on tools  Ideas for additional tools  Hierarchy

13  LTER words emerging organically  Not just general search  Other efforts  Vegetation ecology community interested in ontologies for vegetation traits  LTER words are not specialized  Would be good to keep in touch with other efforts  SONET – intercommunication (Gries) critical  Rob Raskin taking GCMD and ontologizing it  NASA is developing “SWEET” – an upper-level ontology  Semtools (O’Brien) – using Morpho and making it a better database management system – using subsumption hierarchies in OWL  OWL allows use of generic applications (Jena) – standard format

14  Autocompletion tools are helpful for NEW EML  But tools are needed for updating existing metadata  Having a first cut of recommendations would help  A tool that makes suggestions based on document content would be helpful  Semantic annotation  Hook to parents, children and related terms  Educating PIs on using the list is important  Just the availability of the list is important

15  Automatic annotation with broader terms  Identify “unfindable” datasets – what datasets have no LTER Keywords or synonyms?  Go dataset by dataset and see which have no hits  EML is limited in how it assigns keyword lists  Could target tools at keyword set  Namespacing control could be relaxed to go beyond “theme” and “place”
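The "unfindable datasets" check described above (going dataset by dataset to see which have no hits) can be sketched as follows. The function name, dataset identifiers, and vocabulary are hypothetical; a real check would read keywords from the EML documents themselves.

```python
def find_unfindable(dataset_keywords, preferred_terms, synonyms):
    """Return IDs of datasets with no keyword in the list or its synonyms.

    `dataset_keywords` maps dataset ID -> list of keywords.
    `synonyms` maps synonym -> preferred term; only its keys are used here.
    """
    known = {t.lower() for t in preferred_terms} | {s.lower() for s in synonyms}
    return sorted(
        dataset_id
        for dataset_id, keywords in dataset_keywords.items()
        if not any(k.strip().lower() in known for k in keywords)
    )


# Illustrative data only.
preferred = {"biomass", "precipitation"}
synonym_map = {"rainfall": "precipitation"}
catalog = {
    "dataset-1": ["Biomass", "plot 7"],
    "dataset-2": ["rainfall"],
    "dataset-3": ["my transect", "2004 survey"],
    "dataset-4": [],
}
print(find_unfindable(catalog, preferred, synonym_map))
```

Datasets with no keywords at all are reported too, since they are equally invisible to a keyword search.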

16  Ecotrends – predated LTER list  Would have been good to have LTER list  Eventually would like to integrate  May be able to exploit synonym rings  When title and dataset don’t match (title says “Productivity” but attribute is “biomass”), need to examine holistically  Linking terms to definitions is needed  A taxonomic database would also be useful for “bugs” (true bugs vs insects)

17  Practices in design  When developing tools, always think about how they are tied to organizational routines  Think proactively about how to make it routine – getting people to think in categories  Pursue polytaxonomies based on Barbara’s list  Develop the synonym list further  See how keyword lists match  AND has a 3-level hierarchy  Start at top or bottom in adding….
