Spire News. Joel Sachs. Spire: Semantic Prototypes in Ecoinformatics. UMBC Ebiquity, UMD MIND SWAP, NASA GSFC.

1 Spire News Joel Sachs jsachs@cs.umbc.edu

2 Spire: Semantic Prototypes in Ecoinformatics. Diagram labels, partners: UMBC Ebiquity; UMD MIND SWAP; NASA GSFC; RMBL PEaCE; UC Davis ICE; NBII. Diagram labels, activities: Semantic Web Tools; Agents; Information Retrieval; Invasive Species Forecasting System; Remote Sensing Data; Food Webs; Semantic CAIN; Ontology Development; Dissemination; Prototype applications; Infrastructure; Ontology of Ecological Interaction.

3

4

5 Overview of Talk What (and why) is the semantic web? –History –The tragic legacy of ontologies –Hope for the future Some Spire achievements –Elvis, Ethan, Swoogle, Tripleshop, RDF123 Semantic Eco-blogging –Spotter, Splickr, Fieldmarking –Bioblitzes Linked Data –Why? How? A tiny data browsing demo

6 Semantic Web? The Semantic Web arose out of a confluence of 3 communities. –Hypertext; AI; Electronic publishing The AI component achieved early dominance. –Knowledge representation; Ontologies; First order logic, etc. This was exciting for some, and confounding for others.

7 The next 3 slides are from “The Suggested Upper Merged Ontology (SUMO) at Age 7: Progress and Promise”, by Adam Pease

8 High Level Distinctions The first fundamental distinction is that between ‘Physical’ (things which have a position in space/time) and ‘Abstract’ (things which don’t). Diagram: Entity divides into Physical and Abstract.

9 High Level Distinctions Partition of ‘Physical’ into ‘Objects’ and ‘Processes’. Diagram: Physical divides into Object and Process.

10 Processes DualObjectProcess Substituting Transaction Comparing Attaching Detaching Combining Separating InternalChange BiologicalProcess QuantityChange Damaging ChemicalProcess SurfaceChange Creation StateChange ShapeChange IntentionalProcess IntentionalPsychologicalProcess RecreationOrExercise OrganizationalProcess Guiding Keeping Maintaining Repairing Poking ContentDevelopment Making Searching SocialInteraction Maneuver Motion BodyMotion DirectionChange Transfer Transportation Radiating

11 Interoperability through Simplicity

12 Spire So far: Ontologies “The Big Experiment” –A collection of linked ontologies enabling highly detailed descriptions of ecological interaction. –Supports WoW - Webs on the Web SpireEcoConcepts –Medium size. Used for expressing trophic links and related information, including bibliographic info on studies. ETHAN –Evolutionary trees and natural history. –Huge. Observation ontology –For semantic eco-blogging. –Tiny. Invasives ontology –Lightweight and extensible in the most trivial of manners.

13 ETHAN Engineering The semantics behind an arbitrary relation can often be expressed using the rdfs:subClassOf relation, as opposed to an rdf:Property. Doing so has a number of benefits: It seems to be more computationally efficient. (We have no hard evidence for this, yet.) It makes it easy to introduce a new concept, especially in a distributed manner. (See our discussion of conservation information below.) It leads to fewer disagreements among scientists and, therefore, a greater chance of ontology adoption. (We have anecdotal evidence for this.)
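
A minimal sketch of the subclass-based modelling described on this slide, using plain Python tuples as triples. The `ex:` names (including `ex:EndangeredTaxon`) are hypothetical and are not taken from ETHAN; the sketch only shows that a third party can attach a new concept by adding one rdfs:subClassOf triple, and that retrieval then reduces to walking the subclass hierarchy.

```python
RDFS_SUBCLASS = "rdfs:subClassOf"

# Hypothetical triples; none of these ex: names come from ETHAN itself.
TRIPLES = {
    ("ex:Panthera_tigris", RDFS_SUBCLASS, "ex:Panthera"),
    ("ex:Panthera_leo", RDFS_SUBCLASS, "ex:Panthera"),
    ("ex:Panthera", RDFS_SUBCLASS, "ex:Felidae"),
    # A separate publisher adds conservation information with one more triple.
    ("ex:Panthera_tigris", RDFS_SUBCLASS, "ex:EndangeredTaxon"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == RDFS_SUBCLASS and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


def subclasses_of(target, triples):
    """Return, sorted, every class that has target among its superclasses."""
    subjects = {s for s, p, _ in triples if p == RDFS_SUBCLASS}
    return sorted(s for s in subjects if target in superclasses(s, triples))


print(subclasses_of("ex:EndangeredTaxon", TRIPLES))
print(subclasses_of("ex:Felidae", TRIPLES))
```

No new property had to be agreed on to add the conservation concept, which is the distributed-extension benefit the slide points to.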

14 A Brief Tour of Some Relevant Ontologies http://spire.umbc.edu/ontologies/InvasivesOntology.owl http://spire.umbc.edu/ontologies/lists/ http://spire.umbc.edu/ontologies/lists/USFWSInjuriousAnimals.owl http://spire.umbc.edu/ontologies/lists/Cal-IPC.owl

15 Spire So far … ELVIS –A suite of tools motivated by the belief that food web structure plays a role in determining the success or failure of potential species invasions. –Species List Constructor. Give a location, get a species list. –Food Web Constructor. Give a species list, get a food web. –Evidence Provider. Drill down on a predicted trophic link, and see evidence for and against the existence of that link. This illustrates our general attitude of moving away from “answer providers” to “evidence providers”.

16 ELVIS: Ecosystem Localization, Visualization, and Information System. Diagram labels: Species list constructor; Food web constructor; Oreochromis niloticus (Nile tilapia); Bacteria; Microprotozoa; Amphithoe longimana; Caprella penantis; Cymadusa compta; Lembos rectangularis; Batea catharinensis; Ostracoda; Melanitta; Tadorna tadorna.

17 Food Web Constructor Predict food web links using database and taxonomic reasoning. In a new estuary, Nile tilapia could compete with ostracods (green) to eat algae. Predators (red) and prey (blue) of ostracods may be affected.
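
A toy sketch of link prediction by "database and taxonomic reasoning". This is not the ELVIS algorithm, which the slides do not spell out; it assumes one simple rule (a link is proposed when a sibling taxon has a recorded link) and uses illustrative data rather than the Spire database. Each prediction carries the records that support it, in the spirit of an evidence provider.

```python
# Illustrative taxonomy (child -> parent) and trophic-link records.
# Toy data for the sketch only; not drawn from the Spire database.
PARENT = {
    "Oreochromis niloticus": "Oreochromis",
    "Oreochromis aureus": "Oreochromis",
}
KNOWN_LINKS = {("Oreochromis aureus", "algae")}  # (consumer, resource)


def related(a, b):
    """True if a and b are the same taxon or siblings under one parent."""
    return a == b or (a in PARENT and PARENT.get(a) == PARENT.get(b))


def predict_links(species_list):
    """Map each predicted (consumer, resource) link to its supporting records."""
    predictions = {}
    for consumer in species_list:
        for resource in species_list:
            if consumer == resource:
                continue
            evidence = [
                (c, r)
                for (c, r) in sorted(KNOWN_LINKS)
                if related(consumer, c) and related(resource, r)
            ]
            if evidence:
                predictions[(consumer, resource)] = evidence
    return predictions


for link, evidence in predict_links(
    ["Oreochromis niloticus", "algae", "Ostracoda"]
).items():
    print(link, "supported by", evidence)
```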

18 Food Web Constructor generates possible links

19 Evidence provider gives details

20 So far: Integration Swoogle –Google for the semantic web. –Crawls and indexes RDF documents. –Computes metadata, including “ontoRank”. Tripleshop –A SPARQL query engine. Leave out the FROM clause; the data comes from Swoogle. –Semi-automatic dataset constructor. –Our main platform for integration.

21 Google has made us smarter

22 But what about our agents? Agents still have a very minimal understanding of text and images. Diagram labels: tell; register.

23

24 By default, ontologies are ordered by their ‘popularity’, but they can also be ordered by recency or size. 80 ontologies were found that had these three terms. Let’s look at this one.

25 Basic Metadata
hasDateDiscovered: 2005-01-17
hasDatePing: 2006-03-21
hasPingState: PingModified
type: SemanticWebDocument
isEmbedded: false
hasGrammar: RDFXML
hasParseState: ParseSuccess
hasDateLastmodified: 2005-04-29
hasDateCache: 2006-03-21
hasEncoding: ISO-8859-1
hasLength: 18K
hasCntTriple: 311.00
hasOntoRatio: 0.98
hasCntSwt: 94.00
hasCntSwtDef: 72.00
hasCntInstance: 8.00

26

27

28 These are the namespaces this ontology uses. Clicking on one shows all of the documents using the namespace. All of this is available in RDF form for the agents among us.

29 Here’s what the agent sees. Note the swoogle and wob (web of belief) ontologies.

30 10K terms associated with “person”! Ordered by use. Let’s look at foaf:Person’s metadata.

31 UMBC Triple Shop http://sparql.cs.umbc.edu/tripleshop2 Online SPARQL RDF query processing based on HP’s Jena and Joseki, with several interesting features: Selectable level of inference over the model. Automatically finds SWDs for given queries using the Swoogle backend database. –Provides a dataset creation wizard –Dataset can be stored on our server or downloaded –Tag, share and search over saved datasets

32 Who knows Anupam Joshi? Show me their names, email addresses and pictures.

33 The UMBC ebiquity site publishes lots of RDF data, including FOAF profiles

34 No FROM clause! Constraints on where the data comes from
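
A hedged sketch of the idea behind a query with no FROM clause: the user writes only the pattern, and a dataset-construction step later attaches one FROM line per document found. The SPARQL text is a reconstruction of the "who knows Anupam Joshi" question using standard FOAF terms, not the query shown on the slide, and `add_from_clauses` and the example.org URLs are hypothetical, not part of Tripleshop.

```python
import re

# Reconstructed query; the query on the slide itself is not in the transcript.
QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox ?pic
WHERE {
  ?joshi foaf:name "Anupam Joshi" .
  ?person foaf:knows ?joshi ;
          foaf:name ?name ;
          foaf:mbox ?mbox ;
          foaf:depiction ?pic .
}"""


def add_from_clauses(query, document_urls):
    """Insert one 'FROM <url>' line per document just before the WHERE keyword."""
    from_block = "".join(f"FROM <{url}>\n" for url in document_urls)
    # Only the first WHERE that starts a line is touched.
    return re.sub(
        r"^WHERE\b",
        lambda match: from_block + match.group(0),
        query,
        count=1,
        flags=re.MULTILINE,
    )


# Hypothetical documents standing in for what a search backend might return.
print(add_from_clauses(QUERY, ["http://example.org/a.rdf", "http://example.org/b.rdf"]))
```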

35

36 Swoogle found 292 RDF data files that appear relevant to answering our query

37 Let’s save the dataset before we use it

38 And tag it so we and others can find it more easily.

39 He has many friends!

40 Semantic Eco-Blogging: Some Background 1/3 of all new web content is user generated. Scientific data is increasingly a part of Web 2.0/3.0. How easy can we make semantic annotation? Climate change drives ecological change. –Alters species distribution: Wuethrich, B. 2000. How Climate Change Alters Rhythms of the Wild. Science 287 (5454): 793. –Drives evolution: Bradshaw, W. E., and Holzapfel, C. M. 2001. Genetic shift in photoperiodic response correlated with global warming. Proc. Nat. Acad. Sci. USA 98: 14509-14511.

41 Semantic Eco-blogging. Eco-blogs are popping up all over the place. –Bloggers are both amateur nature-lovers and working biologists. “On April 24 in Washington DC, I saw a leopard slug. Here’s a picture.” These observations are, potentially, an important part of the ecological record. –“What was the earliest sighting of a robin hatching?” –“What was the northernmost sighting of the Asian Longhorn Beetle?” –Etc. System concept: global human sensor net. SPOTTER –A Firefox plugin for creating OWL from field observations. –Spotter map lets you see all “spots”. –Being tested at http://ebiquity.umbc.edu/fieldmarking/ and other blogs near you.

42

43

44

45

46 You can download Spotter at http://spire.umbc.edu/spotter. Try it out, and then view your observations on the Spotter map: http://spire.umbc.edu/spotter/spotterMap.php

47

48

49

50

51 The Blogger Bioblitz Bioblitz: a 24-hour inventory of all living things in a given area. –Dual aims of establishing the degree of biodiversity and popularizing science. The recent Blogger Bioblitz: –17 bloggers from Sitka, Alaska; Greece; Toronto; Santa Cruz; DC; etc. –1200 observations. Tripleshop was able, by combining the observations with background data, to respond to a number of ad-hoc queries. –E.g. “Show all observations of species listed as being either invasive or injurious.” resulted in 47 hits.
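
The ad-hoc query above amounts to joining observations against background species lists. The sketch below is a plain-Python stand-in for that join, not the SPARQL that Tripleshop ran; the record fields, list names, and `Taxon_*` placeholders are all hypothetical.

```python
# Hypothetical observation records and background lists (placeholders only).
OBSERVATIONS = [
    {"observer": "blogger1", "taxon": "Taxon_A", "place": "Toronto"},
    {"observer": "blogger2", "taxon": "Taxon_B", "place": "Santa Cruz"},
    {"observer": "blogger3", "taxon": "Taxon_C", "place": "DC"},
]
BACKGROUND_LISTS = {
    "invasive": {"Taxon_A"},
    "injurious": {"Taxon_C"},
}


def flagged_observations(observations, background_lists):
    """Return observations whose taxon appears on at least one background list."""
    flagged_taxa = set().union(*background_lists.values())
    return [ob for ob in observations if ob["taxon"] in flagged_taxa]


for ob in flagged_observations(OBSERVATIONS, BACKGROUND_LISTS):
    print(ob["taxon"], "seen in", ob["place"])
```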

52

53

54

55

56 Splickr Flickr has been handling geotagged pictures since August 2006. Roughly 30 million geotagged photos in the first year. –2.1 million so far this month. Splickr is a Flickr/Yahoo maps mashup that makes it easy to find pictures of particular species in a given area. –All data gets represented in OWL.

57

58

59

60 RDF123 A flexible and graphical means to map from spreadsheets to RDF The mapping is stored as an OWL file An RDF123 webservice takes a Google spreadsheet and a map as input, outputs RDF. So you can do all your work, collaboratively, in the spreadsheet, and you never have to export to RDF!

61 Taxonomy for biologists is a little bit tricky. Columns A-F (Phylum, Class, Order, Family, Genus, Species) have a rule: i. If there is a value for Column F (Species), then the values of Columns E (Genus) and F should be joined with an underscore, and mapped to ob#hasTaxon. ii. If there is no value for Column F, then the rightmost column, amongst columns A-E, that has a value gets mapped to ob#hasTaxon.
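
The two-part rule can be sketched in a few lines of Python. This is an illustration of the rule as stated on the slide, not RDF123's own map syntax; `taxon_for_row` and the dictionary-per-row representation are assumptions made for the example.

```python
# Columns A-F of the spreadsheet, in order.
RANK_COLUMNS = ["Phylum", "Class", "Order", "Family", "Genus", "Species"]


def taxon_for_row(row):
    """Return the value to map to ob#hasTaxon for one row, or None if empty."""
    values = [(row.get(column) or "").strip() for column in RANK_COLUMNS]
    genus, species = values[4], values[5]
    if species:
        # Rule i: Species is present, so join Genus and Species with "_".
        # The slide does not say what to do if Genus is missing here.
        return f"{genus}_{species}"
    # Rule ii: otherwise take the rightmost non-empty value among columns A-E.
    for value in reversed(values[:5]):
        if value:
            return value
    return None


print(taxon_for_row({"Genus": "Limax", "Species": "maximus"}))
print(taxon_for_row({"Phylum": "Mollusca", "Class": "Gastropoda"}))
```

The first call exercises rule i and the second exercises rule ii, where the most specific rank recorded is the Class.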

62 Eco-Blogging: Next steps Make every bioblitz a blogger bioblitz –Use RDF123 –Rock Creek, MD and LA county coming up Drop-down invasives lists in Splickr –E.g. find all photos in Europe of species on the “Worst Invaders of Europe” list Mining other sources –E.g. birdwatcher listservs Making semantic eco-blogging easier –We will continue to work with children. Aggressively pursue a Linked Data approach.

63 A Few Words on Linked Data “Linked Data on the Web” is a collection of best practices for publishing data on the semantic web. –Distinguishing between information and non-information resources. –303 redirects and content negotiation. –HTTP URIs for everything on Earth. –owl:sameAs It is also, to an extent, a rebranding of the semantic web. –Much more emphasis on links amongst datasets. –Much less emphasis on formal semantics. Linked data can be browsed, in much the same way we browse the traditional web. –So we can find data either by searching for it (with Swoogle/Tripleshop) or by surfing our way to it.
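
A minimal sketch of the 303-plus-content-negotiation pattern named above, with no network access. It assumes a hypothetical URI layout in which `/id/X` names the thing itself (a non-information resource), `/data/X` is an RDF document about it, and `/page/X` is an HTML page about it; the Accept-header check is deliberately simplistic and ignores q-values.

```python
def negotiate(thing_uri, accept_header):
    """Return (status, location) for a GET on a non-information resource URI.

    Assumed layout (hypothetical): /id/ for things, /data/ for RDF, /page/ for HTML.
    """
    if "/id/" not in thing_uri:
        return (404, None)
    wants_rdf = "application/rdf+xml" in accept_header
    target = thing_uri.replace("/id/", "/data/" if wants_rdf else "/page/", 1)
    # 303 See Other: the thing cannot be returned, but a document about it can.
    return (303, target)


print(negotiate("http://example.org/id/Limax_maximus", "application/rdf+xml"))
print(negotiate("http://example.org/id/Limax_maximus", "text/html"))
```

An agent asking for RDF and a person using a browser start from the same URI and are sent to different documents, which is what lets the same data be both queried and browsed.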

64 Some Context Before search engines, we found things on the web by browsing. Browsing still has its charms. –And benefits. On the semantic web: –One way to build a dataset: Swoogle/Tripleshop –Another: data browsing … A “thing-centric” approach.

65 Other Thoughts and Deeds Web 2.0/3.0 is designed for accommodating a multiplicity of perspectives and worldviews. –Neutrality not required Spotter as a general purpose annotation tool? Experiment in integrating water quality and invasive species occurrence data. –EPA, USGS, GBIF, EEA(?) –SODA Pacific Rim data New ELVIS: Extinction patterns in Sierra Nevada lakes. –Invasive trout are causing local extinctions. –We can compare with model predictions made by our PEaCE lab partners.

66 GBIF Scenarios Check out the 3 climate change scenarios (land use, health, and agriculture) from the presentation by Hannu Saarenmaa and Jeremy Kerr at http://circa.gbif.net/Public/irc/gbif/ict/library?l=/presentations/gbif_scenarios_ppt/_EN_6.0_&a=d

67 8-Step Scenario Development Process i. Decide on selected species. ii. Set criteria for data. (Spans 30 years, georeferenced, etc.) iii. Investigate data availability. (GBIF, GAP, etc.) iv. Improve quality and access to data. v. Choose modeling approach. (E.g. Ecological Niche Modeling with Open Modeller Framework.) vi. Acquire and transform climate change and environment data. vii. Execute models. viii. Present the results. Could we build a toolkit to ease the “data” steps, i.e. steps ii, iii, iv, and vi?

68 Acknowledgements Cynthia Parr, Andriy Parafiynyk, Lushan Han, Rong Pan, Li Ding, David Wang, Tim Finin. NSF, NBII.

69 Some References For a walk-through of Spotter, Tripleshop, Elvis, or our other tools, email jsachs@cs.umbc.edu Two relevant papers from our research group: Adding Semantics to Social Websites for Citizen Science, http://ebiquity.umbc.edu/paper/html/id/365/Adding-Semantics-to-Social-Websites-for-Citizen-Science Using the Semantic Web to Support Ecoinformatics, http://ebiquity.umbc.edu/paper/html/id/319/Using-the-Semantic-Web-to-Support-Ecoinformatics An introduction to linked data: How to Publish Linked Data on the Web, http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

