Linked Open Data for ACademia An Introduction to LOD for Biodiversity Hideaki Takeda / ORCID:0000-0002-2909-7163 Professor, National Institute.

1 Linked Open Data for ACademia
An Introduction to LOD for Biodiversity
Hideaki Takeda / ORCID: 0000-0002-2909-7163
Professor, National Institute of Informatics
Collaborators: Utsugi Jinbo, Akihiro Kameda, Fumihiro Kato, Ikki Ohmukai
PNC 2013 Annual Conference, 11 December, 2013

2 Web of Documents

3 Web of Data
(Figure callouts:)
– Other data related to the observation
– Data identical to this
– What's the meaning of the data?
Inter-connection between data in different data sources is enabled

4 Linked Data Principles
The four rules for Linked Data
– Use URIs as names for things.
  Give a URI to every object in the world!
– Use HTTP URIs so that people can look up those names.
  Don't use URNs.
– When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
  Provide machine-readable data for each URI.
– Include links to other URIs, so that they can discover more things.
  Link data together, just like the Web.
Linked Data, TBL,
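The second and third rules can be illustrated with a small sketch of "looking up" a URI and asking for machine-readable data through HTTP content negotiation. The URI below is hypothetical, and whether a given server answers with Turtle depends on that server; this is only one plausible way to make the request, using the Python standard library.

```python
from urllib.request import Request, urlopen


def build_rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare an HTTP GET for `uri` that asks for RDF instead of HTML."""
    return Request(uri, headers={"Accept": media_type})


def fetch_rdf(uri: str) -> str:
    """Dereference the URI. Needs network access and a server that honours Accept."""
    with urlopen(build_rdf_request(uri), timeout=10) as response:
        return response.read().decode("utf-8")


# Hypothetical URI, used only to show the request that would be sent.
request = build_rdf_request("http://example.org/id/person07113")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

`fetch_rdf` is defined but not called here, so the sketch runs without network access.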

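5 How to express data in Linked Data
Use RDF (+RDFS, OWL)
– Very simple: each statement is a triple of subject, predicate, and object.
~takeda#me rdf:type foaf:Person.
~takeda#me foaf:name "Hideaki Takeda".
~takeda#me foaf:gender "male".
~takeda#me foaf:knows /id/person07113.
(Figure: the same triples drawn as a graph, with ~takeda#me connected by rdf:type to foaf:Person, by foaf:name to "Hideaki Takeda", by foaf:gender to "male", and by foaf:knows to /id/person07113. The full URIs are truncated in the transcript.)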
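As a rough illustration of how little structure RDF needs, the triples on this slide can be held as plain Python tuples and written out in N-Triples syntax. The `http://example.org/...` URIs are placeholders (the real ones are truncated in the transcript), and `Literal`, `term`, and `to_ntriples` are helper names invented for this sketch, not part of any library.

```python
from dataclasses import dataclass

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"


@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a URI."""
    value: str


def term(node) -> str:
    """Render a URI as <...> and a literal as a quoted, escaped string."""
    if isinstance(node, Literal):
        escaped = node.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{node}>"


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


# Placeholder URIs: the originals are truncated in the transcript.
ME = "http://example.org/~takeda#me"
FRIEND = "http://example.org/id/person07113"

triples = [
    (ME, RDF_TYPE, FOAF + "Person"),
    (ME, FOAF + "name", Literal("Hideaki Takeda")),
    (ME, FOAF + "gender", Literal("male")),
    (ME, FOAF + "knows", FRIEND),
]
print(to_ntriples(triples))
```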

6 Describing data in Linked Data
(Figure: RDF graph. ~takeda#me has rdf:type foaf:Person, foaf:name "Hideaki Takeda", foaf:gender "male", and foaf:knows id/person. That resource is connected by owl:sameAs to a DBpedia resource with dbpprop:name "Sir Tim Berners-Lee", dbpprop:birthPlace "London, England", dbpprop:birthDate, and dbpprop:occupation dbpedia:Computer_scientist.)
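The effect of an owl:sameAs link can be sketched as follows: once two identifiers are declared to name the same thing, statements made about either can be read together. The identifiers `ex:person07113`, `ex:takeda`, and `dbpedia:Tim_Berners-Lee`, and the local `foaf:name` statement, are assumptions made for illustration; the function is a minimal union-find grouping, not how any particular triple store implements it.

```python
from collections import defaultdict


def merge_same_as(triples):
    """Group property values under one canonical subject per owl:sameAs cluster."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for s, p, o in triples:
        if p == "owl:sameAs":
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Pick the alphabetically smaller name as the canonical one.
                parent[max(root_s, root_o)] = min(root_s, root_o)

    merged = defaultdict(dict)
    for s, p, o in triples:
        if p != "owl:sameAs":
            # Only subjects are canonicalized; objects are kept as written.
            merged[find(s)].setdefault(p, []).append(o)
    return dict(merged)


# Illustrative data: prefixed names stand in for full URIs.
triples = [
    ("ex:takeda", "foaf:knows", "ex:person07113"),
    ("ex:person07113", "foaf:name", "Tim Berners-Lee"),
    ("ex:person07113", "owl:sameAs", "dbpedia:Tim_Berners-Lee"),
    ("dbpedia:Tim_Berners-Lee", "dbpprop:name", "Sir Tim Berners-Lee"),
    ("dbpedia:Tim_Berners-Lee", "dbpprop:birthPlace", "London, England"),
]
merged = merge_same_as(triples)
print(merged["dbpedia:Tim_Berners-Lee"])
```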

7 Linking Open Data (LOD)
The project to collect published Linked Data
Major Linked Data
(Translated from the original resources)
– DBpedia (Wikipedia): 270 million triples
– Geonames: geographic names and their latitudes and longitudes, 93 million triples
– MusicBrainz: music
– WordNet: dictionary
– DBLP bibliography: bibliography for technical papers, 28 million triples
– US Census Data: 1 billion triples
(Crawling)
– FOAF (Friend Of A Friend)
(Wrapper)
– Flickr Wrapper

8 Linked Open Data for ACademia

9

10 LOD Cloud (Linking Open Data)

11 Benefits of LOD for Science
Truly de-centralized database
– No need for a central database
– Everyone can create one and join the cloud!
Truly open and sharable data and schemata
– Easy for re-use and mash-up
– Easy for cross-domain/discipline use and connection
A single format for all kinds of data
– Easy for data processing

12 Bio2RDF
Bio2RDF is an open source framework to produce and provide biological linked data that uses simple conventions on the emerging semantic web.
Bio2RDF reduces the time and effort involved in data integration so that you can get to doing science.
19 datasets; 1,010,758,291 triples
At the heart of Linked Data for the Life Sciences

13 Alison Callahan, José Cruz-Toledo, Peter Ansell, Michel Dumontier: Bio2RDF Release 2: Improved Coverage, Interoperability and Provenance of Life Science Linked Data, The Semantic Web: Semantics and Big Data, Lecture Notes in Computer Science Volume 7882, 2013, pp

14 Bio2RDF

15 LODAC Project - connecting academic data -
LODAC SPECIES: Connecting species data by name
– Sources shown: Specimen DB, Species Info. DB, Taxon Name DB, GBIF, BioSci. DB, Research DB
– No. of Names: (value missing in the transcript)
– No. of Triples: 14,532,449
LODAC Museum: LOD of data in museums
– (Figure: data from Source A and data from Source B are combined into integrated data, made up of minimum data to identify entities (Work, Museum, Creator) and raw data for entities; properties shown are dc:references, dc:creator, and crm:P55_has_current_location.)
LODAC Location: Integration of location information
CKAN Japanese: Catalog for Open Data
DBpedia Japanese
App. for query expansion

16 LODAC SPECIES: Linking Species Information with names
(Figure: Taxon Name LOD linked with Museum Specimen DB, Species Info. DB, GBIF, BioSci. DB, and Research DB)
No. of Species Names: (value missing in the transcript)
No. of Triples: 14,532,449

17 LODAC species: Motivation and goal
To help researchers search and use biodiversity data, we are going to:
– Build a data hub to connect data collected in various biological fields
– Build a pilot system focused on taxonomic information on species, based on different datasets and using Linked Data
Species names act as the key for combining heterogeneous data on biodiversity
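The idea of using the species name as the join key can be sketched in a few lines. The two record lists below are invented for illustration (only the name Hirasea profundispira appears elsewhere in this talk), and the normalization rule is an assumption; real name matching has to deal with synonyms, authorship strings, and spelling variants that this sketch ignores.

```python
def normalize(name: str) -> str:
    """Collapse whitespace and case so trivially different spellings match."""
    return " ".join(name.split()).lower()


def link_by_name(*sources):
    """Build a hub: normalized species name -> {source name: [records]}."""
    hub = {}
    for source_name, records in sources:
        for record in records:
            key = normalize(record["scientific_name"])
            hub.setdefault(key, {}).setdefault(source_name, []).append(record)
    return hub


# Hypothetical records from two independent sources.
red_list = [
    {"scientific_name": "Hirasea profundispira", "status": "(placeholder rank)"},
]
specimens = [
    {"scientific_name": "hirasea  profundispira", "institution": "Example Museum"},
    {"scientific_name": "Example species", "institution": "Example Museum"},
]

hub = link_by_name(("red_list", red_list), ("specimens", specimens))
for name, by_source in hub.items():
    print(name, "->", sorted(by_source))
```

Names found in only one source still get an entry, so the hub also shows where links are missing.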

18 Selection of resources
Species names
– Building Dictionary for Life Science (BDLS) (Database Center for Life Science)
  Taxon name list used in literature and specimen labels
– The Current Checklist of Japanese Butterflies (Inomata et al., 2011)
  Authorized by butterfly experts
– Red Data List in Japan (Ministry of Environment)
Specimens
– Bryophytes Specimen Collection (National Institute of Polar Research)
– Specimen data published by Science Museum Net (S-Net) (S-Net is conducted by the National Museum of Nature and Science / JBIF)
  Specimen data provided by participating museums

19 Resources for LODAC species/museums
(Figure: the sources BDLS, Butterflies, Bryophytes, and Red List, with the kinds of data drawn from them: taxa, source, taxon name, specimen, institution name, collected date, collection locality, red data rank.)

20 Resources linked to LODAC species
1. DBpedia
2. NCBI Taxonomy (National Center for Biotechnology Information)
3. Barcode of Life Data Systems
4. Encyclopedia of Life

21 The data model for species information
(Figure: class diagram. Labels shown for specimens: Specimen, rdf:type, species, institutionName, collectedDate, collectionLocality, crm:has_current_location, from the Bryophytes source. Labels shown for names: TaxonName with ScientificName and CommonName (rdfs:subClassOf), TaxonRank (species), and the properties hasCommonName, hasScientificName, hasSuperTaxon, hasTaxonRank, rdf:type, from the Butterfly and BDLS sources with dcterms:source and dcterms:publisher. Legend: Named Graph, owl:Class.)
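One use of the hasSuperTaxon property in this model is walking from a species up through its higher taxa. The sketch below assumes hasSuperTaxon points from a taxon to its parent taxon; the `ex:` identifiers are made up for illustration, and `ex:FamilyPlaceholder` deliberately stands in for a real family name.

```python
def lineage(taxon, super_taxon_of):
    """Follow hasSuperTaxon links upward and return the chain of taxa."""
    chain = [taxon]
    seen = {taxon}
    while chain[-1] in super_taxon_of:
        parent = super_taxon_of[chain[-1]]
        if parent in seen:
            raise ValueError(f"cycle in hasSuperTaxon links at {parent}")
        chain.append(parent)
        seen.add(parent)
    return chain


# Illustrative triples using the property name from the slide.
triples = [
    ("ex:Hirasea_profundispira", "hasTaxonRank", "species"),
    ("ex:Hirasea_profundispira", "hasSuperTaxon", "ex:Hirasea"),
    ("ex:Hirasea", "hasSuperTaxon", "ex:FamilyPlaceholder"),
]
super_taxon_of = {s: o for s, p, o in triples if p == "hasSuperTaxon"}

print(lineage("ex:Hirasea_profundispira", super_taxon_of))
```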

22 Vocabulary for LODAC species
Two vocabularies were created (name, conservation status)
Currently, Darwin Core and related standards are not used
– Standardization of the TDWG standards in RDF is under discussion (TDWG VoMaG)
– Some required properties are not defined

23 Results: web search

24 Search engine for names

25 Results: web interface
Red list status

26 Results: in RDF
Hirasea profundispira

27 SPARQL Endpoint available
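A SPARQL endpoint is typically queried by sending the query text in a `query` parameter over HTTP. The sketch below only builds such a request URL. The endpoint address is a placeholder (the transcript does not give the real one), and matching on rdfs:label is an assumption, since the properties of the LODAC vocabulary are not spelled out here.

```python
from urllib.parse import urlencode

# Placeholder address: the actual LODAC endpoint is not given in the transcript.
ENDPOINT = "http://example.org/sparql"


def build_query(name: str) -> str:
    """SPARQL SELECT for resources labelled with `name` (rdfs:label is assumed)."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?taxon WHERE {\n"
        f'  ?taxon rdfs:label "{escaped}" .\n'
        "}\n"
        "LIMIT 10"
    )


def build_url(endpoint: str, query: str) -> str:
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


query = build_query("Hirasea profundispira")
url = build_url(ENDPOINT, query)
print(query)
print(url)
```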

28 An application: Query expansion for paper search
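The transcript does not describe how the application works, so the following is only a guess at the general idea: a search term is expanded with the other names linked to the same species before it is sent to a paper search engine. The synonym table is hand-written for illustration (in the real system it would presumably come from the linked name data), and the `OR` syntax is an assumed search-engine convention.

```python
def expand_query(term: str, synonyms: dict) -> str:
    """Return the term plus its known alternative names as an OR query."""
    terms = [term]
    for alternative in synonyms.get(term.lower(), []):
        if alternative not in terms:
            terms.append(alternative)
    return " OR ".join(f'"{t}"' for t in terms)


# Illustrative table: common name (lowercased) -> scientific names.
synonyms = {
    "monarch butterfly": ["Danaus plexippus"],
}

print(expand_query("monarch butterfly", synonyms))
print(expand_query("unknown name", synonyms))
```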

29 Future implementation
More comprehensive data is required
– Increase the amount of data
  More authorized species checklists
  More specimen data published by institutes
  Semi-automatic protocol for converting data
– Increase the types of data
  Species descriptions and profiles, references, etc.
– Copyright issues
  Open licenses are not popular
Harmonization between LODAC and TDWG vocabularies
Linking to more resources / systems

30 Conclusion
Data and Web
– Great potential!
Linked Data - exploit the power of the Web
– Simple structure: URI and RDF
– Truly distributed data management
– Easy to link to each other
– Suitable for distributed and cross-domain research data
LODAC Species
– Linking species data by name
– A hub to connect data across various research domains
More potential for LOD in the biodiversity domain

