
Connecting Researchers and Resources with eagle-i Tenille Johnson, MLIS ALA Midwinter 2016 / Linked Library Data Interest Group January 9, 2016 Harvard.


1 Connecting Researchers and Resources with eagle-i Tenille Johnson, MLIS ALA Midwinter 2016 / Linked Library Data Interest Group January 9, 2016 Harvard Medical School

2 How can we support biomedical research?
Question: In an environment of lower funding and greater competition, how can we support our academic researchers?
Answer: Help researchers overcome barriers to do more with less, faster and more easily.

3 … by supporting Resource Discovery
2009: funded by ARRA to develop a proof-of-concept system for sharing information about research resources:
• Cell lines, antibodies, plasmids
• Biological specimens
• Animal models
• Human studies
• Protocols
• Instruments
• Software and algorithms
• Core facilities and services

4 The Problem: Invisible Resources
Researchers that have…
1. Deep Freeze: always have it; never know where to find it
2. Toss: reduce clutter; save on space and energy; gone for good, may need it again
3. Organize: always have it; always find it; save time and money in the long run; takes time in the short run; hard to find collaborators who need the resource
Researchers that need…
1. Create: start now; control quality; time; money
2. Purchase: fast and easy; costly; may not be available
3. Borrow/Collaborate: often free; faster than remaking; hard to find collaborators that have the resource

5 The Solution: Resource Discovery
Capture → Curate → Publish → Search, connecting researchers that have… with researchers that need…

6 How does this help institutions?
• Reduce waste and redundancy. Focus grant applications, time, and effort where they will do the most good by reusing existing resources.
• Catalyze innovative collaborations. Help foster new multidisciplinary efforts within and beyond your institution’s walls.
• Promote institutional accomplishments. Catalog the wealth of resources available at the institution and publicize its unique expertise.
• Find strategic partner institutions. Identify other institutions with complementary research resources.
• Align with and support federal sharing policies. Position the institution as a leader in data and resource sharing.

7 What is eagle-i? An open source software platform, a national network, and a data model.

8 What is eagle-i? An open source software platform, a national network, and a data model.

9 Designed to follow LOD principles
Requirements
• Network for cross-institutional sharing
• Freely accessible
• Accommodate a broad range of resource types
Decisions
• Architecturally: ontology-driven; distributed network; uses Semantic Web technologies; follows Linked Open Data principles
• Functionally: open source; NO login required (freely accessible to search engines); flexible; web-based; supports rich search features
[Diagram: Subject, Predicate, Object. Image credit: http://www.w3.org/DesignIssues/LinkedData.html]
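The subject, predicate, object pattern shown in the slide's diagram can be illustrated with a few lines of Python. This is only a sketch: the `example.org` URIs are placeholders rather than real eagle-i identifiers, and the `Uri`, `Literal`, `Triple`, and `to_ntriples` names are hypothetical helpers introduced for this example.

```python
# A minimal sketch of the subject-predicate-object model.
# All example.org URIs are hypothetical placeholders, not eagle-i identifiers.
from typing import NamedTuple, Union


class Uri(NamedTuple):
    value: str


class Literal(NamedTuple):
    value: str


class Triple(NamedTuple):
    subject: Uri
    predicate: Uri
    obj: Union[Uri, Literal]


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    def term(t):
        if isinstance(t, Uri):
            return f"<{t.value}>"
        # Escape backslashes and double quotes inside plain literals.
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


antibody = Uri("http://example.org/resource/antibody-1")
triples = [
    Triple(antibody,
           Uri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
           Uri("http://example.org/ontology/Antibody")),
    Triple(antibody,
           Uri("http://www.w3.org/2000/01/rdf-schema#label"),
           Literal("Example antibody")),
]

for t in triples:
    print(to_ntriples(t))
```

Each printed line is one statement about the resource, which is what allows descriptions from different sites to be merged without a shared table schema.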

10 Architecture of the eagle-i platform

11 eagle-i applications
• SWEET Data Entry Tool: Institutions can assign different permission levels to local users to control who is allowed to edit records in different workflow states.
• Central Search: Browse all sites and resources in the entire Network or apply filters to narrow your results. Search allows for anonymous contact to protect the resource owner’s privacy.
• Public SPARQL/iPS Cell Search: Data at all sites is available via public SPARQL endpoints. The iPS cell search tool executes a federated SPARQL query across the network to leverage the richness of eagle-i’s linked data with a simple, user-friendly UI.
• Ontology Browser: Users can explore the eagle-i ontology hierarchies and see class and property definitions.
• Get the code: eagle-i’s open source software is available at open.med.harvard.edu.
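A federated query of the kind described for the iPS cell search tool can be approximated with SPARQL 1.1 `SERVICE` clauses. The slides do not show the actual query, so the sketch below is an assumption: the endpoint URLs and the class URI are placeholders, and `build_federated_query` is a hypothetical helper that only assembles the query text.

```python
# Sketch: build a SPARQL 1.1 federated query with one SERVICE block per site.
# Endpoint URLs and the class URI are hypothetical placeholders.

def build_federated_query(endpoints, class_uri, limit=50):
    """Return a query asking each endpoint for labeled instances of class_uri."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    blocks = []
    for url in endpoints:
        blocks.append(
            "  {\n"
            f"    SERVICE SILENT <{url}> {{\n"
            f"      ?resource a <{class_uri}> ;\n"
            "                rdfs:label ?label .\n"
            "    }\n"
            # Record which site each result came from.
            f"    BIND(<{url}> AS ?source)\n"
            "  }"
        )
    body = "\n  UNION\n".join(blocks)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?source ?resource ?label WHERE {\n"
        f"{body}\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


query = build_federated_query(
    ["http://site-a.example.org/sparql", "http://site-b.example.org/sparql"],
    "http://example.org/ontology/InducedPluripotentStemCellLine",
)
print(query)
```

`SERVICE SILENT` is used here so that one unreachable site does not fail the whole query; whether the real tool makes the same choice is not stated in the slides.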

12 Creating a culture of attribution for sharing

13 [Diagram; labels in order of appearance: primary cell culture, Family, human subject, Genetic alteration, Diagnosis, History, iPSC, Protocol, Potency, Karyotype, data, Assay, Data, Algorithm, knowledge]

14 Searching for that article I read…
• ipsc’s; ips cells; induced pluripotent stem cells; stem cells
• parkinson; parkinson’s; PD; idiopathic parkinsonism; hypokinetic rigid syndrome; HRS
• GBA(N370S); GBA-PD; glucocerebrosidase mutations
• NYSCF; new york stem cell foundation
• Twins; monozygotic; discordant phenotypes
Re-finding that article after two weeks

15 What is eagle-i? An open source software platform, a national network, and a data model.

16 eagle-i software is ontology driven
eagle-i Research Resource Ontology (ERO): Developed by the eagle-i Ontology team at Oregon Health and Science University, led by Melissa Haendel.
Why an ontology?
• Terms are defined
• Terms are arranged in a hierarchy
• Relationships between the terms are defined
• Expressed in a language that is consumable by computers
• Data can easily be published as Linked Open Data
ERO at a glance
• Models biomedical research resources.
• Drives the behavior of the eagle-i applications.
• Leverages existing ontologies wherever possible.
• 4,061 classes in the ero.owl core file, including 1,979 created de novo.
• 133 object properties.
• 60 data properties.
• 30,700 classes including import files.
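One practical effect of arranging terms in a hierarchy is that a search for a broad class can also match resources typed with narrower classes. The sketch below illustrates that idea only; the class names are invented for the example and are not taken from ero.owl, and each class has a single parent for simplicity.

```python
# Sketch: a term hierarchy lets a search for a broad class also match
# resources typed with narrower classes.
# Class names are illustrative only and are not taken from ero.owl.

SUBCLASS_OF = {
    "Antibody": "Reagent",
    "Plasmid": "Reagent",
    "Reagent": "Resource",
    "Instrument": "Resource",
    "Microscope": "Instrument",
}


def ancestors(cls):
    """Yield cls itself and every superclass above it."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def search_by_type(records, wanted):
    """Return names of records whose type is `wanted` or one of its subclasses."""
    return [name for name, cls in records if wanted in ancestors(cls)]


records = [
    ("anti-GFP", "Antibody"),
    ("pUC19", "Plasmid"),
    ("confocal #2", "Microscope"),
]

print(search_by_type(records, "Reagent"))     # ['anti-GFP', 'pUC19']
print(search_by_type(records, "Instrument"))  # ['confocal #2']
```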

17 Common ontologies promote reusability
Antibody Registry; Basic Formal Ontology (BFO); Berkeley Bioinformatics Open-source Projects; Biomedical Resource Ontology; Cell Line Ontology (CLO); Cell Ontology Consortium; Disease Ontology; Gene Ontology; Information Artifact Ontology; LAMHDI; MP (Mammalian Phenotype); NCBO; Neuroscience Information Framework; OBO Foundry; Ontology for Biomedical Investigations; Ontology for Clinical Research (OCRe); PATO (Phenotypic Quality Ontology); Phenoscape; Phenotype Research Coordination Network; Reagent Ontology; Sequence Ontology; Software Ontology; Uberon; VIVO

18 Using a layered ontology approach
eagle-i applications provide user-friendly, ontologically correct interaction with the underlying RDF data.
Built using a layered ontology approach
• Ontology layers: the goal is to decouple resource representation from information used for application appearance and behavior
• Application ontologies annotate domain ontologies with application-specific information and restrictions, but are not included in public ontology releases.
Modeling dichotomy
• ERO models the world of biomedical resources
• Software needs a model from which to derive behavior
Complexity
• ERO is interoperable; it builds on an upper ontology and imports numerous terms
• Not all ontology constructs translate into user-level constructs
The ERO is now fully integrated with the VIVO-ISF ontology supported by the Open Research Information Framework (OpenRIF)! What does this mean for end users? NOTHING.
On the backend: eagle-i software consumes two sets of OWL files:
• application files (housed in the Harvard SVN)
• eagle-i modules generated from the VIVO-ISF
  • contain select ISF annotations only
  • content pulled from the ISF trunk via scripts that build the module configuration files
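The layering described on this slide, with domain facts kept apart from application annotations and only the former released publicly, can be sketched as two dictionaries that are merged at run time. Everything in this example is hypothetical: the class names, the annotation keys, and the function names are stand-ins, and the real system works with OWL files rather than Python dictionaries.

```python
# Sketch of the layered approach: domain facts and application annotations
# live in separate layers, and only the domain layer is published.
# All class names and annotation keys here are hypothetical.

DOMAIN_LAYER = {
    "Antibody": {"label": "antibody", "parent": "Reagent"},
    "Reagent": {"label": "reagent", "parent": None},
}

APPLICATION_LAYER = {
    "Antibody": {"show_in_data_entry": True, "display_order": 1},
}


def application_view(cls):
    """Merge both layers for use inside the application."""
    merged = dict(DOMAIN_LAYER[cls])
    merged.update(APPLICATION_LAYER.get(cls, {}))
    return merged


def public_release():
    """Return only the domain layer, leaving application annotations out."""
    return {cls: dict(facts) for cls, facts in DOMAIN_LAYER.items()}


print(application_view("Antibody"))
print(public_release()["Antibody"])
```

Keeping the layers separate means a change to how a class is displayed never alters the published description of that class.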

19 What does it mean to be ontology-driven software?

20 [Screenshot callouts]
• Classes annotated with ‘primary resource type’
• ‘eagle-i preferred label’ is used for the display name.
• Property annotated as ‘primary property’
• ‘eagle-i preferred definition’ is used for tooltips.
• Technique is annotated as ‘referenced taxonomy’.
• Construct insert is an example of a resource annotated as an ‘embedded class’.
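As a rough illustration of how annotations like these could drive a data entry form, the sketch below derives field descriptions from annotated properties. The annotation names follow the slide, but the dictionary layout, the fallback to a plain label, the ordering rule, and the sample properties are all assumptions made for this example.

```python
# Sketch: deriving form fields from ontology annotations.
# Annotation names follow the slide; the dictionary layout, the fallback
# rule, and the sample properties are assumptions for illustration.

def build_field(prop):
    """Turn one annotated property into a UI field description."""
    ann = prop.get("annotations", {})
    return {
        "display_name": ann.get("eagle-i preferred label", prop["label"]),
        "tooltip": ann.get("eagle-i preferred definition", ""),
        "primary": bool(ann.get("primary property", False)),
    }


def build_form(properties):
    """Order fields so that primary properties come first (stable sort)."""
    fields = [build_field(p) for p in properties]
    return sorted(fields, key=lambda f: not f["primary"])


sample_properties = [
    {"label": "has_note", "annotations": {}},
    {
        "label": "uses_technique",
        "annotations": {
            "eagle-i preferred label": "Technique",
            "eagle-i preferred definition": "Method used with this resource.",
            "primary property": True,
        },
    },
]

form = build_form(sample_properties)
for field in form:
    print(field["display_name"], "|", field["tooltip"])
```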

21 Class Groups guide application functionality
• ClassGroup_TransferableResourceType tells the search application which contact form to use
• ClassGroup_InstanceCreate identifies asserted types in the eagle-i repository
• ClassGroup_SearchFlatten identifies classes whose properties should be flattened to their referencing instances in search
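The way class-group membership might steer application behavior can be sketched as simple set lookups. The group names come from the slide, but the member classes, the form names, and the record layout below are invented for illustration and do not describe the actual eagle-i code.

```python
# Sketch: class-group membership steering application behavior.
# Group names follow the slide; member classes, form names, and the
# record layout are made up for this example.

CLASS_GROUPS = {
    "ClassGroup_TransferableResourceType": {"Antibody", "Plasmid"},
    "ClassGroup_SearchFlatten": {"ConstructInsert"},
}


def in_group(cls, group):
    return cls in CLASS_GROUPS.get(group, set())


def contact_form_for(cls):
    """Pick a contact form based on membership in the transferable group."""
    if in_group(cls, "ClassGroup_TransferableResourceType"):
        return "request-resource-form"
    return "general-inquiry-form"


def search_document(instance, linked):
    """Copy properties of referenced instances in the flatten group
    onto the referencing instance's search document."""
    doc = dict(instance["properties"])
    for ref in instance.get("references", []):
        target = linked[ref]
        if in_group(target["type"], "ClassGroup_SearchFlatten"):
            doc.update(target["properties"])
    return doc


plasmid = {"type": "Plasmid", "properties": {"name": "pExample"}, "references": ["insert-1"]}
linked = {"insert-1": {"type": "ConstructInsert", "properties": {"insert_gene": "GFP"}}}

print(contact_form_for("Plasmid"))
print(search_document(plasmid, linked))
```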

22

23 Benefits and Challenges
Benefits
• Controlled vocabulary ensures consistency
• Enables resources to be linked to other kinds of data
• Fosters system flexibility and data interoperability
• Facilitates data sharing, retrieval and validation
Challenges
• Granularity and scope of content
• Managing external coordination of ontology imports, requests, collaborations
• Determining how to best utilize ontological reasoning and expressivity
• Integration of the ontology into the user interface applications: both a feature and a challenge

24 What is eagle-i? An open source software platform, a national network, and a data model.

25 A National Network
• The eagle-i Consortium: The original 9 sites were chosen for the diversity of location, population, and resources.
• The eagle-i Network: We’ve subsequently expanded to include a variety of biorepositories and other academic institutions, including the REACH NC sub-network at the University of North Carolina.
• International Expansion: In collaboration with several partners, including Ontoforce, eagle-i is expanding into Europe, beginning with implementation at Ghent University in Belgium.

26

27 Some of Our Participating Institutions
Addgene; Charles R. Drew University of Medicine and Science; City College of New York, CUNY; Clark Atlanta University; Dartmouth College; Developmental Studies Hybridoma Bank; Duke University; East Carolina University; Florida Agricultural and Mechanical University; Fred Hutchinson Cancer Research Center; Harvard University; Howard University; Hunter College, CUNY; Institute of Translational Health Sciences; Jackson State University; Meharry Medical College; Michigan State University; Montana State University; Morehouse School of Medicine; New York Stem Cell Foundation; North Carolina A and T State University; NYU Langone School of Medicine; The Ohio State University; Oregon Health and Science University; Ponce School of Medicine and Health Sciences; Texas Southern University; Tuskegee University; University of Alaska Fairbanks; Universidad Central del Caribe; University of Hawai'i Manoa; University of North Carolina at Chapel Hill; University of North Carolina at Charlotte; University of North Carolina at Greensboro; University of Pennsylvania; University of Puerto Rico; University of Texas at El Paso; University of Texas at San Antonio; Vanderbilt University; Xavier University of Louisiana; Washington University NIMH Genetics

28 Integrating eagle-i data: RDF, XML
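The slide names only the formats, so the following is a loose sketch of what consuming RDF serialized as XML might look like on the integrating side. It uses Python's standard `xml.etree.ElementTree` and handles just a small subset of RDF/XML (`rdf:Description` elements with literal or `rdf:resource` children); the sample document and its `example.org` URIs are invented, and a real integration would more likely use a full RDF library.

```python
# Sketch: reading a small subset of RDF/XML into (subject, predicate, object)
# tuples with the standard library. The sample data is hypothetical.
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:ex="http://example.org/ontology/">
  <rdf:Description rdf:about="http://example.org/resource/instrument-1">
    <rdfs:label>Confocal microscope</rdfs:label>
    <ex:locatedIn rdf:resource="http://example.org/resource/core-lab-1"/>
  </rdf:Description>
</rdf:RDF>
"""


def parse_descriptions(xml_text):
    """Extract triples from top-level rdf:Description elements only."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for child in desc:
            # ElementTree tags look like "{namespace}local"; join them into a URI.
            predicate = child.tag[1:].replace("}", "", 1)
            obj = child.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                obj = (child.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


parsed = parse_descriptions(SAMPLE)
for triple in parsed:
    print(triple)
```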

29

30 Current work and future projects
Courses Modeling
• Pilot project to enable discovery of educational resources at Clinical and Translational Science Awards (CTSA) sites
• Will include a custom SPARQL search interface.
Induced Pluripotent Stem Cells
• Currently in the process of entering data from WiCell from a $78M NHLBI study to generate iPSC lines from well-characterized human subjects. Total = 1,541-1,561 human subjects and ~2,500 iPSC lines.
• Also anticipating lines from RUCDR, SKiP, EBiSC (StemBANCC) and others in the next year.
• Codify consent forms (NYSCF)
• Integrate resource info with research data (Synapse)

31 Getting libraries involved
Outreach, Education, Implementation
Librarians:
• Understand the needs and priorities of their local communities
• Are a familiar and trusted point of reference for the users and stakeholders at all levels of their institution
• Possess the experience working with metadata, information management skills, and domain expertise needed

32 Implementation: Engaging the Community
Getting started at an institution
1. Create a landscape of local research resources and a prioritized plan for collecting information about them.
2. Enter data on behalf of resource providers or train resource providers to use the SWEET directly.
Garbage in, garbage out
• Data curation ensures resource descriptions are consistent, accurate, and relevant to scientific research users.
• Long-term maintenance is required to prevent stale data.
Quality Assurance Tools
• A suite of maintenance queries can be used to find records in need of updates.
• A centrally based curation team guides institutional curators with the help of extensive data curation guidelines.
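The slides do not show the maintenance queries themselves. As a hedged illustration of the idea, the sketch below flags records that have not been modified within a chosen window or that lack a contact; the record fields, the one-year threshold, and the two checks are all assumptions for this example.

```python
# Sketch: a maintenance check that flags records likely to need an update.
# Record fields, the threshold, and both checks are hypothetical.
from datetime import date, timedelta


def records_needing_update(records, today, max_age_days=365):
    """Return (record id, reasons) pairs for records that fail a check."""
    cutoff = today - timedelta(days=max_age_days)
    flagged = []
    for rec in records:
        reasons = []
        if rec["last_modified"] < cutoff:
            reasons.append("not modified recently")
        if not rec.get("contact"):
            reasons.append("missing contact")
        if reasons:
            flagged.append((rec["id"], reasons))
    return flagged


sample_records = [
    {"id": "a", "last_modified": date(2014, 6, 1), "contact": "lab-a@example.org"},
    {"id": "b", "last_modified": date(2015, 12, 1), "contact": ""},
    {"id": "c", "last_modified": date(2015, 11, 15), "contact": "lab-c@example.org"},
]

flagged = records_needing_update(sample_records, today=date(2016, 1, 9))
for rec_id, reasons in flagged:
    print(rec_id, reasons)
```

Passing `today` explicitly keeps the check repeatable, which makes it easier to compare curation reports from different dates.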

33 Thank you! Open Access. Open Source. Open Network.
Contact us: tenille_johnson@hms.harvard.edu, info@eagle-i.net
Search the network: http://www.eagle-i.net/
Explore the ontology: https://search.eagle-i.net/model/
Get the code: open.med.harvard.edu/project/eagle-i/
Come in, we’re OPEN

