Weaving a Semantic Web: Using Linked Open Data from Institutional and National Sources David Eichmann School of Library and Information Science University of Iowa
Thanks to: Noshir Contractor, Northwestern University; Holly J. Falk-Krzesinski, Elsevier & Northwestern University; Melissa Haendel, Oregon Health & Science University; Michael Conlon, University of Florida
Thesis of Talk We are well into the first generation of research networking tools, but we are still conceptualizing what successive generations might look like. –As support for this, consider that the majority of participating institutions are still in the process of fully deploying their first research networking system.
Research Networking Programmatic support for discovery and use of research and scholarly information regarding people and resources. They are essentially special purpose institutional knowledge management systems.
Example RN Systems Profiles (Harvard) VIVO (VIVO Consortium, led by Florida) Loki (Iowa) SciVal Experts (commercial – Elsevier) A number of others
A Sample Profiles Page
A Sample VIVO Page
A Sample Loki Page
Some Loki Project History Originally conceived as an electronic research interest “brochure” for Research Week 2006 Scaled quickly into something more permanent –Publications –Grants –Biosketches
Some Loki Project History Auto population of PubMed citations Integration with Sponsored Programs funding database Integration with HR database Auto population of NIH Funding Opportunity Announcements
Some Loki Project History Integration with eCV project –Digital Measures’ ActivityInsight Auto population of Web of Knowledge citations Out-bound linkage to –PubMed and PubMed Central –Web of Science –Publishers (via DOIs)
Loki Demographics 2011
College or Org Unit          N
Medicine                     682
Other Units                  130
Liberal Arts and Sciences    90
Nursing                      68
Dentistry                    59
Pharmacy                     50
Public Health                43
University Hospitals         25
Engineering                  23
ICTS                         13
Current Loki Demographics
Target Demographics All faculty and interested staff The campus eCV project is serving as the catalyst for this, with PubMed/WoK and DSP data serving as a major point of work avoidance for faculty The campus institutional repository may come to play a role here as well
Why Bother with VIVO? Words in a profile are just sequences of characters carrying no meaning –Try asking Google Scholar what grant funded a given hit… With structure and relationship comes meaning, aka semantics –Enter the Semantic Web!
More… Science –Looking through the current concepts of publication and grant to the science being done Organizational Context –Aggregate concepts as well as the investigators that comprise them Labs, courses, centers, …
More… Information –We have more information available to us about our graduate students (via Facebook and Twitter) than about our (potential) colleagues and collaborators
Connecting the Dots The real challenge here is translation of information already in existence in scattered sources –Research networking tools –Citation databases (e.g., PubMed) –Award databases (e.g., NIH RePORTER) –Curated archives (e.g., GenBank) –Locked up in text (the research literature)
Swanson’s bibliographic linkage
1986: "Fish oil, Raynaud's syndrome, and undiscovered public knowledge." Perspectives in Biology and Medicine 30(1): 7-18.
1987: "Two medical literatures that are logically but not bibliographically connected." Journal of the American Society for Information Science 38(4).
1988: "Migraine and magnesium: Eleven neglected connections." Perspectives in Biology and Medicine 31(4).
Connecting to the Science
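The idea behind this kind of linkage can be sketched as a search for intermediate terms shared by two literatures that do not reference each other. The Python below is a minimal illustration only: the function name, the article sets, and the index terms are hypothetical toy data, not Swanson's corpus or procedure.

```python
def linking_terms(literature_a, literature_c):
    """Return terms that occur somewhere in both literatures.

    Each literature is a list of sets, one set of index terms per article.
    """
    terms_a = set().union(*literature_a)
    terms_c = set().union(*literature_c)
    return terms_a & terms_c


# Hypothetical toy data: each set holds the index terms of one article.
fish_oil_articles = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
]
raynaud_articles = [
    {"raynaud's syndrome", "blood viscosity"},
    {"raynaud's syndrome", "platelet aggregation", "vasoconstriction"},
]

# Terms that connect the two literatures even though no article is in both.
shared = linking_terms(fish_oil_articles, raynaud_articles)
```

A real system would work from indexed citations rather than hand-built sets, and would need to filter out very common terms before the shared set is useful.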
Linked Open Data Architecture
Linked Open Data Appeal Models –Low-hanging fruit: mapping our own data –The big payoff: level 5 LOD and deep questions “Which investigators are studying genes implicated in breast cancer?” –The inference chains that are possible: Loki – MEDLINE – RefSeq – GenBank
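An inference chain of this kind amounts to following links from one dataset to the next. The sketch below assumes three hypothetical link tables (investigator to publication, publication to gene, gene to condition); every identifier in it is invented, and it stands in for the Loki, MEDLINE, RefSeq, and GenBank hops rather than reproducing any of their actual schemas.

```python
# Hypothetical link tables; all identifiers are invented for illustration.
loki_author_of = {
    "investigator:1": ["pmid:A", "pmid:B"],
    "investigator:2": ["pmid:C"],
}
medline_mentions_gene = {
    "pmid:A": ["gene:X"],
    "pmid:B": [],
    "pmid:C": ["gene:Y"],
}
gene_implicated_in = {
    "gene:X": ["breast cancer"],
    "gene:Y": ["asthma"],
}


def investigators_for(condition):
    """Follow investigator -> publication -> gene -> condition links."""
    found = set()
    for person, papers in loki_author_of.items():
        for paper in papers:
            for gene in medline_mentions_gene.get(paper, []):
                if condition in gene_implicated_in.get(gene, []):
                    found.add(person)
    return sorted(found)


matches = investigators_for("breast cancer")
```

With the data published as linked open data, the same traversal could be expressed as a single query across sources instead of nested loops over local tables.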
Linked Open Data Appeal Independence with equivalency –Build out a Loki ontology –Low-hanging fruit: RDF triple generation using D2R –Generate ontological equivalences to VIVO, etc.
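As a rough illustration of what triple generation from relational data involves, the following Python turns a database row into N-Triples lines and states one class equivalence. It does not use D2R or its mapping language; the `example.org` namespaces, the `Faculty` and `FacultyMember` class names, and the row layout are all assumptions made for the sketch. Only the RDF and OWL property URIs are standard.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"
LOKI = "http://example.org/loki/"  # hypothetical namespace
OTHER = "http://example.org/other-ontology/"  # hypothetical stand-in for VIVO etc.


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<%s>" % value


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def row_to_triples(row):
    """Map one relational row (a dict) to a list of N-Triples lines."""
    subject = uri(LOKI + "person/%d" % row["id"])
    return [
        "%s %s %s ." % (subject, uri(RDF_TYPE), uri(LOKI + "Faculty")),
        "%s %s %s ." % (subject, uri(LOKI + "name"), literal(row["name"])),
    ]


def equivalence(loki_class, other_class):
    """State that a Loki class is equivalent to a class in another ontology."""
    return "%s %s %s ." % (
        uri(LOKI + loki_class),
        uri(OWL_EQUIVALENT_CLASS),
        uri(OTHER + other_class),
    )


rows = [{"id": 7, "name": "Ada Example"}]
triples = [t for row in rows for t in row_to_triples(row)]
triples.append(equivalence("Faculty", "FacultyMember"))
```

Keeping the equivalence statements separate from the instance triples is what allows the local ontology to evolve independently of the external one.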
Linked Open Data Appeal Addressing conceptual dissonance –The VIVO concept of investigator maps to two related, but distinct, Loki concepts Faculty Researcher –Who’s “right?” It’s a multi-ontology world – all semantics are relative
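One way to picture a one-to-two concept mapping is as a lookup from the external concept to a set of local concepts. The sketch below is illustrative only: the prefixed names (`vivo:Investigator`, `loki:Faculty`, `loki:Researcher`, `loki:Staff`) and the sample people are hypothetical labels, not terms taken from either ontology.

```python
# Hypothetical mapping: one external concept corresponds to two local ones.
VIVO_TO_LOKI = {"vivo:Investigator": {"loki:Faculty", "loki:Researcher"}}


def members(vivo_concept, loki_instances):
    """Return people who count as `vivo_concept` under the mapping.

    `loki_instances` maps a person identifier to the set of Loki concepts
    that person belongs to.
    """
    loki_concepts = VIVO_TO_LOKI.get(vivo_concept, set())
    return sorted(
        person
        for person, concepts in loki_instances.items()
        if concepts & loki_concepts
    )


loki_instances = {
    "person:1": {"loki:Faculty", "loki:Researcher"},
    "person:2": {"loki:Researcher"},
    "person:3": {"loki:Staff"},
}

investigators = members("vivo:Investigator", loki_instances)
```

The mapping is lossy in one direction: a consumer asking for investigators cannot tell from the result which of the two local concepts applied.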
Looking Beyond Research Networking
Supporting the Research Lifecycle Project conceptualization –“Framing the problem” à la Goodman –Literature review Funding opportunity identification –NIH FOA alerts Team identification/formation –RN tool stock-in-trade
Supporting the Research Lifecycle Proposal preparation –Biosketch management Research process support –If wikis are the answer, what was the question? Outcomes dissemination and curation –Institutional repositories, Dataverse, etc.
Current Approaches to Team Identification Survey the target community –Suffers from issues of scale and detection Quantitatively analyze a surrogate information source –Publication/Grant co-authorship –Temporally offset from actual collaboration –Only the ‘winners’ are detected –Serious information loss re true expertise
Some Early Data on CTSA Consortium Collaboration
Inter-institutional co-authorship pair counts
Org.      Cornell  NW  OHSU  UCSF  Fla  Iowa
Cornell
NW        0
OHSU      066
UCSF
Fla
Iowa
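The talk does not say how its pair counts were computed. One plausible definition, sketched below in Python, counts every pair of co-authors on a publication whose institutions differ. The function name, the input layout, and the sample publications are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def institution_pair_counts(publications):
    """Count cross-institution co-author pairs.

    `publications` is a list of author lists; each author is a
    (author_id, institution) tuple. Pairs from the same institution
    are ignored, and institution pairs are stored in sorted order.
    """
    counts = Counter()
    for authors in publications:
        for (_, inst_a), (_, inst_b) in combinations(authors, 2):
            if inst_a != inst_b:
                counts[tuple(sorted((inst_a, inst_b)))] += 1
    return counts


# Hypothetical publications, not data from the talk.
publications = [
    [("a1", "Iowa"), ("a2", "OHSU"), ("a3", "OHSU")],
    [("a1", "Iowa"), ("a4", "Fla")],
]

pair_counts = institution_pair_counts(publications)
```

Because the counts come from published work, they inherit the limitations listed for surrogate sources: they lag the actual collaboration and miss teams that never published together.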
Social Networking Linkages Holly comes across the new service out of LinkedIn Labs, visualization of your LinkedIn connections –http://inmaps.linkedinlabs.com Holly relates this coolness to Dave, who can’t resist poking about to see if he can scrape the data Having done so, he twists arms of selected colleagues to cough up their maps
Phase 1 Acquisition of graph structure –Nodes, edges, coordinates, cluster membership Acquisition of node characteristics –Person name, URL, public ID
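The acquired data could be held in a structure along the following lines. This is a sketch under assumptions: the class and field names are hypothetical, chosen to mirror the items listed above (nodes, edges, coordinates, cluster membership, person name, URL, public ID), and do not describe the format the scraped maps actually used.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One person in a connection map."""
    public_id: str
    name: str
    url: str
    x: float
    y: float
    cluster: int


@dataclass
class ConnectionMap:
    """Undirected graph of people, keyed by public ID."""
    nodes: dict = field(default_factory=dict)
    edges: set = field(default_factory=set)

    def add_node(self, node):
        self.nodes[node.public_id] = node

    def add_edge(self, id_a, id_b):
        # Both endpoints must already be known nodes.
        if id_a not in self.nodes or id_b not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.add(frozenset((id_a, id_b)))

    def neighbors(self, public_id):
        """Return the sorted IDs of nodes sharing an edge with `public_id`."""
        return sorted(
            other
            for edge in self.edges
            if public_id in edge
            for other in edge
            if other != public_id
        )

    def cluster_members(self, cluster):
        """Return the sorted IDs of nodes assigned to `cluster`."""
        return sorted(
            pid for pid, node in self.nodes.items() if node.cluster == cluster
        )


# Hypothetical sample data.
graph = ConnectionMap()
graph.add_node(Node("p1", "Person One", "http://example.org/p1", 0.0, 0.0, 1))
graph.add_node(Node("p2", "Person Two", "http://example.org/p2", 1.0, 0.5, 1))
graph.add_node(Node("p3", "Person Three", "http://example.org/p3", 2.0, 1.5, 2))
graph.add_edge("p1", "p2")
graph.add_edge("p1", "p3")
```

Storing edges as unordered pairs reflects that a connection is mutual; a directed relationship would need ordered tuples instead.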
Pattuelli’s Spectrum of Relationships (2012) RN Tools Linked In
Pattuelli’s Spectrum of Relationships (2012) Ontologies used –foaf (Friend of a Friend) –rel (Relationship) –mo (Music) Echoes of Trigg’s link taxonomy –Trigg, R. 1983. A Network-Based Approach to Text Handling for the Online Scientific Community. Ph.D. dissertation, Department of Computer Science, University of Maryland, technical report TR-1346
Observations N = 5 ! LinkedIn expertise endorsements are an ad hoc folksonomy –Melding this with the typically controlled vocabulary of the research networking tools should prove interesting –These characteristics don’t show up in the RN meta-data
Questions? Thanks to my co-authors and the Research Networking Affinity Group Supported in part by NIH grants 2 UL1 TR and UL1 RR024979