Presentation on theme: "Semantic Browser LSDIS (Large Scale Distributed Information Systems) Lab. Bilal Gonen M.Sc. in Computer Science University of Georgia"— Presentation transcript:
Semantic Browser LSDIS (Large Scale Distributed Information Systems) Lab. Bilal Gonen M.Sc. in Computer Science University of Georgia firstname.lastname@example.org
Semantic Browser: The Semantic Browser tool lets users traverse easily among semantically connected documents. The documents are connected by relationships such as “causes”, “adjacent to”, and “produces”.
Dataset: The PubMed dataset is used (48,252 documents). MeSH terms in these documents are annotated with the UMLS ontology. The ontology that we generated from the UMLS has 135 classes and 49 relationships at the schema level, and 21,945 entity instances at the instance level.
[Figure: fragment of the knowledge base. Node labels: Lymphocytosis, Cancers, diseases, Gonads, Nafenopin, ultraviolet rays, non-melanoma, melanoma, Skin cancer, Magnetics, Flu, blood cancer, Khellin, Hyptis. Edge label: causes.] How can we use this knowledge base to let the user traverse among the documents?
Doc-to-Doc: [Figure: ultraviolet rays, causes, melanoma.] Using the information “ultraviolet rays causes melanoma”, the user can traverse to other files that contain the term “melanoma”.
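The doc-to-doc idea can be sketched in a few lines. This is a minimal illustration in Python, not the presenters' Java implementation: the document IDs, the term annotations, and the function names `build_term_index` and `traverse` are all hypothetical, and only the triple “ultraviolet rays causes melanoma” comes from the slide.

```python
from collections import defaultdict

# Hypothetical toy data: document IDs mapped to the terms annotated in them.
DOC_TERMS = {
    "doc1": {"ultraviolet rays", "skin cancer"},
    "doc2": {"melanoma"},
    "doc3": {"melanoma", "ultraviolet rays"},
    "doc4": {"flu"},
}

# Ontology triples as (subject, relationship, object); this one is from the slide.
TRIPLES = [("ultraviolet rays", "causes", "melanoma")]


def build_term_index(doc_terms):
    """Invert the doc -> terms mapping into term -> set of docs."""
    index = defaultdict(set)
    for doc_id, terms in doc_terms.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def traverse(doc_id, doc_terms, triples, index):
    """List the hops available from doc_id.

    Each hop is (subject, relationship, object, target_docs), where the
    subject occurs in doc_id and target_docs are the other documents
    that contain the object term.
    """
    hops = []
    for subject, relationship, obj in triples:
        if subject in doc_terms[doc_id]:
            targets = sorted(index.get(obj, set()) - {doc_id})
            hops.append((subject, relationship, obj, targets))
    return hops


index = build_term_index(DOC_TERMS)
hops = traverse("doc1", DOC_TERMS, TRIPLES, index)
```

Starting from `doc1`, which mentions ultraviolet rays, the only hop follows “causes” to the documents that mention melanoma. A document with no matching subject term, such as `doc4`, yields no hops.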
Architecture. [Diagram components: user; User Interface (HTML page); JavaScript with AJAX; JSP (JavaServer Pages); Lucene index for documents; PubMed dataset; ontology; SemDis API. Arrow labels: request, response, keyword, related documents.]

PubMed dataset: contains 48,252 documents.

Ontology: contains 135 classes and 49 relationships at the schema level, and 21,945 entity instances at the instance level.

SemDis API: built in the LSDIS Lab. This API is used to process the triples in the ontology.

Lucene index: Lucene is used to index the documents by the 21,945 MeSH terms that occur in them. Because the synonyms of these MeSH terms are also used, about 104,000 terms in total are used to index the documents.

AJAX: The advantage of AJAX is that only the needed information is sent and received between the client and the server. Instead of reloading the whole page, only the response received from the server is embedded into the HTML page.

Workflow. To begin traversing, the user types any MeSH term or one of its synonyms into the user interface. Because the request is for the documents in which the requested term appears, JSP calls the Lucene methods in the Java class. The list of matching documents is returned from the Lucene index.

The user clicks on one of the file names, and the abstract of that file is embedded into the HTML page.

The user then hovers over any MeSH term in the document's abstract. The MeSH term is sent to the server as a request for its types in the ontology. Because the request is for the types of the MeSH term, JSP calls the corresponding SemDis method in the Java class, and the SemDis API gets the types of the instance term from the ontology.

By hovering the mouse over a class name returned from the server, the user makes another request to the SemDis API, this time for the relationships outgoing from that class. The process continues in this way until the user gets the list of file names to traverse to.
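The lookup steps above can be illustrated with a small in-memory sketch. It is written in Python as a stand-in for the Lucene index and the SemDis API, whose actual method names the slides do not give. The synonym table, abstracts, class names, and the functions `build_index`, `search`, `types_of`, and `outgoing_relationships` are all hypothetical. Matching is by plain substring, which is far cruder than Lucene's analysis.

```python
from collections import defaultdict

# Hypothetical synonym table: each surface form maps to a canonical MeSH term.
SYNONYMS = {
    "melanoma": "melanoma",
    "malignant melanoma": "melanoma",
    "ultraviolet rays": "ultraviolet rays",
    "uv rays": "ultraviolet rays",
}

# Hypothetical abstracts keyed by file name.
ABSTRACTS = {
    "a.txt": "Exposure to UV rays is a risk factor.",
    "b.txt": "Malignant melanoma incidence is rising.",
    "c.txt": "Melanoma and ultraviolet rays were studied.",
}

# Hypothetical ontology fragments (illustrative class names only).
INSTANCE_TYPES = {
    "ultraviolet rays": ["PhenomenonClass"],
    "melanoma": ["DiseaseClass"],
}
CLASS_RELATIONSHIPS = {
    "PhenomenonClass": ["causes"],
    "DiseaseClass": ["adjacent to"],
}


def build_index(abstracts, synonyms):
    """Index each file under the canonical term of every surface form it contains."""
    index = defaultdict(set)
    for file_name, text in abstracts.items():
        lowered = text.lower()
        for surface, canonical in synonyms.items():
            if surface in lowered:
                index[canonical].add(file_name)
    return index


def canonical_term(term):
    """Resolve a typed or hovered term to its canonical form, or None if unknown."""
    return SYNONYMS.get(term.strip().lower())


def search(term, index):
    """Step 1: return the file names whose abstracts mention the term or a synonym."""
    canonical = canonical_term(term)
    return sorted(index.get(canonical, set())) if canonical else []


def types_of(term):
    """Step 2 (hover on a term): return the ontology classes of the instance."""
    return INSTANCE_TYPES.get(canonical_term(term), [])


def outgoing_relationships(class_name):
    """Step 3 (hover on a class): return the relationships leaving that class."""
    return CLASS_RELATIONSHIPS.get(class_name, [])


index = build_index(ABSTRACTS, SYNONYMS)
files = search("UV rays", index)
classes = types_of("UV rays")
relationships = outgoing_relationships(classes[0])
```

Resolving synonyms to one canonical term at both indexing and query time is one way to get the behavior the slide describes, where typing a synonym retrieves the same documents as the MeSH term itself. The slides do not say whether the original system normalized synonyms this way or indexed each synonym separately.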