Progress Report: Related work in KM. Advisors: Prof. Hahn-Ming Lee, Prof. Jan-Ming Ho. Reporters: Shou-Wei Ho, Chung-Hung Lin. 2009.08.31 1

Related work in KM (Knowledge Management) 2

Problems in searching Chinese names (威達): only Chinese corpus 3

Challenges in Chinese name translation
– Many pronunciation rules in different areas: 陳 → Chen (Taiwan), 陳 → Tsun (Hong Kong), 陳 → Tan (Fukien)
– Some additional words exist. Ex: 黃光明 (Kwang-Ming Frank Hwang), Ex: 張韻詩 (Jane Win-Shih Liu) 4
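
The two challenges on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table `SURNAME_ROMANIZATIONS` and both function names are hypothetical, the table holds only the three romanizations of 陳 given on the slide, and the rule "a hyphenated token is the transliterated given name, the last token is the surname, everything else is an additional word" is a guess that happens to fit the two examples.

```python
# Hypothetical lookup table; only the entries for 陳 come from the slide.
SURNAME_ROMANIZATIONS = {
    "陳": {"Taiwan": "Chen", "Hong Kong": "Tsun", "Fukien": "Tan"},
}


def candidate_romanizations(surname):
    """Return every known romanization of a Chinese surname, sorted."""
    return sorted(SURNAME_ROMANIZATIONS.get(surname, {}).values())


def split_english_name(full_name):
    """Split an English-form name into surname, given name, and additional words.

    Assumed heuristic (not from the slides): the last token is the surname,
    hyphenated tokens are the transliterated given name, and any other token
    (for example a Western first name) is an additional word.
    """
    tokens = full_name.split()
    surname = tokens[-1]
    given = [t for t in tokens[:-1] if "-" in t]
    additional = [t for t in tokens[:-1] if "-" not in t]
    return {"surname": surname, "given": given, "additional": additional}


if __name__ == "__main__":
    print(candidate_romanizations("陳"))
    print(split_english_name("Kwang-Ming Frank Hwang"))
    print(split_english_name("Jane Win-Shih Liu"))
```

A search for a Chinese name would then need to try each candidate romanization and ignore the additional words when matching.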

Ambiguous pages in the WWW (坤彥): CMU Professor? Guitar Player and Singer 5

Anchor text mapping. Search Name: Bill Mark. A. Personal main page B. NVIDIA web site page 6

CRE: Why do we extract information from publication list web pages? (水石)
Publication list pages are an important resource for many value-added applications, such as citation analysis and academic network analysis.
What could we get from publication list pages?
– Some up-to-date literature before it is formally published
– Some reference materials, such as slides and talks. 7

An automatic extractor: Web Page, Citation String Detect, Structure Data Extract 8

Citation extracting 9

Authorship Disambiguation (建毅): Prof. A’, Prof. A ?? 10

Detect 3 relationships (COI): 1. Teacher-Student: Prof. A, Prof. B, Student C 11

Detect 3 relationships (cont.): 2. Co-author: Prof. S, Prof. A, …, Prof. B 12

Detect 3 relationships (cont.): 3. Colleagues: Prof. W, Prof. A, …, Prof. E 13
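
As a rough sketch of how the three relationships could be checked for a pair of people, assuming three simple lookup tables are already available: the function `detect_coi`, the table layouts, and all sample data (including the affiliation "Univ X") are hypothetical and are not taken from the slides.

```python
def detect_coi(a, b, advisor_of, coauthors, affiliation):
    """Return the set of relationship labels that link persons a and b.

    advisor_of:  maps a student to his or her advisor
    coauthors:   maps a person to the set of people they have published with
    affiliation: maps a person to an institution name
    """
    found = set()
    # 1. Teacher-Student: one of the two is the advisor of the other.
    if advisor_of.get(a) == b or advisor_of.get(b) == a:
        found.add("teacher-student")
    # 2. Co-author: the two appear together on at least one paper.
    if b in coauthors.get(a, set()) or a in coauthors.get(b, set()):
        found.add("co-author")
    # 3. Colleagues: both have a known affiliation and it is the same one.
    if a in affiliation and affiliation.get(a) == affiliation.get(b):
        found.add("colleagues")
    return found


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    advisor_of = {"Student C": "Prof. A"}
    coauthors = {"Prof. A": {"Prof. S", "Prof. B"}}
    affiliation = {"Prof. A": "Univ X", "Prof. W": "Univ X", "Prof. E": "Univ X"}
    for other in ("Student C", "Prof. S", "Prof. W"):
        print(other, detect_coi("Prof. A", other, advisor_of, coauthors, affiliation))
```

Returning a set rather than a boolean keeps the three relationships distinguishable, so a caller can decide which of them count as a conflict of interest.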

Mining a Chinese Person’s Name from the English Translation (任明) 14

Name Disambiguation (信璁)
Problem
– Given a set of citations with the same author name, how do we identify which one belongs to whom?
Goal
– To group the citations into several clusters, so that each cluster represents an author
Grouping: a set of citations with the same author name is grouped into clusters. A cluster is the citation set of one author. Suppose the number of authors is unknown. 16

Procedure
A pair of citations (Citation A, Citation B):
– Coauthor correlation
– Author information correlation
– Title correlation
– Venue correlation
– Web correlation
– Topic correlation
SVM: classify whether a pair of citations is published by the same author 17
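
The pairwise step might look roughly like the sketch below. It is not the authors' implementation: only three of the six correlations are computed (coauthor, title, venue), each as a simple Jaccard or equality score, which is an assumption. The SVM is replaced by a hand-set linear threshold (`same_author`) purely as a stand-in so that the example runs without a machine learning library; the weights and bias are arbitrary.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pair_features(c1, c2, shared_name):
    """Feature vector for a pair of citations that share one author name.

    Only coauthor, title, and venue correlation are sketched here. The web,
    topic, and author-information correlations need external data and are
    omitted.
    """
    co1 = set(c1["authors"]) - {shared_name}
    co2 = set(c2["authors"]) - {shared_name}
    return [
        jaccard(co1, co2),
        jaccard(c1["title"].lower().split(), c2["title"].lower().split()),
        1.0 if c1["venue"] == c2["venue"] else 0.0,
    ]


def same_author(features, weights=(1.0, 1.0, 1.0), bias=-0.5):
    """Stand-in for the trained SVM: a linear decision with arbitrary weights."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return score > 0


if __name__ == "__main__":
    # Made-up citations for illustration only.
    a = {"authors": ["J. Wang", "A. Lee"], "title": "Mining web data", "venue": "KDD"}
    b = {"authors": ["J. Wang", "A. Lee", "B. Chen"], "title": "Mining web logs", "venue": "KDD"}
    feats = pair_features(a, b, "J. Wang")
    print(feats, same_author(feats))
```

In a real system the weights would come from a classifier trained on labeled citation pairs rather than being set by hand.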

Procedure
Use the classification result to group citations into several clusters
– Each cluster contains citations belonging to the same author
Grouping: if the SVM determines that two citations are authored by the same person, they are connected to each other 18
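
One way to read the grouping step is as finding connected components: citations are nodes, a positive classification is an edge, and each component is one author's cluster. The sketch below uses a union-find structure for this; the function name `group_citations` and the integer citation ids are assumptions, and the slides do not say which grouping algorithm was actually used.

```python
def group_citations(n, same_author_pairs):
    """Group citations 0..n-1 into clusters of connected citations.

    same_author_pairs: (i, j) pairs that the classifier judged to be
    written by the same person.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in same_author_pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    # Five citations; the classifier linked 0-1, 1-2 and 3-4.
    print(group_citations(5, [(0, 1), (1, 2), (3, 4)]))  # [[0, 1, 2], [3, 4]]
```

Because the number of clusters falls out of the connectivity, this fits the setting where the number of authors is unknown. A single wrong positive link merges two authors, so a stricter grouping rule may be preferable in practice.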

Citation Correspondence (大為)
Query construction:
– A good query: if the proper records are archived in digital libraries, a good query should retrieve them in the search result, and the proper records should rank higher. The search result should be small.
Citation correspondence:
– Find the proper records in the search result by matching the local citation string against the records in the search result. Field-by-field comparison.
– May not be enough due to errors in digital libraries (optional).
Metrics: precision, recall, and F-measure. 19
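
A minimal sketch of the field-by-field comparison and of the three metrics follows. The field names (`title`, `year`, `venue`), the normalization rule, and the function names are assumptions made for illustration. The F-measure shown is the balanced F1, since the slide does not say which variant is meant.

```python
def normalize(value):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(str(value).lower().split())


def fields_match(citation, record, fields=("title", "year", "venue")):
    """Field-by-field comparison: every listed field must agree after normalization."""
    return all(
        normalize(citation.get(f, "")) == normalize(record.get(f, ""))
        for f in fields
    )


def precision_recall_f(returned, proper):
    """Precision, recall, and balanced F-measure of the returned records."""
    returned, proper = set(returned), set(proper)
    hits = len(returned & proper)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(proper) if proper else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Made-up local citation and library record for illustration only.
    local = {"title": "Mining  Web Data", "year": 2008, "venue": "CIKM"}
    record = {"title": "mining web data", "year": "2008", "venue": "cikm"}
    print(fields_match(local, record))
    print(precision_recall_f({"r1", "r2", "r3", "r4"}, {"r1", "r2", "r5"}))
```

Exact matching after normalization is deliberately strict. As the slide notes, errors in digital libraries mean that a tolerant comparison may be needed on top of it.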

Partial Solution: Abbreviation Matching (v1, v2). Example: CIKM = Conference on Information and Knowledge Management 20
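
Abbreviation matching of this kind can be sketched by comparing an abbreviation with the initials of the content words in the full name. The stopword list and both function names are assumptions. Real venue abbreviations are often less regular than the CIKM example, so this would cover only part of the cases, in line with the "partial solution" wording on the slide.

```python
# Hypothetical stopword list: short function words that are skipped in initialisms.
STOPWORDS = {"on", "and", "of", "the", "in", "for", "a", "an"}


def initials(full_name):
    """Build an initialism from the content words of a venue name."""
    words = full_name.replace("-", " ").split()
    return "".join(w[0].upper() for w in words if w.lower() not in STOPWORDS)


def abbreviation_matches(abbreviation, full_name):
    """True when the abbreviation equals the initials of the full name."""
    return abbreviation.upper() == initials(full_name)


if __name__ == "__main__":
    venue = "Conference on Information and Knowledge Management"
    print(initials(venue))                      # CIKM
    print(abbreviation_matches("CIKM", venue))  # True
```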

Reviewer Recommendation (泰良) 21

COI in Incomplete Collaboration Network via Social Interaction (秋宜) 22