Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology: Mining Advisor-Advisee Relationships from Research Publication Networks

Presentation transcript:

Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Mining Advisor-Advisee Relationships from Research Publication Networks
Chi Wang, Jiawei Han, Yuntao Jia, Jie Tang, Duo Zhang, Yintao Yu
SIGKDD, 2010
Presented by Hung-Yi Cai, 2010/12/29

Outline
• Motivation
• Objectives
• Previous study
• Methodology
  ─ Problem Formulation
  ─ Assumption and Framework
  ─ Preprocessing
  ─ TPFG Model
  ─ Model Learning
• Experiments
• Conclusions
• Comments

Motivation
• Information networks contain abundant knowledge about relationships among people and entities.
• Discovering those relationships can benefit applications such as expert finding and research community analysis.

Objectives
• To propose a time-constrained probabilistic factor graph (TPFG) model that takes a research publication network as input and models advisor-advisee relationship mining with a joint likelihood objective function.
• To design an efficient learning algorithm that optimizes this objective function.

Previous study
• This work differs from existing studies in relation mining and relational learning.
  ─ Relation mining: these studies mainly apply text mining and language processing techniques to text and structured data, including web pages, user profiles, and literature corpora.
  ─ Relational learning: these studies address classification when objects or entities are presented in multiple relations.

Methodology
• Problem Formulation
• Assumption and Framework
• Preprocessing
• TPFG Model
• Model Learning

Problem Formulation

Assumption and Framework
• Assumption 1 is based on commonsense knowledge about advisor-advisee relationships.
• Assumption 2 states that all authors in the network have a strict order defined by the possible advising relationships.

Preprocessing
• The purpose of preprocessing is to generate the candidate graph H′ and reduce the search space.

Preprocessing
• Then we have the following rule.
  ─ Author a_j is not considered to be a_i's advisor if one of the following conditions holds:
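The conditions themselves are not reproduced in this transcript. As a rough illustration of how a filtering rule can turn a coauthor network into a candidate graph H′, the sketch below assumes a single hypothetical condition: a coauthor is ruled out as an advisor unless they started publishing strictly earlier. The function name, the input format, and this condition are all assumptions made for the example, not the rules used in the paper.

```python
from collections import defaultdict


def build_candidate_graph(papers):
    """Build {author: set of candidate advisors} from (year, authors) records.

    Hypothetical filter (an assumption, not the paper's rule set):
    coauthor b stays a candidate advisor of a only if b's first
    publication year is strictly earlier than a's.
    """
    first_year = {}
    coauthors = defaultdict(set)
    for year, authors in papers:
        for a in authors:
            first_year[a] = min(year, first_year.get(a, year))
            coauthors[a].update(b for b in authors if b != a)

    candidates = {}
    for a in first_year:
        candidates[a] = {
            b for b in coauthors[a] if first_year[b] < first_year[a]
        }
    return candidates


# Made-up publication records: (year, list of authors).
papers = [
    (1995, ["ann"]),
    (2000, ["ann", "bob"]),
    (2001, ["ann", "bob", "cid"]),
    (2003, ["bob", "cid"]),
]
graph = build_candidate_graph(papers)
for author, advisors in graph.items():
    print(author, sorted(advisors))
```

Because the assumed condition is a strict comparison of first publication years, the resulting candidate edges cannot form a cycle, which matches the strict order required by Assumption 2.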

TPFG Model
• By modeling the network as a whole, this step incorporates both structural information and temporal constraints, and better analyzes the relationships among individual links.

TPFG Model
• The graph is composed of two kinds of nodes: variable nodes and function nodes.

Model Learning
• To maximize the objective function and compute the ranking score for each edge in the candidate graph H′, this step needs to infer the marginal maximal joint probability on the TPFG, according to Eq. (10).
• Sum-product + junction tree: sum-product is a general algorithm that computes marginal functions on a factor graph by message passing.
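To make the message-passing idea concrete, here is a minimal sketch of sum-product on a toy chain factor graph with three binary variables. The factor values are made up, and the graph is not the TPFG or Eq. (10); the sketch only shows that passing messages along the chain gives the same marginals as summing over every joint assignment.

```python
from itertools import product

# Toy chain factor graph: x0 -- f01 -- x1 -- f12 -- x2 (all binary).
# All numbers are illustrative assumptions.
unary = [[0.6, 0.4], [0.5, 0.5], [0.2, 0.8]]   # unary[i][x_i]
pair = [
    [[0.9, 0.1], [0.3, 0.7]],                  # f01[x0][x1]
    [[0.8, 0.2], [0.4, 0.6]],                  # f12[x1][x2]
]


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def chain_marginals(unary, pair):
    """Sum-product on a chain: one forward and one backward sweep."""
    n = len(unary)
    fwd = [[1.0, 1.0] for _ in range(n)]  # message reaching x_i from the left
    bwd = [[1.0, 1.0] for _ in range(n)]  # message reaching x_i from the right
    for i in range(1, n):
        fwd[i] = [
            sum(fwd[i - 1][a] * unary[i - 1][a] * pair[i - 1][a][b]
                for a in range(2))
            for b in range(2)
        ]
    for i in range(n - 2, -1, -1):
        bwd[i] = [
            sum(bwd[i + 1][b] * unary[i + 1][b] * pair[i][a][b]
                for b in range(2))
            for a in range(2)
        ]
    return [
        normalize([fwd[i][x] * unary[i][x] * bwd[i][x] for x in range(2)])
        for i in range(n)
    ]


def brute_force_marginals(unary, pair):
    """Reference answer: enumerate every joint assignment."""
    n = len(unary)
    totals = [[0.0, 0.0] for _ in range(n)]
    for xs in product(range(2), repeat=n):
        weight = 1.0
        for i in range(n):
            weight *= unary[i][xs[i]]
        for i in range(n - 1):
            weight *= pair[i][xs[i]][xs[i + 1]]
        for i in range(n):
            totals[i][xs[i]] += weight
    return [normalize(t) for t in totals]


marginals = chain_marginals(unary, pair)
reference = brute_force_marginals(unary, pair)
print(marginals)
```

The brute-force version enumerates 2^n assignments, while the message-passing version does work linear in the chain length, which is the reason message passing matters on large graphs.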

Model Learning
• New TPFG inference algorithm: the original sum-product or max-sum algorithm runs into difficulty because it requires each node to wait for all-but-one messages to arrive.

Model Learning
• After the two phases of message propagation, we can collect the two messages on any edge and obtain the marginal function.
• The improved message propagation is still separated into two phases.
  ─ Phase 1: the messages sent_i, passed from each node to its ascendants, are generated in a similar order as before.
  ─ Phase 2: the messages recv_i, returned from the ascendants, are stored in each node.
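The two-phase schedule can be sketched on a small rooted tree. The example below is a generic max-product pass, not the TPFG inference algorithm itself: the tree, the potentials, and the single-parent structure are assumptions made for illustration, and only the names sent and recv are borrowed from the slide. Phase 1 sends messages from each node up to its parent; phase 2 sends messages back down; combining both at a node gives its max-marginal.

```python
from itertools import product
from math import prod

# Toy rooted tree (an assumption): parent[i] is the parent of node i,
# and every parent has a smaller id than its children.
parent = [None, 0, 0, 1]
phi = [[0.5, 0.5], [0.7, 0.3], [0.2, 0.8], [0.6, 0.4]]  # phi[i][x_i]
# psi[i][x_i][x_parent]: edge potential between node i and its parent.
psi = [
    None,
    [[0.9, 0.2], [0.1, 0.8]],
    [[0.6, 0.3], [0.4, 0.7]],
    [[0.5, 0.1], [0.5, 0.9]],
]
n = len(parent)
children = [[c for c in range(n) if parent[c] == p] for p in range(n)]


def two_phase_max_marginals():
    sent = [None] * n                      # sent[i][x_parent]: node i -> parent
    recv = [[1.0, 1.0] for _ in range(n)]  # recv[i][x_i]: parent -> node i

    # Phase 1: children before parents (leaves toward the root).
    for i in reversed(range(1, n)):
        sent[i] = [
            max(phi[i][x] * psi[i][x][xp]
                * prod(sent[c][x] for c in children[i])
                for x in range(2))
            for xp in range(2)
        ]

    # Phase 2: parents before children (root toward the leaves).
    for c in range(1, n):
        p = parent[c]
        recv[c] = [
            max(phi[p][xp] * recv[p][xp] * psi[c][x][xp]
                * prod(sent[s][xp] for s in children[p] if s != c)
                for xp in range(2))
            for x in range(2)
        ]

    # Combine the stored messages at each node.
    return [
        [phi[i][x] * recv[i][x] * prod(sent[c][x] for c in children[i])
         for x in range(2)]
        for i in range(n)
    ]


def brute_force_max_marginals():
    """Reference answer: best joint weight with x_i fixed to each value."""
    best = [[0.0, 0.0] for _ in range(n)]
    for xs in product(range(2), repeat=n):
        weight = prod(phi[i][xs[i]] for i in range(n))
        weight *= prod(psi[i][xs[i]][xs[parent[i]]] for i in range(1, n))
        for i in range(n):
            best[i][xs[i]] = max(best[i][xs[i]], weight)
    return best


max_marginals = two_phase_max_marginals()
expected = brute_force_max_marginals()
print(max_marginals)
```

Each edge carries exactly two messages, one per phase, so the total work is proportional to the number of edges rather than the number of joint assignments.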


Experiments
• Experiment Step
  ─ Methods: Sum-Product + Junction Tree (JuncT), Loopy Belief Propagation (LBP), Independent Maxima (IndMAX), SVM, RULE
  ─ Evaluation aspects: ROC curve
  ─ Data set (DBLP): 654,628 authors and 1,076,946 publications, with time provided from 1970 to

Experiments
• Accuracy: effect of rules in TPFG
  ─ Using R3 as the filtering rules and YEAR2 as the graduation year estimation method.

Experiments
• Accuracy: effect of network structure
  ─ Using DFS with a bounded maximal depth d from the given set of nodes, denoted as DFS=d, we can obtain closures with controlled depth for a given set of authors to test.
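A depth-bounded closure of this kind can be sketched as follows. The adjacency format and the function name are assumptions for illustration. One detail worth noting: a plain visited set can cut off nodes that are first reached along a long path, so the sketch records the smallest depth at which each node has been seen.

```python
def bounded_dfs_closure(adjacency, seeds, max_depth):
    """Return every node within max_depth hops of any seed (DFS=d).

    adjacency: dict mapping a node to an iterable of its neighbours
    (hypothetical input format for this sketch).
    """
    best_depth = {}
    stack = [(seed, 0) for seed in seeds]
    while stack:
        node, depth = stack.pop()
        # Skip if this node was already reached at the same or a smaller depth.
        if node in best_depth and best_depth[node] <= depth:
            continue
        best_depth[node] = depth
        if depth < max_depth:
            for neighbour in adjacency.get(node, ()):
                stack.append((neighbour, depth + 1))
    return set(best_depth)


# Made-up coauthor chain: a - b - c - d.
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}
for d in range(4):
    print(d, sorted(bounded_dfs_closure(adjacency, ["a"], d)))
```

Raising d grows the subnetwork handed to the model, which is how the effect of network structure can be tested at controlled sizes.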

Experiments
• Accuracy: effect of training data

Experiments
• Accuracy: case study
  ─ TPFG can discover some interesting relations beyond the "ground truth" from a single source.

Experiments
• Scalability performance

Experiments
• Application: visualization of genealogy

Experiments
• Application: expert finding and Bole search

Conclusions
• This paper studied the mining of advisor-advisee relationships from a research publication network, as an attempt to discover hidden semantic knowledge in information networks.
• It proposed a time-constrained probabilistic factor graph (TPFG) model to integrate local intuitive features in the network; results on the DBLP data set demonstrate the effectiveness of the proposed approach.

Comments
• Advantages
  ─ The TPFG model can mine relationships between advisors and advisees from a research publication network.
• Applications
  ─ Relationship mining