
國立雲林科技大學 National Yunlin University of Science and Technology Mining Generalized Associations of Semantic Relations from Textual Web Content Tao Jiang,


1 國立雲林科技大學 National Yunlin University of Science and Technology Mining Generalized Associations of Semantic Relations from Textual Web Content Tao Jiang, Ah-Hwee Tan, Senior Member, IEEE, and Ke Wang IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 19, NO. 2, 2007. Presenter : Wei-Shen Tai Advisor : Professor Chung-Chian Hsu 2007/1/10

2 N.Y.U.S.T. I. M. Outline: Introduction; Resource Description Framework and RDF Schema; Semantic relation extraction; Mining generalized association from RDF metadata; Experiments; Conclusion; Comments

3 Motivation. Text mining problem: when texts are reduced to simplistic representations such as a bag of keywords, terms are treated as individual items, so the terms lose their semantic relations and the texts lose their original meanings. Two short text documents with different meanings can be represented by a similar bag of keywords.
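The bag-of-keywords problem can be seen in a minimal sketch. The two sentences and the helper name `bag_of_keywords` are illustrative assumptions, not taken from the paper; the point is only that word order, and with it the direction of the relation, is discarded.

```python
from collections import Counter


def bag_of_keywords(text: str) -> Counter:
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


# Hypothetical documents with opposite meanings.
doc_a = "France defeated Italy"
doc_b = "Italy defeated France"

# Both documents collapse to the same bag, so the representation
# cannot tell who defeated whom.
same_bag = bag_of_keywords(doc_a) == bag_of_keywords(doc_b)
print(same_bag)
```

A representation that keeps the relation itself, for instance a (subject, relation, object) triple, would keep the two documents apart.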

4 Objective. Semantic relation associations: an intermediate representation that expresses the semantic relations between the concepts in texts.

5 Major processes. Semantic relation extraction: the extracted relations are encoded in RDF statements. Semantic relation associations: meaningful and detailed patterns can be discovered from text using the conceptual graph representation.

6 Resource Description Framework and RDF Schema. Resource Description Framework (RDF): a framework for describing and interchanging semantic metadata. RDF statements: e.g. {France, Defeat, Italy, World Cup, Quarter Final}. RDF Schema: defines the RDF vocabularies used for constructing RDF statements.

7 Term Taxonomy Construction. Term similarity measure. Incremental term taxonomy construction.

8 RDF model. RDF vocabulary: a tuple whose first component is the set {a, b, c, d, e, f, ab, cd, ef, cdef}, followed by P = {p}, H, domain = {a, b, ab}, and range = {c, d, e, f, cd, ef, cdef}. Generalized relation hierarchy: e.g. {, } is a relationset and it is also a generalized relationset of {, }.
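How one relation generalizes another can be sketched as follows. This is a minimal illustration under stated assumptions: the parent links in `PARENT` are inferred from the term names on the slide (ab covering a and b, cdef covering cd and ef), relations are written as (subject, predicate, object) tuples, and the function names are hypothetical. The paper's formal definition may differ in detail.

```python
# Assumed taxonomy, inferred from the term names on the slide.
PARENT = {
    "a": "ab", "b": "ab",
    "c": "cd", "d": "cd",
    "e": "ef", "f": "ef",
    "cd": "cdef", "ef": "cdef",
}


def ancestors_or_self(term: str) -> set:
    """Return the term together with every more general term above it."""
    found = {term}
    while term in PARENT:
        term = PARENT[term]
        found.add(term)
    return found


def generalizes(general: tuple, specific: tuple) -> bool:
    """True if `general` equals `specific` or is reached by generalizing
    its subject and/or object along the taxonomy (same predicate)."""
    g_subj, g_pred, g_obj = general
    s_subj, s_pred, s_obj = specific
    return (
        g_pred == s_pred
        and g_subj in ancestors_or_self(s_subj)
        and g_obj in ancestors_or_self(s_obj)
    )


print(generalizes(("ab", "p", "cd"), ("a", "p", "c")))
print(generalizes(("a", "p", "c"), ("ab", "p", "cd")))
```

Under these assumptions the first call holds (ab covers a, cd covers c) and the second does not, since a specific relation never generalizes a more general one.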

9 Overgeneralization. Example: {, }, {, } {, }, {, }. Definition: a frequent relationset X is overgeneralized if there exists a specialized relationset Y of X with supp(X) = supp(Y).
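The definition can be checked on toy data with a minimal sketch. Everything concrete here is an assumption for illustration: the taxonomy in `PARENT` is inferred from the term names used in the RDF model slide, the three documents are invented, the caller supplies the candidate specializations, and the minimum-support test for "frequent" is left out.

```python
# Assumed taxonomy, inferred from the term names (not given explicitly).
PARENT = {
    "a": "ab", "b": "ab",
    "c": "cd", "d": "cd",
    "e": "ef", "f": "ef",
    "cd": "cdef", "ef": "cdef",
}


def ancestors_or_self(term):
    """Return the term together with every more general term above it."""
    found = {term}
    while term in PARENT:
        term = PARENT[term]
        found.add(term)
    return found


def generalizes(general, specific):
    """True if `general` equals or generalizes `specific` (same predicate)."""
    return (
        general[1] == specific[1]
        and general[0] in ancestors_or_self(specific[0])
        and general[2] in ancestors_or_self(specific[2])
    )


def support(relationset, documents):
    """Count documents in which every relation of the relationset
    equals or generalizes at least one relation of the document."""
    return sum(
        all(any(generalizes(r, d) for d in doc) for r in relationset)
        for doc in documents
    )


def is_overgeneralized(x, specializations, documents):
    """X is overgeneralized if some specialization Y has supp(X) = supp(Y)."""
    supp_x = support(x, documents)
    return any(support(y, documents) == supp_x for y in specializations)


# Hypothetical documents, each a set of (subject, predicate, object) relations.
docs = [
    {("a", "p", "c")},
    {("a", "p", "c"), ("b", "p", "e")},
    {("a", "p", "d")},
]
X = {("ab", "p", "cd")}
Y = {("a", "p", "cd")}   # specializes X in the subject
Z = {("a", "p", "c")}    # specializes Y in the object

print(support(X, docs), support(Y, docs), support(Z, docs))
print(is_overgeneralized(X, [Y], docs))
print(is_overgeneralized(Y, [Z], docs))
```

In this toy data X and Y are supported by the same three documents, so X adds no information beyond Y and counts as overgeneralized; Y is not overgeneralized with respect to Z, because Z is supported by only two documents.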

10 Overgeneralization Reduction. Each node is a unique generalization closure. If a closure and its children have the same support, the closure is not closed and can be pruned. Such a nonclosed closure is pruned by replacing it with the union of its equal-support children.

11 GP (Generalized Pattern)-Close Algorithm. GP-Close: initializes the enumeration tree to contain only the root closure. Closure-Enumeration: starting from the root closure of the empty set, the closure enumeration process recursively traverses the closure enumeration tree to discover closed generalization closures.

12 Experiments. Data sets: the online database of the International Policy Institute for Counter-Terrorism (ICT), including suicide bombing (ICT-SB) and car bombing (ICT-CB) documents. Analysis of patterns: 71.8 percent (56 out of 78) of the patterns are commonsense patterns already known to people; 12.8 percent (10 out of 78) of the patterns are previously unknown and not useful; 15.4 percent (12 out of 78) of the patterns are previously unknown and potentially useful.
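The reported percentages can be rechecked from the counts on the slide. The dictionary keys below are shorthand labels chosen for this sketch, not terms from the paper.

```python
# Pattern counts as reported on the slide (78 patterns in total).
counts = {
    "commonsense": 56,
    "unknown_not_useful": 10,
    "unknown_potentially_useful": 12,
}
total = sum(counts.values())

# Share of each category, rounded to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)
print(percentages)
```

The three categories add up to the 78 patterns analyzed, and the rounded shares match the 71.8, 12.8, and 15.4 percent quoted above.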

13 Conclusions. Semantic relation extraction: discovering knowledge from free-form textual Web content. GP-Close algorithm: based on mining closed generalization closures. Substantially reduce the pattern redundancy and perform.

14 Comments. Advantage: a novel idea for semantic relation association extraction; GP-Close is applicable for reducing the pattern search space. Drawback: the examples do not use consistent data; the diagrams of child-closure pruning and sub-tree pruning are confusing to the reader. Application: data mining applications in semantic relation association.

