Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology: Electricity Based External Similarity of Categorical Attributes

1 Electricity Based External Similarity of Categorical Attributes
Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Advisor: Dr. Hsu
Presenter: Zih-Hui Lin
Authors: Christopher R. Palmer and Christos Faloutsos
Venue: PAKDD

2 Outline
• Motivation
• Objective
• Algorithm
• Experiments
• Conclusions

3 Motivation
• A great deal of categorical data is stored in databases, but it is difficult to define similarity between categorical values.
• The only previously proposed external similarity function is ad hoc, whereas REP is theoretically grounded.

4 Objective
• The goal of this research is to make effective use of previously established theory.

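5 Introduction
• Ohm's law: I = V/R; conductance C = 1/R.
• C(x,y) = w(x,y); R(x,y) = 1/w(x,y)
• P(x →* y / S) = (expression missing from the transcript)
• w(x,y)/C_y
• C is the conductance, taken to be the edge weight: the higher the conductance, the larger the value. The higher the value, the higher the probability that a walk goes from x to y and then back to x.
• Escape probability is defined as the probability that a walk started from x will reach S before it returns to x.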
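The escape probability defined on this slide can be illustrated with a small simulation. The sketch below is not taken from the paper: the three-node graph, the function names, and the trial count are hypothetical, and it assumes the standard random-walk rule in which a walk at node u steps to neighbour v with probability w(u,v)/C(u), where C(u) is the total conductance at u.

```python
import random

# Hypothetical toy graph: symmetric edge weights w(x, y), read as conductances.
WEIGHTS = {
    ("x", "y"): 2.0,
    ("y", "s"): 1.0,
    ("x", "s"): 1.0,
}


def neighbours(weights):
    """Build an adjacency map {node: {neighbour: weight}} from undirected edges."""
    adj = {}
    for (a, b), w in weights.items():
        adj.setdefault(a, {})[b] = w
        adj.setdefault(b, {})[a] = w
    return adj


def transition_probs(adj, node):
    """One-step probabilities w(node, v) / C(node), assuming the standard walk rule."""
    total = sum(adj[node].values())  # C(node): total conductance at this node
    return {v: w / total for v, w in adj[node].items()}


def estimate_escape(adj, start, sinks, trials=20000, seed=0):
    """Monte Carlo estimate of P(walk from start reaches a sink before returning to start)."""
    rng = random.Random(seed)
    escaped = 0
    for _ in range(trials):
        node = start
        while True:
            probs = transition_probs(adj, node)
            node = rng.choices(list(probs), weights=list(probs.values()))[0]
            if node in sinks:
                escaped += 1  # reached S first
                break
            if node == start:
                break  # returned to the start first
    return escaped / trials


adj = neighbours(WEIGHTS)
print(estimate_escape(adj, "x", {"s"}))
```

For this toy graph the exact value can be worked out by hand: the walk escapes directly with probability 1/3, or via y with probability 2/3 × 1/3, giving 5/9. The simulated estimate should land close to that.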

6 REP algorithm
1. Adding sink nodes to a graph
2. S_REP is symmetric
3. S_REP as electrical currents
4. Kirchhoff relaxation algorithm for voltages
5. Distance function
C_G0(x) + w(x,s) = total
(Figure: the graph before ("Old") and after ("New") adding sink nodes S_1, S_2, ..., S_i, with the values 1 and 0 marked.)
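Steps 3 and 4 rest on a standard link between random walks and electrical networks, which can be sketched as follows. This is a generic relaxation, not necessarily the authors' exact procedure: the toy graph, function names, and tolerance are hypothetical. The source node is held at voltage 1 and the sink at 0, and every other node is repeatedly set to the conductance-weighted average of its neighbours' voltages. The current leaving the source, divided by its total conductance, then gives the escape probability.

```python
# Hypothetical toy graph as an adjacency map {node: {neighbour: conductance}}.
ADJ = {
    "x": {"a": 1.0},
    "a": {"x": 1.0, "s": 1.0, "b": 1.0},
    "b": {"a": 1.0, "s": 1.0},
    "s": {"a": 1.0, "b": 1.0},
}


def relax_voltages(adj, source, sinks, tol=1e-12, max_sweeps=10000):
    """Iterate Kirchhoff's current law until node voltages stop changing."""
    volt = {n: 0.0 for n in adj}
    volt[source] = 1.0  # boundary condition: source held at 1, sinks at 0
    free = [n for n in adj if n != source and n not in sinks]
    for _ in range(max_sweeps):
        change = 0.0
        for n in free:
            total = sum(adj[n].values())
            # Net current into a free node is zero, so its voltage is the
            # conductance-weighted average of its neighbours' voltages.
            new = sum(w * volt[m] for m, w in adj[n].items()) / total
            change = max(change, abs(new - volt[n]))
            volt[n] = new
        if change < tol:
            break
    return volt


def escape_probability(adj, source, sinks):
    """Current out of the source divided by its total conductance."""
    volt = relax_voltages(adj, source, sinks)
    current = sum(w * (volt[source] - volt[m]) for m, w in adj[source].items())
    return current / sum(adj[source].values())


print(relax_voltages(ADJ, "x", {"s"}))
print(escape_probability(ADJ, "x", {"s"}))
```

On this graph the voltages settle near V(a) = 0.4 and V(b) = 0.2, so the escape probability from x is about 0.6, which matches a direct calculation on the random walk.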

7 Experiments: Clustering

8 Experiments: Classification
Table 1. Error rates show that REP has classification error similar to C4.5 and better than NN with Hamming distance.
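For context, the baseline in that comparison, nearest-neighbour classification with Hamming distance, can be sketched in a few lines. The training rows, labels, and function names below are hypothetical illustrations, not data or code from the experiments.

```python
def hamming(a, b):
    """Number of attribute positions at which two categorical tuples differ."""
    if len(a) != len(b):
        raise ValueError("tuples must have the same number of attributes")
    return sum(1 for u, v in zip(a, b) if u != v)


def nearest_neighbour(train, query):
    """Return the label of the training row closest to query under Hamming distance."""
    _, label = min(train, key=lambda item: hamming(item[0], query))
    return label


# Hypothetical toy training set: (attribute tuple, class label).
TRAIN = [
    (("red", "small", "round"), "apple"),
    (("yellow", "long", "curved"), "banana"),
]

print(nearest_neighbour(TRAIN, ("red", "small", "curved")))
```

Hamming distance treats every mismatch as equally large, so it has no notion of one categorical value being closer to another. That limitation is what a learned similarity function over categorical values is meant to address.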

9 Experiments: Sensitivity to sink_p and scalability
0.85

10 Conclusions
• We used this node similarity function to provide an external similarity function called S_REP, which:
─ is better than the best existing external distance function,
─ is built upon a theoretical foundation (the existing approach is not),
─ allows cross-attribute similarity computations, which in turn allow excellent nearest-neighbour classification,
─ provides excellent scalability with input size.

11 My opinion
Advantage: combines knowledge from other fields …..
Disadvantage:
Application: automatically building concept hierarchies ……

