Towards Using Grid Services for Mining Fuzzy Association Rules Mihai Gabroveanu, Ion Iancu, Mirel Cosulschi, Nicolae Constantinescu Faculty of Mathematics.


1 Towards Using Grid Services for Mining Fuzzy Association Rules Mihai Gabroveanu, Ion Iancu, Mirel Cosulschi, Nicolae Constantinescu Faculty of Mathematics and Computer Science, University of Craiova, ROMANIA {mihaiug, mirelc,nikyc}@central.ucv.ro,i iancu@yahoo.com

2 Introduction In this paper we show how the Knowledge Grid infrastructure can be used to implement a distributed algorithm for mining fuzzy association rules from distributed databases over a Grid network.

3 Outline: Knowledge Grid services; Distributed fuzzy association rules mining; Distributed problem definition; The distributed algorithm; Rules mining implementation over the Grid; Conclusion

4 Knowledge Grid Services-1 The Knowledge Grid ([4], [5], [6]) defines an integrating architecture for distributed data mining and knowledge discovery. It uses basic grid services to build specific knowledge services, organized in two layers: the Core K-grid layer, which offers services implemented directly on top of generic grid services; and the High level K-grid layer, which is used to describe, develop and execute distributed knowledge discovery computations.

5 Knowledge Grid Services-2 Knowledge directory service (KDS). This service extends the basic Globus MDS service and is responsible for maintaining a description of all the data and tools used in the Knowledge Grid. It uses metadata stored in a Knowledge Metadata Repository (KMR). The Knowledge Base Repository (KBR) is used to maintain discovered knowledge. Another important repository is the Knowledge Execution Plan Repository (KEPR), which stores the execution plans of data mining processes. Resource allocation and execution management service (RAEMS). These services are used to find the best mapping between an execution plan and the available resources, with the goal of satisfying the application requirements.

6 Knowledge Grid Services-2 Data Access Service (DAS). This service is responsible for the search and selection (data search services) and for the extraction, transformation and delivery (data extraction service) of the data to be mined. Tools and algorithms access service (TASS). This service is responsible for the search, selection, and downloading of data mining tools and algorithms. Execution plan management service (EPMS). This service is a semi-automatic tool that takes the data and programs selected by the user and generates a set of different possible plans that meet the user, data and algorithm requirements and constraints. Results presentation service (RPS). This service specifies how to generate, present and visualize the extracted models.

7 Distributed fuzzy association rules mining-1 $DB = \{t_1, \ldots, t_n\}$, $I = \{i_1, \ldots, i_m\}$. Ex: $I = \{\mathrm{Age}, \mathrm{Income}, \mathrm{Weight}\}$

8 Distributed fuzzy association rules mining-2 For example, for the attribute Weight we can consider the following three fuzzy sets: "thin", "middle" and "fat". $F_{\mathrm{Weight}} = \{\mathrm{thin}, \mathrm{middle}, \mathrm{fat}\}$
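
The slide names the fuzzy sets but not their membership functions. As a minimal sketch, assuming trapezoidal membership functions and entirely hypothetical breakpoints in kilograms (neither comes from the slides, and they are not meant to match the degrees used in the later example), the three fuzzy sets for Weight could be modelled like this:

```python
def trapezoid(x, a, b, c, d):
    """Membership rising on [a, b], equal to 1 on [b, c], falling on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical breakpoints, chosen only for illustration.
F_WEIGHT = {
    "thin": (-1, 0, 50, 70),
    "middle": (50, 70, 80, 100),
    "fat": (80, 100, 300, 301),
}

def weight_degrees(kg):
    """Membership degree of one weight value in each fuzzy set."""
    return {name: trapezoid(kg, *points) for name, points in F_WEIGHT.items()}

print(weight_degrees(60))
```

With these assumed breakpoints a weight of 60 belongs partly to "thin" and partly to "middle", which is the overlap that distinguishes fuzzy sets from crisp intervals.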

9 Distributed fuzzy association rules mining-3 $\langle X, F_X \rangle = \langle \{\mathrm{Age}, \mathrm{Income}\}, \{\mathrm{young}, \mathrm{high}\} \rangle$

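10 Distributed fuzzy association rules mining-4 $X = \{\mathrm{Age}, \mathrm{Income}\}$, $Y = \{\mathrm{Weight}\}$, $F_X = \{\mathrm{middle}, \mathrm{high}\}$, $F_Y = \{\mathrm{fat}\}$. "If Age is middle and Income is high then Weight is fat": $\langle X, F_X \rangle \Rightarrow \langle Y, F_Y \rangle$, that is, $\langle \{\mathrm{Age}, \mathrm{Income}\}, \{\mathrm{middle}, \mathrm{high}\} \rangle \Rightarrow \langle \{\mathrm{Weight}\}, \{\mathrm{fat}\} \rangle$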

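11 Distributed fuzzy association rules mining-4 In transaction $T_1$ the membership degrees for $\langle \{\mathrm{Age}, \mathrm{Income}\}, \{\mathrm{middle}, \mathrm{high}\} \rangle$ are $\{0.5, 1\}$; in transaction $T_2$ they are $\{1, 1\}$. The fuzzy support value of the itemset $\langle X, F_X \rangle = \langle \{\mathrm{Age}, \mathrm{Income}\}, \{\mathrm{middle}, \mathrm{high}\} \rangle$ is $(0.5 \cdot 1 + 1 \cdot 1) / 2 = 1.5 / 2 = 0.75$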
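
The arithmetic on this slide can be reproduced in a few lines. This is a sketch that assumes, as the slide's numbers suggest, that the fuzzy support is the average over all transactions of the product of the membership degrees; the function name and the data layout are illustrative only.

```python
from math import prod

def fuzzy_support(transactions, itemset):
    """Average, over all transactions, of the product of membership degrees."""
    total = sum(prod(t[pair] for pair in itemset) for t in transactions)
    return total / len(transactions)

# Membership degrees from the slide: (attribute, fuzzy set) -> degree
t1 = {("Age", "middle"): 0.5, ("Income", "high"): 1.0}
t2 = {("Age", "middle"): 1.0, ("Income", "high"): 1.0}
itemset = [("Age", "middle"), ("Income", "high")]

support = fuzzy_support([t1, t2], itemset)
print(support)  # 0.75
```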

12 Distributed fuzzy association rules mining-5 An association rule is considered interesting if it has enough support and a high confidence value. Such a rule is also called a strong rule.
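
The transcript does not give the confidence formula. The sketch below assumes the usual definition, confidence = support of the whole rule divided by support of its antecedent, and treats "enough" and "high" as thresholds minsup and minconf. The function name, the comparison operators and the sample numbers are assumptions for illustration.

```python
def is_strong(support_xy, support_x, minsup, minconf):
    """Check a rule X => Y given the fuzzy supports of X union Y and of X.

    Assumed definition: confidence = support(X union Y) / support(X).
    """
    if support_x == 0:
        return False  # antecedent never occurs, so confidence is undefined
    confidence = support_xy / support_x
    return support_xy > minsup and confidence >= minconf

# Hypothetical supports: the rule holds in half the data, the antecedent in 3/4.
print(is_strong(0.5, 0.75, minsup=0.25, minconf=0.6))
print(is_strong(0.5, 0.75, minsup=0.25, minconf=0.7))
```

In this example the confidence is 0.5 / 0.75, about 0.67, so the rule passes the first pair of thresholds and fails the second.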

13 Distributed fuzzy association rules mining-6 The problem of sequential mining of fuzzy association rules can be decomposed into two subproblems: 1. Find all large fuzzy itemsets. 2. Generate the fuzzy association rules from the large fuzzy itemsets found.

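14 Example
Database:
      age   weight
t1    15    40
t2    30    70

Membership degrees:
      Age            Weight
      young   old    thin   fat
t1    1       0      0.5    0
t2    0       0.5    0.5    1

Fuzzy support counts:
$\langle \{\mathrm{Age}, \mathrm{Weight}\}, \{\mathrm{young}, \mathrm{thin}\} \rangle \Rightarrow 1 \cdot 0.5 + 0 \cdot 0.5 = 0.5$
$\langle \{\mathrm{Age}, \mathrm{Weight}\}, \{\mathrm{young}, \mathrm{fat}\} \rangle \Rightarrow 1 \cdot 0 + 0 \cdot 1 = 0$
$\langle \{\mathrm{Age}, \mathrm{Weight}\}, \{\mathrm{old}, \mathrm{thin}\} \rangle \Rightarrow 0 \cdot 0.5 + 0.5 \cdot 0.5 = 0.25$
$\langle \{\mathrm{Age}, \mathrm{Weight}\}, \{\mathrm{old}, \mathrm{fat}\} \rangle \Rightarrow 0 \cdot 0 + 0.5 \cdot 1 = 0.5$

An itemset whose support count exceeds minsup is a large fuzzy itemset.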
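
The four sums in this example can be checked mechanically. The sketch below uses the membership degrees as they appear in the slide's products; the minsup value is hypothetical, since the slide does not state one.

```python
from itertools import product

# One entry per transaction: attribute -> fuzzy set -> membership degree.
memberships = [
    {"Age": {"young": 1.0, "old": 0.0}, "Weight": {"thin": 0.5, "fat": 0.0}},
    {"Age": {"young": 0.0, "old": 0.5}, "Weight": {"thin": 0.5, "fat": 1.0}},
]

def support_count(rows, age_set, weight_set):
    """Sum over transactions of the product of the two membership degrees."""
    return sum(r["Age"][age_set] * r["Weight"][weight_set] for r in rows)

counts = {
    (a, w): support_count(memberships, a, w)
    for a, w in product(["young", "old"], ["thin", "fat"])
}

minsup = 0.4  # hypothetical threshold, not taken from the slides
large = sorted(pair for pair, count in counts.items() if count > minsup)
print(counts)
print(large)
```

With this threshold only the (young, thin) and (old, fat) combinations survive as large fuzzy itemsets.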

15 Distributed problem definition-1 Let $DB = \{DB_1, DB_2, \ldots, DB_n\}$ be a distributed database over $n$ sites $S_1, S_2, \ldots, S_n$.

16 Distributed problem definition-2

17 Distributed problem definition-3

18 Distributed problem definition-4

19 Distributed problem definition-5 Distributed Mining of Fuzzy Association Rules. Given the set of items $I$, the distributed database $DB = \{DB_1, DB_2, \ldots, DB_n\}$, the fuzzy sets associated with the attributes from $I$, the minimum support threshold (minsup) and the minimum confidence threshold (minconf), extract all global fuzzy association rules. This involves two subproblems: 1. Find all global large fuzzy itemsets. 2. Generate the global fuzzy association rules from the global large fuzzy itemsets found.
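
The formal definitions of local and global support (slides 16 to 18) did not survive in this transcript. As a rough sketch, assuming that an itemset's global fuzzy support is the sum of its local fuzzy support counts divided by the total number of transactions across all sites, the first subproblem could be illustrated as follows. The function name, the comparison against minsup and the sample counts are hypothetical.

```python
def global_large_itemsets(local_counts, site_sizes, minsup):
    """Combine per-site fuzzy support counts into global supports.

    local_counts: one dict per site, itemset -> local fuzzy support count.
    site_sizes:   number of transactions stored at each site.
    Returns the itemsets whose assumed global support reaches minsup.
    """
    total_size = sum(site_sizes)
    totals = {}
    for counts in local_counts:
        for itemset, count in counts.items():
            totals[itemset] = totals.get(itemset, 0.0) + count
    return {
        itemset: count / total_size
        for itemset, count in totals.items()
        if count / total_size >= minsup
    }

# Two hypothetical sites, each holding two transactions.
age_middle = (("Age", "middle"),)
weight_fat = (("Weight", "fat"),)
site1 = {age_middle: 1.5, weight_fat: 0.5}
site2 = {age_middle: 1.0, weight_fat: 0.25}

result = global_large_itemsets([site1, site2], [2, 2], minsup=0.5)
print(result)
```

Only the counts travel between sites in this scheme, not the transactions themselves, which is the usual motivation for count-based distributed mining.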

21 Fuzzy Count Distribution Algorithm The globally large fuzzy 1-itemsets L(1) are generated first. [Diagram: each site contributes its local large fuzzy 1-itemsets; labels: global large candidate 1-itemsets CA(1), globally large fuzzy 1-itemsets L(1).] Candidate generation: CA(k) = Fuzzy_Apriori_Gen(L(k−1)).
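
The transcript names Fuzzy_Apriori_Gen but does not show its body. The following is a sketch under the assumption that it mirrors the classic Apriori candidate generation (a join step followed by a prune step) applied to (attribute, fuzzy set) pairs, with the further assumption that a candidate may hold at most one fuzzy set per attribute. It is not the authors' code.

```python
from itertools import combinations

def fuzzy_apriori_gen(large_prev):
    """Build candidate k-itemsets from the large (k-1)-itemsets.

    large_prev: set of frozensets of (attribute, fuzzy_set) pairs,
    all of the same size k-1.
    """
    candidates = set()
    for a, b in combinations(large_prev, 2):
        union = a | b
        if len(union) != len(a) + 1:
            continue  # join step: the two itemsets must differ in one item
        attributes = [attr for attr, _ in union]
        if len(set(attributes)) != len(attributes):
            continue  # assumed rule: at most one fuzzy set per attribute
        # prune step: every (k-1)-subset must itself be large
        if all(frozenset(s) in large_prev for s in combinations(union, len(a))):
            candidates.add(union)
    return candidates

L1 = {
    frozenset({("Age", "young")}),
    frozenset({("Age", "old")}),
    frozenset({("Weight", "thin")}),
}
CA2 = fuzzy_apriori_gen(L1)
print(CA2)
```

Here the pair (young, old) is never proposed because both fuzzy sets belong to Age, so only the two Age and Weight combinations become candidates.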

22 Rules mining implementation over the Grid-1 Distributed Rules Mining Scenario

23 Rules mining implementation over the Grid-2 In order to present the implementation of this process in a Grid network we shall consider that: the database DB is stored on K-grid node NodeA; the tools needed for mining association rules (the partitioner P, the frequent itemsets mining tool and the association rules extractor) are available as multiplatform executables on K-grid node NodeS; the results will be stored in the Knowledge Base Repository (KBR) on NodeU.

24 Rules mining implementation over the Grid-3 Let's suppose that a Grid User (GU) needs to extract all association rules from database DB using the tools available on K-grid node NodeS. Step 1. The GU starts the search for computational resources for executing the data mining process from his K-grid node NodeU. The KDS (Knowledge Directory Service) is used to locate the computational resources needed to execute the mining process.

25 Rules mining implementation over the Grid-4 Step 2. The GU builds an execution plan for the data mining task, specifying strategies for tool and data movements. The execution plan is constructed by using the EPMS (Execution Plan Management Service). This plan is stored in the local KEPR (Knowledge Execution Plan Repository). Step 3. The GU sends the execution plan to the RAEMS (Resource Allocation and Execution Management Service), which starts the application. Step 4. The GU visualizes and evaluates the result of the computation stored in the KBR by means of the RPS (Results Presentation Service) tools.

26 Conclusion In this article we propose an implementation of a distributed algorithm for mining fuzzy association rules from distributed databases in a Knowledge Grid environment. The proposed algorithm uses some properties of global large fuzzy itemsets and local large fuzzy itemsets, and the reduction of computations relies heavily on them.


