Distributed Data Mining System in Java Group Member D91725001 王春笙 D92725002 林俊甫 D92725001 王慧芬.


1 Distributed Data Mining System in Java
Group members: D91725001 王春笙, D92725002 林俊甫, D92725001 王慧芬

2 Overview of Project
- Motivation and goals
  - Multi-layer data mining over a large data file is time-consuming.
  - Joining forces improves performance: the computing power of several machines spread over the network is combined.
- Fault-tolerance consideration
  - The mining process continues despite a server crash.

3 System Architecture
[Diagram; labels: Web Client, HTTP, Web Server, Log files, Node, Request service module, Distributed mining, Prediction engine]

4 Technological Infrastructure: System diagram
[Diagram; labels: LAN, Server/Coordinator, Client ..., Mining data chunk]

5 Project Timeline

6 Job Distribution
- Server programming: 林俊甫
- Client programming: 王春笙
- GUI programming and integration: 王慧芬

7 Technological Infrastructure: System design requirements
- Transparency
- Scalability
- Dynamic join problem
- Multi-threading
- RMI
- Multicast socket
- Redundancy
  - Server crash failure
  - Client crash failure

8 Technological Infrastructure: Rationale/justification
- Data mining is a computing-intensive task.
- Web log data may be generated so quickly that a single computer cannot handle it.
- Implementing a distributed prediction engine brings a fault-tolerance advantage.

9 Technological Infrastructure: Alternatives considered
- Fully distributed data mining system
  - Each participant acts as an autonomous peer-to-peer node.
- Client/server distributed data mining system
  - The data server acts as a fixed coordinator.

10 Implementation Phase: System requirements
- Hardware
  - 2 or 3 PCs running Microsoft Windows
  - 1 acts as server; the others act as clients as well as redundant servers.
- Software
  - For implementation: J2SE SDK 1.4.1, Eclipse 2.1, NetBeans 3.5.1
  - For execution: Java Web Start

11 Implementation Phase: Implementation logic (Server/Coordinator)
- Activates at a well-known port and uses threads to wait for client connections.
- Logs the information of all connected clients in a hash table.
- Dispatches the designated mining data to clients.
- Maintains the hash table and multicasts it to every client periodically.
- Merges and displays the results returned from clients.
- Detects the connection status of each client. If a client fails, the server performs the backup mechanism and orders a backup client to take over the failed client's job.
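The coordinator's bookkeeping can be sketched as a small in-memory table. This is only an illustration under stated assumptions: the class `ClientRegistry`, its method names, and the use of integer chunk IDs are hypothetical and do not come from the slides, and the code uses modern Java collections rather than the J2SE 1.4.1 APIs the project targeted. Networking (sockets, RMI, multicast) is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical coordinator-side table: client ID -> chunk being mined (IDLE if none). */
final class ClientRegistry {
    static final int IDLE = -1;

    // LinkedHashMap keeps registration order, so "first idle client" is deterministic.
    private final Map<String, Integer> assignment = new LinkedHashMap<>();

    /** Records a newly connected client as idle (no effect if already known). */
    synchronized void register(String clientId) {
        assignment.putIfAbsent(clientId, IDLE);
    }

    /** Gives the chunk to the first idle client; returns that client's ID, or null if none is idle. */
    synchronized String dispatch(int chunkId) {
        for (Map.Entry<String, Integer> entry : assignment.entrySet()) {
            if (entry.getValue() == IDLE) {
                entry.setValue(chunkId);
                return entry.getKey();
            }
        }
        return null;
    }

    /**
     * Drops a failed client and hands its unfinished chunk to an idle client.
     * Returns the ID of the client that took over, or null if the failed client
     * had no job or no idle client exists (the caller would then have to retry).
     */
    synchronized String handleFailure(String failedId) {
        Integer chunk = assignment.remove(failedId);
        if (chunk == null || chunk == IDLE) {
            return null;
        }
        return dispatch(chunk);
    }

    /** Copy of the table, e.g. for the periodic multicast to clients. */
    synchronized Map<String, Integer> snapshot() {
        return new LinkedHashMap<>(assignment);
    }
}
```

With three registered clients `c1`, `c2`, `c3`, dispatching chunks 0 and 1 occupies `c1` and `c2`; a subsequent `handleFailure("c1")` moves chunk 0 to the idle `c3`.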

12 Implementation Phase: Implementation (Client)
- Once activated, enrolls with the server (coordinator).
- Receives the hash table broadcast by the server, updating its local hash table periodically, and receives the mining data sent from the server.
- Performs the data mining and returns the result to the server (coordinator).
- Monitors the server connection. If the server is not alive, performs the backup mechanism to elect a client to act as backup server.
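The slides do not say how a client decides that the server is no longer alive. One simple possibility, shown here purely as an assumption, is to treat the periodic hash table message as a heartbeat and suspect the server once no message has arrived within a timeout. The class `ServerLivenessMonitor` and its method names are hypothetical; timestamps are passed in explicitly so the logic can be exercised without a network or a real clock.

```java
/** Hypothetical timeout-based detector; times are in milliseconds. */
final class ServerLivenessMonitor {
    private final long timeoutMillis;
    private long lastSeenMillis;

    ServerLivenessMonitor(long timeoutMillis, long startMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastSeenMillis = startMillis;
    }

    /** Call whenever a hash table update arrives from the server. */
    synchronized void onTableReceived(long nowMillis) {
        lastSeenMillis = nowMillis;
    }

    /** True if no update has been seen for longer than the timeout. */
    synchronized boolean isServerSuspected(long nowMillis) {
        return nowMillis - lastSeenMillis > timeoutMillis;
    }
}
```

In a running client the timestamps would typically come from `System.currentTimeMillis()`, and the timeout would be chosen as a few multiples of the server's multicast period so that a single lost message does not trigger an election.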

13 Implementation Phase: Failure and backup mechanism
- Client failure: The server is informed of the connection failure with the client. The server then modifies the connection information in the hash table, finds a client in the hash table without a designated job, and dispatches the unfinished job to that client.
- Server failure: All clients are informed of the connection failure with the server. Since every client keeps all the connection information in a hash table that is periodically updated by the server, after the server fails all clients elect a new server through the same election mechanism. The new server then broadcasts the result to all clients and enters the server listening state.
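Because every client holds the same copy of the hash table, each one can compute the election result locally and arrive at the same answer. The slides do not specify the election rule, so the sketch below assumes a simple one: the client with the lexicographically smallest ID becomes the new server. The class `BackupElection` and its methods are hypothetical names used only for illustration.

```java
import java.util.Collection;
import java.util.Collections;

/** Hypothetical deterministic election over the client IDs in the shared hash table. */
final class BackupElection {
    /** Returns the ID every client would pick: the smallest ID among the known clients. */
    static String electNewServer(Collection<String> knownClientIds) {
        if (knownClientIds.isEmpty()) {
            throw new IllegalStateException("no clients left to elect");
        }
        return Collections.min(knownClientIds);
    }

    /** True if this client is the winner and should enter the server listening state. */
    static boolean shouldBecomeServer(String selfId, Collection<String> knownClientIds) {
        return selfId.equals(electNewServer(knownClientIds));
    }
}
```

A rule like this needs no extra message exchange, but it only works if the clients' tables agree; clients whose last update differs could pick different winners, which a real system would have to handle.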

14 Implementation Phase: Data mining algorithm
- Uses an Apriori-like sequential pattern mining algorithm.
- Each client mines its data partition and sends the results to the coordinator (server).
- The coordinator receives the clients' mining results, takes their union, and validates the results by scanning all the data again.
- Results are presented as association rules.
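The mine-locally, union, then validate-globally flow can be sketched as follows. This is a simplified illustration, not the project's algorithm: it mines unordered itemsets of size 1 and 2 instead of ordered sequential patterns, stops before rule generation, and all names (`PartitionedMining`, `mineLocal`, `validate`, `setOf`) are hypothetical. When every partition uses the same relative support threshold, an itemset that is frequent overall must be frequent in at least one partition, which is why a second scan over the union of local results is enough to find the global answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PartitionedMining {
    static Set<String> setOf(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    private static int support(Set<String> itemset, List<Set<String>> data) {
        int count = 0;
        for (Set<String> tx : data) {
            if (tx.containsAll(itemset)) count++;
        }
        return count;
    }

    /** Client side: itemsets of size 1 and 2 that are frequent within this partition. */
    static Set<Set<String>> mineLocal(List<Set<String>> partition, double minSupport) {
        int minCount = (int) Math.ceil(minSupport * partition.size());
        // Level 1: count single items.
        Map<String, Integer> singles = new HashMap<>();
        for (Set<String> tx : partition) {
            for (String item : tx) singles.merge(item, 1, Integer::sum);
        }
        List<String> frequent = new ArrayList<>();
        Set<Set<String>> result = new HashSet<>();
        for (Map.Entry<String, Integer> e : singles.entrySet()) {
            if (e.getValue() >= minCount) {
                frequent.add(e.getKey());
                result.add(setOf(e.getKey()));
            }
        }
        // Level 2: candidate pairs are built only from frequent items (Apriori pruning).
        for (int i = 0; i < frequent.size(); i++) {
            for (int j = i + 1; j < frequent.size(); j++) {
                Set<String> pair = setOf(frequent.get(i), frequent.get(j));
                if (support(pair, partition) >= minCount) result.add(pair);
            }
        }
        return result;
    }

    /** Coordinator side: union the local results, then rescan all data to confirm them. */
    static Map<Set<String>, Integer> validate(List<Set<Set<String>>> clientResults,
                                              List<Set<String>> allData, double minSupport) {
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<Set<String>> r : clientResults) candidates.addAll(r);
        int minCount = (int) Math.ceil(minSupport * allData.size());
        Map<Set<String>, Integer> confirmed = new HashMap<>();
        for (Set<String> candidate : candidates) {
            int count = support(candidate, allData);
            if (count >= minCount) confirmed.put(candidate, count);
        }
        return confirmed;
    }
}
```

The global rescan matters because a pattern can look frequent inside one small partition yet fall below the threshold over the whole log; such local-only candidates are discarded in `validate`.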

15 Implementation Phase: Installation
- Server: Web log file, server module, client module
- Client: client module, server module
- The role of a node in the mining process may change.

16 Implementation Phase: Test
- Component (server, client, UI) unit test
- System integration test
- Fault tolerance test
  - Component error
  - Transmission error
    - Network error
    - Host error

