Presentation on theme: "GROUP 00000011 Presents. WEB CRAWLER: A visualization of links in the World Wide Web. Software Engineering C, Semester Two, Massey University - Palmerston North, 2009"— Presentation transcript:

1 GROUP 00000011 Presents

2 WEB CRAWLER: A visualization of links in the World Wide Web. Software Engineering C, Semester Two, Massey University - Palmerston North, 2009

3 About the team
- Amir Hoshang Kioumars (Major: Software Engineering, Position: Team Leader)
- Chagitha Ranhotigamage (Major: Computer System Engineering, Position: Team Member)
- Jeffrey Hamilton (Major: Computer Science, Position: Team Member)
Reference: Wikipedia

4 The purpose of the project: Design and implement a network analysis tool. The end product is a desktop application that displays and visualises the relationships between web sites as a graph. Reference: Wikipedia

5 Project Requirements
- The graph must be animated. It should be redrawn as more sites and links are discovered.
- The application should run multi-threaded.
- The graph can be stored in a database, and restored for analysis.
- The customer should be able to control the look and feel.
- The design should be modular and easy to extend, for example: export/import of graphs, plugging in other views.

6 What was our plan?
- Understand the needs and establish the requirements
- Select a suitable language that all team members were familiar with
- Select a graph package that suits our needs
- Prioritise the tasks
- Design a project plan in terms of our time and resources
- Release a version of the software after finishing each milestone
- Keep everything as simple as possible

7 What was the result of the plan?
- An accurate project plan
- The deadlines we set were achieved
- We released a new version at the conclusion of each milestone
- We updated the project plan by highlighting completed tasks to better monitor progress

8 What is the Web Crawler? Given a URL, it searches through the page and finds all the links on it. It then follows every link it found, and continues to do this until it reaches a pre-defined depth. As it searches, it displays all the links as nodes on a graph.
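The crawl loop described on this slide amounts to a depth-limited breadth-first search. The following is a minimal, hypothetical sketch, not the project's actual code: the class name, the regex-based link extraction, and the `fetch` callback (standing in for a real page download) are all illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the crawl loop: extract href links from a page's
// HTML, then visit them breadth-first up to a pre-defined maximum depth.
public class CrawlSketch {
    private static final Pattern HREF =
        Pattern.compile("href=\"(http[^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Pull every absolute http(s) link out of a page's HTML source.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Breadth-first crawl: visit the start URL, then every link found on it,
    // stopping once the depth limit is reached. `fetch` stands in for the
    // real page download and is an assumption of this sketch.
    public static Set<String> crawl(String startUrl, int maxDepth,
                                    java.util.function.Function<String, String> fetch) {
        Set<String> visited = new HashSet<>();
        Queue<String[]> queue = new ArrayDeque<>(); // entries are {url, depth}
        queue.add(new String[] {startUrl, "0"});
        while (!queue.isEmpty()) {
            String[] entry = queue.remove();
            String url = entry[0];
            int depth = Integer.parseInt(entry[1]);
            if (depth > maxDepth || !visited.add(url)) continue;
            for (String link : extractLinks(fetch.apply(url))) {
                queue.add(new String[] {link, String.valueOf(depth + 1)});
            }
        }
        return visited;
    }
}
```

Keeping a `visited` set prevents the crawl from looping forever when pages link back to each other, and the depth counter gives the user-definable stopping point the slides mention.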

9 Why is the Web Crawler useful?
- It can be used to analyse how the links from a page change over time.
- It is also useful for investigating how many links lie between two separate pages.
- It could be used to work out degrees of separation (on social sites).
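Counting the links between two pages, as in the second point above, is a breadth-first shortest-path search over the crawled link graph. A minimal sketch, assuming the graph is held as an adjacency map; the class and method names are illustrative, not the application's own:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: the number of links between two pages is the
// breadth-first shortest path over the link graph's adjacency map.
public class LinkDistance {
    // Returns the minimum number of links from `from` to `to`,
    // or -1 if no chain of links connects them.
    public static int distance(Map<String, List<String>> graph, String from, String to) {
        Map<String, Integer> dist = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        dist.put(from, 0);
        queue.add(from);
        while (!queue.isEmpty()) {
            String page = queue.remove();
            if (page.equals(to)) return dist.get(page);
            for (String next : graph.getOrDefault(page, List.of())) {
                if (!dist.containsKey(next)) {
                    dist.put(next, dist.get(page) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }
}
```

The same search applied to a social-site graph gives the degrees of separation between two people.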

10 Animating the graph

11 Optional Extras
- Nodes can be labelled in five different ways
- Links can be filtered according to their type
- The look and feel of the user interface can be changed
- The graph can be exported as an image
- The depth of the search is user-definable
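One way link filtering by type might work is to distinguish internal links (same host as the page) from external ones. This is a sketch under that assumption; the actual link types the application filters on may differ, and the class is hypothetical:

```java
import java.net.URI;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of a link filter: classify links as internal or
// external by comparing hosts. The real application's link types may differ.
public class LinkFilter {
    // True when the link points to the same host as the page it was found on.
    public static boolean isInternal(String pageUrl, String linkUrl) {
        String pageHost = URI.create(pageUrl).getHost();
        String linkHost = URI.create(linkUrl).getHost();
        return pageHost != null && pageHost.equals(linkHost);
    }

    // Keep only the internal links from a page's extracted link list.
    public static List<String> internalOnly(String pageUrl, List<String> links) {
        return links.stream()
                    .filter(l -> isInternal(pageUrl, l))
                    .collect(Collectors.toList());
    }
}
```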

12 Multi-Threading: The program is threaded, preventing the whole application from freezing when a single part fails, for example when a link is invalid or the internet connection is lost.
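One way to get this behaviour, sketched here as an assumption rather than the project's actual design, is to fetch each link in its own task on a thread pool, so an exception in one task cannot freeze the others. The `fetch` placeholder below deliberately fails for one URL to mimic a dead link:

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the threading idea: each link is fetched in its own
// task, so an invalid link or lost connection only fails that one task.
public class ThreadedFetch {
    public static List<String> fetchAll(List<String> urls) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        ConcurrentLinkedQueue<String> results = new ConcurrentLinkedQueue<>();
        for (String url : urls) {
            pool.submit(() -> {
                try {
                    results.add(fetch(url));
                } catch (RuntimeException e) {
                    // A dead link fails only this task; the rest continue.
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.copyOf(results);
    }

    // Placeholder download that fails for one URL to simulate a broken link.
    static String fetch(String url) {
        if (url.contains("broken")) throw new RuntimeException("dead link");
        return "ok:" + url;
    }
}
```

In the real application the graph drawing would likewise stay on its own thread, so a stalled download cannot block the animation.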

13 Storing in a database
- The program stores the links in a database
- The database has the same name as the project
- The database uses Apache Derby
- The database is in standard SQL format, so it can be queried by any program with SQL support (MySQL, Access, …)
- The database contents can be read back as a static graph
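A sketch of what the storage layer might look like with standard SQL over JDBC. The table name, columns, and Derby URL here are illustrative assumptions, not the application's actual schema; `;create=true` is Derby's embedded-mode flag for creating the database on first use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch of the storage layer: a two-column links table in
// standard SQL, written through JDBC so any SQL database could hold it.
public class LinkStore {
    // Embedded Derby connection URL, named after the project as the slide says.
    public static String derbyUrl(String projectName) {
        return "jdbc:derby:" + projectName + ";create=true";
    }

    // Illustrative schema: one row per discovered link (edge in the graph).
    public static final String CREATE_LINKS =
        "CREATE TABLE links (source VARCHAR(2048), target VARCHAR(2048))";

    // Standard parameterised SQL insert, usable from Derby or any other
    // JDBC-accessible database.
    public static void storeLink(Connection conn, String source, String target)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO links (source, target) VALUES (?, ?)")) {
            ps.setString(1, source);
            ps.setString(2, target);
            ps.executeUpdate();
        }
    }
}
```

Reading the table back with a plain `SELECT source, target FROM links` is what makes the static-graph view possible from any SQL client.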

14 Why is our product special, and what are the features of our program?

15 Simpler is better
- Easy to use
- Open source
- Free
- Platform independent
- No need for installation
- Stable
- Standard user interface
- Easy to maintain
- Uses simple algorithms
- Works with a proxy
- Uses an SQL database
- We like it

16 How did we manage the project?

17 By breaking down the complexities and defining the tasks for each milestone, version, or release.

18 Functionality at each milestone:
- Version 1.0
- Version 2.0
- Version 3.0
- Version 4.0
- Final Release

19 Issue tracking policy. Our issues policy was:
- Open a new issue
- Accept issue: max 2 days
- Work on issue: max 5 days
- Assign issue for verification: 1 day
- Verify and close issue: max 2 days
Issue priorities depend on dependencies. For example, an issue in an important function shared between classes is critical, while issues such as GUI and usability are cosmetic and can be handled at any stage. Before deployment, all issues should be resolved.

20 Deployment
- The application can be run as a standalone jar file, and is available to download as a zipped file with the user manual from our website: http://www.webcrawler.host22.com/
- Any updates and maintenance made to the program will be documented and logged on the page above.
- The website includes a step-by-step example video of a simple search taking place and of how to use the program (this is also linked directly from the help menu of the application).

21 What did we learn?

22
- Always have a plan B (at least)
- Communication is the key
- Research the problem thoroughly
- Make better use of the issue tracking system; it is a good reminder that helps avoid clashes

23 Special thanks to:
- Dr. Jens Dietrich, for his incredible support, help, and advice for the duration of this project
- Dr. Russell Johnson, for his friendly help
- And all the guests and students, for your attention

24 The End Good luck in your final exams & have a great long holiday

