By Morris Wright, Brian Chapman and Ryan Caplet
Recap: Crawler-based search engine, limited to a subset of UConn’s School of Engineering websites (roughly 3,000 pages). Resources: web server and MySQL servers provided by ECS. Database holds ~5,000 keywords. Languages used: HTML, PHP, SQL, Perl.
Features: Runs searches on multiple words. Ex. “uconn sports” runs a search on the keywords “uconn” and “sports”. Ranks results based on frequency and relevance of words. Ex. for “uconn sports”, websites containing both words are ranked higher than websites containing only one or the other. Despite the size of the database, it takes a matter of seconds to display the results.
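The ranking rule described on this slide could be sketched roughly as follows. This is a minimal Python illustration, not the project's PHP/Perl code. The `index` layout (keyword mapped to per-URL frequencies), the sample URLs and counts, and the function name `rank_pages` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical keyword index: keyword -> {url: frequency on that page}.
# The URLs and counts are invented for illustration only.
index = {
    "uconn": {"a.html": 5, "b.html": 2},
    "sports": {"b.html": 1, "c.html": 7},
}

def rank_pages(query, index):
    """Rank URLs by number of query words matched, then by total frequency."""
    matched = defaultdict(int)  # url -> how many distinct query words it contains
    total = defaultdict(int)    # url -> summed frequency of those words
    for word in set(query.lower().split()):
        for url, freq in index.get(word, {}).items():
            matched[url] += 1
            total[url] += freq
    # Pages containing more of the query words come first; ties are
    # broken by total frequency (largest first), then by URL for stability.
    return sorted(matched, key=lambda u: (-matched[u], -total[u], u))

ranking = rank_pages("uconn sports", index)
print(ranking)
```

In this sketch `b.html` ranks first because it is the only page containing both words, even though its individual frequencies are lower than those of the other pages.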
Interface – Main Page: The main page contains a text box used for the search function.
Interface – Result Page: Search box; list of URLs; number of results. Part of the URL is used as the title instead of the website’s own title.
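One way a result title might be derived from a URL is sketched below. The slide does not say which part of the URL is used, so this assumes the last path segment with its extension removed and the first letter capitalized. The function name `title_from_url` and the example URL are hypothetical.

```python
from urllib.parse import urlparse
import posixpath

def title_from_url(url):
    """Build a display title from the last path segment of a URL.

    Assumption: the title is the file name without its extension,
    with the first letter capitalized.
    """
    path = urlparse(url).path.rstrip("/")
    name = posixpath.basename(path)          # e.g. "endicottstory.html"
    stem, _ext = posixpath.splitext(name)    # drop ".html"
    return stem.capitalize()

# Placeholder URL, not one of the indexed pages.
title = title_from_url("http://example.edu/news/endicottstory.html")
print(title)
```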
Interface – Result Page: 10 results per page; page number.
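Splitting results into pages of 10 can be sketched as below. This is an illustrative Python version under the assumption that the ranked results are already available as a list. The names `paginate` and `page_count` and the placeholder result list are not from the project.

```python
import math

RESULTS_PER_PAGE = 10  # the page size stated on the slide

def paginate(results, page):
    """Return the results shown on a given 1-based page number."""
    start = (page - 1) * RESULTS_PER_PAGE
    return results[start:start + RESULTS_PER_PAGE]

def page_count(results):
    """Number of pages needed to show every result."""
    return math.ceil(len(results) / RESULTS_PER_PAGE)

# 40 placeholder results stand in for a ranked list of URLs.
urls = [f"page{i}" for i in range(1, 41)]
pages = page_count(urls)
last_page = paginate(urls, pages)
print(pages, last_page[0])
```

A "previous page" control would then only need to be active when the current page number is greater than 1.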
phpMyAdmin Database
phpMyAdmin Indexer
Some Examples – “help”: Total Results: 30. 1) Cseoveriew 2) Dewolf06 3) Engrvote 4) Docstucapturehonors 5) Endicottstory
Some Examples – “me”: Total Results: 9. 1) Crosby07 2) Dewolfjeffersretiring 3) Endicottstory 4) Andreaprof 5) Careerfairfall07
Some Examples – “now”: Total Results: 12. 1) Briefhistory 2) Endicottstory 3) Careerfairfall07 4) Biogridnews 5) Alumniprofilescott
Some Examples – “help me now”: Total Results: 40. 1) Endicottstory 2) Crosby07 3) Alumniprofilescott 4) Dewolfjeffersretiring 5) Docstucapturehonors
Behind-the-Scenes Functions: Does not show duplicate URLs. Converts capital letters to lowercase. Removes symbols such as “, ‘, and ?. A QuickSort algorithm sorts frequencies from largest to smallest.
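The behind-the-scenes steps listed on this slide could look roughly like the following. This is a Python sketch rather than the project's own code. The function names, the exact set of characters stripped, and the sample data are assumptions. The quicksort shown is a simple list-based variant chosen for readability, and the project's implementation may differ.

```python
import re

def normalize(query):
    """Lowercase the query and drop everything except letters, digits and spaces."""
    return re.sub(r"[^a-z0-9\s]", "", query.lower())

def dedupe(urls):
    """Remove repeated URLs, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique

def quicksort_desc(pairs):
    """Sort (url, frequency) pairs from largest to smallest frequency."""
    if len(pairs) <= 1:
        return list(pairs)
    pivot = pairs[len(pairs) // 2][1]
    larger = [p for p in pairs if p[1] > pivot]
    equal = [p for p in pairs if p[1] == pivot]
    smaller = [p for p in pairs if p[1] < pivot]
    return quicksort_desc(larger) + equal + quicksort_desc(smaller)

cleaned = normalize('UConn "Sports"?')
ordered = quicksort_desc([("a", 3), ("b", 9), ("c", 1), ("d", 9)])
print(cleaned, ordered)
```

Stripping quote characters before the query reaches the database is also one plausible way to avoid the quote-handling problems a search box can run into.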
Known Bugs: When the user types “ or ‘, no results are displayed, but slashes (/) appear in the search box. The “previous page” button also becomes active and, when pressed, adds more slashes to the search box.
End Website: