A web application for browsing research papers. By: Rhea Dookeran '09
Comparison: CiteSeer (http://citeseer.ist.psu.edu/citeseer.html). Focuses on CS/IS literature. Positive: Allows you to search through citations, acknowledgments and Google docs. Can assign a tag to the document. Negative: Hovering shows one abstract at a time. Clicking on the link redirects you to a new page (abstract and list of citations).
Comparison: Google Scholar (http://scholar.google.com/). Positive: Large corpus; does not cover one specific field of study. Familiar; follows the generic Google Search Engine Results Page (SERP) layout (minus ads). Negative: Hard to compare papers. No abstract: only shows the line where the search term(s) appear. Have to click on the link to learn more.
Comparison: Citeulike (http://www.citeulike.org/). Positive: Can add tags to documents and filter by common tags. Overall Web 2.0 style. Negative: Not visually appealing. Must scroll down to see results. Must click on the title to learn more (slide menu categories).
Our System’s Key Features: Allows the user to compare papers side by side before choosing to view the whole document (slide menu and hover functionality). Result filtering: by author (list, click or search), by keyword (click, cloud or search), by title (cloud or search). Easy to use and visually appealing. Dynamic/intuitive interface. Menus scroll with the user. PDF files open in a new tab (easily forgotten in other systems). Not built over a relational database.
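The result filtering described on this slide can be sketched as follows. This is a minimal illustration in Python, not the project's code: the `Paper` class, the `filter_papers` function, and the second sample record are all hypothetical, and the matching rules (exact author and keyword match, substring title match, all case-insensitive) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical in-memory record for one paper."""
    title: str
    authors: list = field(default_factory=list)
    keywords: list = field(default_factory=list)


def filter_papers(papers, author=None, keyword=None, title_term=None):
    """Return the papers that match every filter that is set (case-insensitive)."""
    results = []
    for paper in papers:
        if author and author.lower() not in [a.lower() for a in paper.authors]:
            continue
        if keyword and keyword.lower() not in [k.lower() for k in paper.keywords]:
            continue
        if title_term and title_term.lower() not in paper.title.lower():
            continue
        results.append(paper)
    return results


# Sample data: the second record is made up purely for illustration.
papers = [
    Paper("Searching the Web", ["Arvind Arasu", "Junghoo Cho"], ["crawling", "PageRank"]),
    Paper("A Hypothetical Paper on XML Storage", ["A. Author"], ["XML"]),
]

for match in filter_papers(papers, keyword="pagerank"):
    print(match.title)
```

Clicking an author or keyword link would correspond to calling `filter_papers` with that single value; the search box would map to `title_term`.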
So how does it work? XML (Extensible Markup Language): Tag-based data storage format. Lets one describe a document in terms of its structure, rather than its page layout. Used PHP’s Document Object Model (DOM) library. ‘Context-rich’: tags can be defined to specify both element names and attributes. In essence, the tags describe a flat-file database.
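To make the flat-file idea concrete, here is a minimal sketch of one paper record stored as XML and read back through a DOM API. The slides name PHP's DOM library; this sketch uses Python's `xml.dom.minidom` as a stand-in. The element and attribute names (`papers`, `paper`, `id`, `title`, `author`, `keyword`) are assumptions, not the project's actual schema.

```python
from xml.dom import minidom

# Hypothetical schema: element and attribute names are illustrative only.
PAPERS_XML = """<?xml version="1.0"?>
<papers>
  <paper id="p1">
    <title>Searching the Web</title>
    <author>Arvind Arasu</author>
    <author>Junghoo Cho</author>
    <keyword>crawling</keyword>
    <keyword>PageRank</keyword>
  </paper>
</papers>"""


def text_of(node):
    """Concatenate the text children of a DOM element."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    ).strip()


def load_papers(xml_text):
    """Parse the XML and return one dict per <paper> element."""
    dom = minidom.parseString(xml_text)
    records = []
    for paper in dom.getElementsByTagName("paper"):
        records.append({
            "id": paper.getAttribute("id"),
            "title": text_of(paper.getElementsByTagName("title")[0]),
            "authors": [text_of(a) for a in paper.getElementsByTagName("author")],
            "keywords": [text_of(k) for k in paper.getElementsByTagName("keyword")],
        })
    return records


records = load_papers(PAPERS_XML)
print(records[0]["title"], records[0]["authors"])
```

Each `<paper>` element plays the role of a row, and its child elements and attributes play the role of columns, which is what makes the file usable as a small database without a relational engine.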
So how does it work? PHP: Widely used scripting language that can be embedded into HTML. Dynamic content using a static file. Reading demonstrated how to use the DOM library to parse XML. Runs on the web server and can run on most OSs. Free download: Wamp Server running on PC; includes Apache, MySQL and PHP5 for Windows.
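"Dynamic content using a static file" can be illustrated with a small sketch that turns a static XML file into an HTML fragment on each request. The project did this in PHP on the server; the version below is a Python stand-in using `xml.etree.ElementTree` rather than a DOM API. The `pdf` attribute, the file path, and the `render_page` function are hypothetical.

```python
from html import escape
from xml.etree import ElementTree as ET

# Hypothetical static data file contents; the pdf path is made up.
PAPERS_XML = """<papers>
  <paper pdf="papers/searching-the-web.pdf">
    <title>Searching the Web</title>
    <author>Arvind Arasu</author>
    <author>Junghoo Cho</author>
  </paper>
</papers>"""


def render_page(xml_text):
    """Build an HTML list of papers from the XML text."""
    root = ET.fromstring(xml_text)
    items = []
    for paper in root.findall("paper"):
        title = escape(paper.findtext("title", default=""))
        authors = escape(", ".join(a.text or "" for a in paper.findall("author")))
        href = escape(paper.get("pdf", ""), quote=True)
        # target="_blank" opens the PDF in a new tab, as the key features slide describes.
        items.append(
            f'<li><a href="{href}" target="_blank">{title}</a> ({authors})</li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


page = render_page(PAPERS_XML)
print(page)
```

Because the page is generated from the file each time, editing the XML changes what users see without touching the display code.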
Searching the Web. Arvind Arasu, Junghoo Cho, Hector Garcia-Molina, Andreas Paepcke, Sriram Raghavan. An overview of current Web search engine design. After introducing a generic search engine architecture, we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the use of link analysis for boosting search performance. The most common design and implementation techniques for each of these components are presented. For this presentation we draw from the literature and from our own experimental search engine testbed. Emphasis is on introducing the fundamental concepts and the results of several performance analyses we conducted to compare different designs. Keywords: Authorities, crawling, HITS, indexing, information retrieval, link analysis, PageRank, search engine …
Summary. Process: 3 major iterations. 1.0: Basic display and browsing of all documents (10/page). 2.0: Basic display such that keywords were active filtering links. 3.0: Active author and keyword links; floating side menus (list of authors and paper frequency, # to display); basic search (by title, author or keyword) functionality; cloud of most common words. Speed Bumps: Dealing with non-alphanumeric ASCII characters. Figuring out the most appropriate term search algorithm. Version compatibility issues -> Wamp server. Positive: Gained valuable experience using popular web application development tools. Had fun designing! Research sparked interest in the Semantic Web.
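Three of the pieces named in this summary (the 10-per-page display, the cloud of most common words, and the handling of non-alphanumeric characters) can be sketched together. This is an assumed approach in Python, not the project's algorithm: the stop-word list, the function names, and the second and third sample titles are hypothetical.

```python
import re
from collections import Counter

# Assumed stop-word list; the slides do not say which words were excluded.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and split on anything that is not an ASCII letter or digit."""
    return [word for word in re.split(r"[^a-z0-9]+", text.lower()) if word]


def cloud_counts(titles, top_n=5):
    """Count the most common non-stop-words across titles, for sizing a word cloud."""
    counts = Counter()
    for title in titles:
        counts.update(w for w in tokenize(title) if w not in STOP_WORDS)
    return counts.most_common(top_n)


def paginate(items, page, per_page=10):
    """Return the items for a 1-based page number."""
    start = (page - 1) * per_page
    return items[start:start + per_page]


# The last two titles are made up for illustration.
titles = ["Searching the Web", "Crawling the Web: A Survey", "Web-Scale Indexing"]

print(cloud_counts(titles, top_n=1))
print(paginate(list(range(25)), page=3))
```

Splitting on non-alphanumeric characters sidesteps punctuation such as colons and hyphens, at the cost of also splitting on accented letters; a real system would need a deliberate choice there.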
Future Plans: Add tag assignment functionality. Integrate the system into the Semantic Web using RDF. Expand upon filtering functionality. Modularize code for use in other CS courses.
Sources: XML and PHP: xmldomphp/, Uses.html. PHP/Apache/MySQL PC download. jQuery.