Site Explorer Server: an integrated, client-server, query system for Web sites Giancarlo Bongiovanni, Flavio Fontana, Stefano Borghetti Dept. Of Computer.


1 Site Explorer Server: an integrated, client-server, query system for Web sites. Giancarlo Bongiovanni, Flavio Fontana, Stefano Borghetti. Dept. of Computer Science, University of Rome, La Sapienza. ENEA's Usability Lab.

2 Summary: Introduction; Information Retrieval Systems and keyword score; Search engines; Internet now and the future; Java; Site Explorer Server v2.0; Conclusion and experimental results; Future work.

3 Information search on the Internet. The Internet is the biggest and most widespread network: millions of heterogeneous users; billions of information sources provided by the Web; exponential growth in the number of Web sites; growing network access by end users; growing Web browser functionality; growing search engine performance. 33 million in the United States, 1.3 million in Germany, 371 thousand in Italy. Users who have been on the Internet for more than 3 years are only 11%. 158 million accesses in January 99. A forecast of 200 million in 2000.

4 The problem of searching for information on the Web. Issue: could information search on the Internet be a problem for particular types of users? User problems related to information search: many users do not know the Web information model; users have trouble finding a valid tool able to locate the relevant information; users have trouble describing the information they are looking for in correct and concise terms; users have trouble using advanced search tools (e.g. Site Explorer Server is harder to use than a browser). Today: a better scenario.

5 User requirements analysis. Site Explorer v1.1. IR: implementation of a client/server tool, written in Java, that performs Web information retrieval; experimented and tested at ENEA. Tool integrated with the browser. Network service. New search and exploration tools. A new Web approach, alternative to the traditional browser.

6 Information Retrieval Systems: general structure. Diagram labels: User; Documents; Query; Indexing; Similar; Data structure in a pre-defined language; Result. Stages: query formulation by the user; indexing process; result formulation. Source: Gerard Salton, Introduction to Modern Information Retrieval, 1983, McGraw-Hill, Inc.

7 Information Retrieval Systems: query formulation. A query is a list of terms that expresses and summarizes the topic being searched. Boolean systems combine the terms using Boolean operators: and, or, andnot. Examples: Information and retrieval; Information or retrieval; Information andnot retrieval. Extended Boolean systems use additional operators: term proximity, term truncation, search restricted to a particular field. Examples: Information adj retrieval; Inform*; Information [in title]. In ranking systems the query is formulated using natural-language phrases. Example: Human influence in Information Retrieval systems.
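The Boolean and truncation operators on slide 7 map naturally onto set operations. The following is a minimal sketch in Python (not the Java used by Site Explorer Server); the toy collection DOCS and the function names are assumptions made for illustration only.

```python
# Hypothetical toy collection: document id -> text.
DOCS = {
    1: "information retrieval systems",
    2: "information theory",
    3: "retrieval of web pages",
}


def postings(term):
    """Return the set of document ids whose text contains the exact term."""
    return {doc_id for doc_id, text in DOCS.items() if term in text.split()}


def q_and(a, b):
    """a and b: documents containing both terms."""
    return postings(a) & postings(b)


def q_or(a, b):
    """a or b: documents containing at least one of the terms."""
    return postings(a) | postings(b)


def q_andnot(a, b):
    """a andnot b: documents containing a but not b."""
    return postings(a) - postings(b)


def q_prefix(prefix):
    """Truncation (e.g. inform*): documents with a word starting with prefix."""
    return {
        doc_id
        for doc_id, text in DOCS.items()
        if any(word.startswith(prefix) for word in text.split())
    }
```

For instance, q_andnot("information", "retrieval") keeps only the documents that mention "information" without "retrieval".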

8 Information Retrieval Systems: indexing. Indexing is the process of analysing documents to produce a short representation of their contents. Terms vector: the representation is based on a keyword vector; the keywords are chosen manually or extracted automatically. Example: Information Retrieval Data Structure & Algorithms. Data structures: a data structure holds the document representations. Example: list, tree, index file, etc. Example: inverted indexing, a file in which every record lists, for a particular term, the related records.
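As an illustration of the inverted indexing mentioned on slide 8, here is a small Python sketch. It assumes whitespace tokenization and lower-casing, which the slides do not specify; the sample documents and identifiers are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: terms are whitespace-separated, case-insensitive words.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical sample collection.
docs = {
    "d1": "Information Retrieval",
    "d2": "Data Structures",
    "d3": "Information Structures",
}
index = build_inverted_index(docs)
```

Looking up a term is then a single dictionary access, for example index["information"].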

9 Information Retrieval Systems: formulating and presenting the result. In a traditional IRS the result is a list of potentially relevant documents. Result order: documents ordered by relevance level; explicit measure of the relevance level (score). New features: dynamic presentation (result manipulation); graphic and direct method presentations; use of windows (different ways to present the results); multimedia integration. Sources: William B. Frakes, Ricardo Baeza-Yates, Information Retrieval: Data Structures & Algorithms, 1992, Prentice Hall, Inc.; Gerard Salton, Introduction to Modern Information Retrieval, 1983, McGraw-Hill, Inc.

10 Information Retrieval Systems: score computation. Score computation aims to measure the relevance of specific terms in specific documents. Key points in score computation: a method to weight term relevance across the whole document collection, example: (Sparck Jones, 1972), (Dennis, 1967); frequency normalization for a particular document collection, example: (Croft, 1983), (Harman, 1986). Computing a term weight for a document: term frequency in the document * term relevance weight in the collection. Computing the score: Boolean systems use the SOP method; ranking systems use a particular formula.
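The term weight on slide 10 (term frequency in the document times the term's relevance weight in the collection) can be sketched as below. This is only one common variant: the collection weight is assumed to be the inverse document frequency log(N / df), which may differ from the exact formulas in the cited works, and the function name tf_idf is illustrative.

```python
import math


def tf_idf(term, doc_tokens, collection):
    """Weight of term in one document: tf * log(N / df).

    doc_tokens is a list of tokens; collection is a list of token lists.
    Assumption: idf = log(N / df), a common variant, not the slides' own formula.
    """
    tf = doc_tokens.count(term)                        # frequency in this document
    df = sum(1 for doc in collection if term in doc)   # documents containing the term
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(collection) / df)


# Hypothetical three-document collection.
collection = [
    ["web", "site", "web"],
    ["site", "search"],
    ["java", "applet"],
]
```

A term that is frequent in one document but rare in the collection ("web" above) gets a higher weight than one spread across many documents ("site").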

11 Search engines. Diagram labels: Web interface (query and results); Index DB; automatic indexing system; SIMILAR; Web pages. New functionality in the most popular search engines: site classification; integration of new advanced search services for information in particular formats (pictures, sounds, MP3, etc.); few search engines provide a document score; migration from search service to on-line seller guides. Source: Media Matrix - June 1999.

12 Internet: from thirty years ago to today. 30 years: 1969 first transmission on ARPANET; 1978 TCP/IP made official; 1983 NSFNet; 1991 World Wide Web; 1992 ISOC; 1999 Inet99. Source: FIND/ITPD, III, January; NII project, supported by DOIT, MOEA.


16 Internet: towards tomorrow. Future tracks: research and technologies; education; public administration; e-commerce.

17 Site Explorer Server v2.0: Java. Main features: multithreaded; object-oriented; dynamic; portable; platform-independent. Technologies: applets; rich networking functionality; oriented to implementing graphical user interfaces; oriented to implementing client/server systems (client, server).

18 Site Explorer Server v2.0: goals. To implement a new system: able to work directly on the Web; able to help the user find interesting documents on the Web; able to integrate search functions, an approach alternative to the browser, management functions and the user position; able to access heterogeneous Web data in a single, uniform way; with a high degree of usability.

19 Site Explorer Server v2.0. A client/server system, implemented in Java, that automatically analyses a Web site and returns, as its result, the tree structure of the site, where the root node represents the site's home page. Focused on information search and retrieval through a keyword-search approach. Additional features: an easy information-filtering service; a score computation service; user management. A network service; an accessible (open to everybody), open and multi-platform service. Diagram labels: Client (user interface); Site Explorer Server; Internet; Web site.
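The site tree described on slide 19, rooted at the home page, could be derived from a site's link graph roughly as follows. This Python sketch assumes a breadth-first traversal in which each page is attached under the first page found linking to it; the slides do not say which traversal the server uses, and the link map here is hypothetical pre-extracted data.

```python
from collections import deque


def build_site_tree(home, links):
    """Return a dict mapping each reachable page to its list of child pages.

    links maps a page to the pages it links to. Each page appears once in
    the tree, under the first page that was found linking to it.
    """
    tree = {home: []}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in tree:        # first visit: attach as a child
                tree[target] = []
                tree[page].append(target)
                queue.append(target)
    return tree


# Hypothetical link map for a four-page site.
links = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c", "/"],
    "/b": ["/c"],
}
tree = build_site_tree("/", links)
```

Links back to already visited pages (such as "/a" linking to "/") are ignored, so the result is always a tree even when the site graph has cycles.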

20 Site Explorer Server v2.0: external architecture and configuration. Technical features: client/server system; the server (SES) is a Java application; the client (SEJA) is a Java applet; SES and SEJA communicate using a dedicated application-layer protocol (SEP). Diagram labels: Internet; Web site #1, Web site #2, ... Web site #n; HTTP; SEP; HTTP Server (SES); SEC; Browser (SEJA); SEJA applet; User 1 (Windows), User 2 (Unix), User 3 (Mac OS), ... User m.

21 Site Explorer Server v2.0: operation and processes. Processes: query selector; HTTP connection; link extraction; content extraction; keyword analysis; score; result builder; result display. Diagram labels: User; client user interface; Query; Web sites; next site page; Result.
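The link extraction step listed on slide 21 might look like the sketch below, written with Python's standard html.parser and urllib.parse modules. It is not the authors' Java implementation: restricting results to links on the same host is an assumption, and the class and function names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, href))


def internal_links(html, base_url):
    """Return the links in html that stay on the same host as base_url."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    host = urlparse(base_url).netloc
    return [url for url in parser.links if urlparse(url).netloc == host]


# Hypothetical page content and URL.
page = (
    '<a href="/about.html">About</a> '
    '<a href="http://other.example/x">Elsewhere</a> '
    '<a href="news/index.html">News</a>'
)
found = internal_links(page, "http://site.example/home/index.html")
```

The external link is dropped, and the two internal links are returned as absolute URLs ready for the next HTTP connection.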

22 Site Explorer Server v2.0: SES subcomponents. Components: Main; Communicator; Function manager; Retriever; User manager; Site analyser; Page analyser. Diagram labels: connection request (client); query (client); results (client); Internet. Features: full-text document analysis; link checking using connection requests; HTML 4 oriented.

23 Site Explorer Server v2.0: the Site Explorer Server score. Three score levels. Level 1 score: based only on the keyword occurrences inside the Web page. Level 2 score: also based on the keyword distribution across the whole Web site. Level 3 score: also based on the position of the keyword occurrences inside the Web page structure.
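The slides do not give the formulas behind the three score levels, so the following Python sketch is purely hypothetical. It assumes level 1 is a raw occurrence count, level 2 multiplies that count by an idf-style weight over the site's pages, and level 3 additionally weights occurrences by where they appear in the page; the POSITION_WEIGHT values and all names are invented for illustration.

```python
import math

# Hypothetical weights for where a keyword occurs in the page structure.
POSITION_WEIGHT = {"title": 3.0, "heading": 2.0, "body": 1.0}


def level1(keyword, page):
    """Level 1: count keyword occurrences anywhere in the page.

    page maps a structural part ("title", "heading", "body") to a token list.
    """
    return sum(tokens.count(keyword) for tokens in page.values())


def site_weight(keyword, site):
    """Assumed idf-style weight: keywords found on fewer pages weigh more."""
    df = sum(1 for page in site if level1(keyword, page) > 0)
    return math.log(len(site) / df) + 1.0 if df else 0.0


def level2(keyword, page, site):
    """Level 2: level 1 scaled by the keyword's distribution over the site."""
    return level1(keyword, page) * site_weight(keyword, site)


def level3(keyword, page, site):
    """Level 3: as level 2, but each occurrence is weighted by its position."""
    positional = sum(
        POSITION_WEIGHT[part] * tokens.count(keyword)
        for part, tokens in page.items()
    )
    return positional * site_weight(keyword, site)


# Hypothetical two-page site.
home = {"title": ["java"], "heading": [], "body": ["java", "applet"]}
news = {"title": ["news"], "heading": [], "body": ["news"]}
site = [home, news]
```

Under these assumptions a keyword in the page title raises the level 3 score more than the same keyword in the body, while the level 1 score treats both occurrences alike.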

24 Site Explorer Server v2.0: Site Explorer Java Applet GUI. Screen labels: tree structure area; retrieved object in Web site; multimedia area; textual area; displayed result; menu bar; toolbar; state indicator; state bar.

25 Site Explorer Server v2.0 Site Explorer Java Applet GUI

26 Site Explorer Server v2.0 Site exploration: connecting to the server

27 Site Explorer Server v2.0 Site exploration: active connection indicator

28 Site Explorer Server v2.0 Site exploration: new site analysis request

29 Site Explorer Server v2.0 Site exploration: use of a favorite site analysis request

30 Site Explorer Server v2.0 Site exploration: use of a pre-defined site analysis request

31 Site Explorer Server v2.0 Site exploration Receiving result

32 Site Explorer Server v2.0 Site exploration: results navigation. Relevant page indicator; score level.

33 Site Explorer Server v2.0 Site exploration Results browsing

34 Site Explorer Server v2.0 Site exploration

35 Site Explorer Server v2.0: the pilot center. The Usability Lab (Ulab) was established in 1992 at the pilot center of the ESPRIT III VENUS project and carries out research and development in the field of advanced visual interfaces to databases and networked multimedia information systems. Development and test machines: Intel Pentium II 350 MHz / Windows 98 (Netlab); Intel Pentium MMX 166 MHz / Windows 95 (Fontanaulab); AMD K6 300 MHz / Windows 98 (Ulab); Sun SPARCstation 5 / Unix Solaris 2.5 (Venus); Sun SPARCstation 10 / Unix Solaris 2.5 (Dafne). Software tools: JDK v1.1.6, JDK v1.1.7, JDK v1.1.7a, JDK v1.1.7b, JDK Edit+, Netbeans, Java Swing v1.0.3, Java Media Framework v1.1.

36 Site Explorer Server v2.0: conclusion and experimental results. A robust system. A good to excellent degree of usability. A good response time (analysis and result building). 50 users selected using the ENEA/VENUS methodology: random users (occasional system use); professional users (system use related to their work); expert users.

37 Site Explorer Server v2.0: ENEA applications. G7 Global-Inventory project: a collection of project data cards; site search engine vs Site Explorer Server. Plus - Prosoma LinkUp Service: a collection of multimedia data cards. Experimental sites: ULAB sites. Future testing: Virtual Lab Site FAD.

38 Site Explorer Server v2.0 and other existing systems. Link exploration: LinkBot (link analysis). Applets for map navigation: SurfMap, JavaNavigator. Search within a site: PersonalSearch (an applet acting as a search engine for a site); Virgilio (search function on a site). Map navigation and search function: MerzeScope (an applet for navigating a graph, with a search function for a single site). Exploration and representation of a site: Site Explorer (builds a tree for a single site); HyperSystem Net40 (explores a site and gives a tree representation of it, allowing navigation).

39 Site Explorer Server v2.0: future work. A fully modular internal architecture, so that new modules and new functions can be added in the simplest and most dynamic way. A user profile system based on the user's interests, continuously updated through a feedback technique. A new system agent that performs automatic off-line Web site analysis and, using the user's profile information, suggests a set of queries on specific themes.

