Site-wide Search Upgrade and new features Jon Warbrick University of Cambridge Computing Service


1 Site-wide Search Upgrade and new features Jon Warbrick University of Cambridge Computing Service jw35@cam.ac.uk

2 Site-wide search web-search.cam.ac.uk

3 Site-wide search web-search.cam.ac.uk Ultraseek, from Infoseek

4 Site-wide search web-search.cam.ac.uk Ultraseek, from Infoseek -> Inktomi

5 Site-wide search web-search.cam.ac.uk Ultraseek, from Infoseek -> Inktomi -> Verity

6 Site-wide search web-search.cam.ac.uk Ultraseek, from Infoseek -> Inktomi -> Verity -> Autonomy

7 Site-wide search web-search.cam.ac.uk Ultraseek, from Infoseek -> Inktomi -> Verity -> Autonomy Currently indexing – ~600 servers – ~1.2 million documents – ~2.5 million URLs

8 Site-wide search Indexes 'more-or-less official' servers

9 Site-wide search Indexes 'more-or-less official' servers Maintains two indexes – 'internal' and 'external' – automatically routes queries

10 Site-wide search Indexes 'more-or-less official' servers Maintains two indexes – 'internal' and 'external' – automatically routes queries Services for University Webmasters – Add/delete/re-index – Packaged searches

11 2006 Upgrade Improved resilience

12 2006 Upgrade Improved resilience Case-inSenSITIVE matching

13 2006 Upgrade Improved resilience Case-inSenSITIVE matching Quick Links

14 2006 Upgrade Improved resilience Case-inSenSITIVE matching Quick Links

15 2006 Upgrade Improved resilience Case-inSenSITIVE matching Quick Links Passage-based summaries

16 2006 Upgrade Improved resilience Case-inSenSITIVE matching Quick Links Passage-based summaries

17 2006 Upgrade Improved resilience Case-inSenSITIVE matching Quick Links Passage-based summaries Grouping by location

18 2006 Upgrade Improved resilience Case-insensitive matching Quick Links Passage-based summaries Grouping by location

19 2006 Upgrade Improved resilience Case-inSenSITIVE matching Quick Links Passage-based summaries Grouping by location [ All terms matching ]

20 2006 Upgrade More indexing (dynamic pages + https + JavaScript)

21 2006 Upgrade More indexing (dynamic pages + https + JavaScript)

22 2006 Upgrade More indexing (dynamic pages + https + JavaScript)

23 2006 Upgrade More indexing (dynamic pages + https + JavaScript) Sources of indexing requests – s1.web-search.cam.ac.uk - s6.web-search.cam.ac.uk – an address in the range 192.153.213.0-255

24 2006 Upgrade More indexing (dynamic pages + https + JavaScript) Sources of indexing requests – s1.web-search.cam.ac.uk - s6.web-search.cam.ac.uk – an address in the range 192.153.213.0-255 Backup search engines – Add URL, Revisit Site, etc.
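A webmaster who wants to recognise the indexer in access logs or access rules could test the client address against the range quoted above. The following is a minimal sketch, assuming the range 192.153.213.0-255 is treated as the single network 192.153.213.0/24; the function name `is_indexer_address` is hypothetical and not part of any search-engine tooling.

```python
import ipaddress

# Assumption: the slide's range 192.153.213.0-255 is exactly this /24 network.
INDEXER_NETWORK = ipaddress.ip_network("192.153.213.0/24")


def is_indexer_address(client_ip: str) -> bool:
    """Return True if client_ip falls inside the indexer's address range.

    Hypothetical helper for illustration; malformed addresses return False.
    """
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Not a valid IPv4/IPv6 literal, so it cannot be the indexer.
        return False
    return address in INDEXER_NETWORK


if __name__ == "__main__":
    for candidate in ("192.153.213.17", "192.153.214.1", "not-an-ip"):
        print(candidate, is_indexer_address(candidate))
```

An address check like this only identifies where a request came from; whether to allow or restrict indexing for a given page remains a local policy decision.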

25 Problems with dynamic content

26
– Randomly permuted query arguments
– Gratuitously-varying detail
– Variant pages
– Calendars linking to other pages
– Cache-busting headers
– Frames hiding real URL
– Junk path info
– 'Success' error pages
– Lack of Last Modification time stamp
– Inconsistent URLs
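To illustrate the first problem in that list: if a dynamic site emits the same query arguments in varying order, an indexer sees several distinct URLs for one page. One way a site could avoid this is to emit links in a single canonical order. The sketch below is an illustration only, not something described in the talk; the function name `canonical_url` and the example.org URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return url with its query arguments sorted by name, then value.

    Hypothetical helper: two URLs that differ only in the order of
    their query arguments map to the same canonical string.
    """
    parts = urlsplit(url)
    # keep_blank_values=True preserves arguments such as "flag=".
    arguments = parse_qsl(parts.query, keep_blank_values=True)
    sorted_query = urlencode(sorted(arguments))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, sorted_query, parts.fragment)
    )


if __name__ == "__main__":
    first = canonical_url("http://example.org/page?b=2&a=1")
    second = canonical_url("http://example.org/page?a=1&b=2")
    print(first)
    print(first == second)
```

Sorting only addresses permuted arguments. The other items in the list, such as 'success' error pages or missing Last-Modified headers, need fixes in the application's responses rather than in its URLs.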

27 Further information Notes for webmasters: http://www.cam.ac.uk/cs/web-search/ Details of recent changes: http://www.cam.ac.uk/cs/web-search/changes-200608.html Help and advice: web-support@ucs.cam.ac.uk

28 If you have been, thanks for listening

29 I wonder if anyone will ask...

30 Why don't you use Google?

