C. Lee Giles David Reese Professor, College of Information Sciences and Technology Graduate Professor of Computer Science and Engineering Courtesy Professor.

1 C. Lee Giles
David Reese Professor, College of Information Sciences and Technology
Graduate Professor of Computer Science and Engineering
Courtesy Professor of Supply Chain and Information Systems
The Pennsylvania State University, University Park, PA, USA
giles@ist.psu.edu
http://clgiles.ist.psu.edu/IST441
Information Retrieval and Search Engines
IST 441
Introduction to course and search engines

2 Course homepage
Everything you need to know about the course
– http://clgiles.ist.psu.edu/IST441
– Or put IST441 into Google
Project
Exercises
Readings
Schedule
Participation
Exam

3 Professor C. Lee Giles
Intelligent and specialty search engines; cyberinfrastructure for science, academia and government; big data
– Modular, scalable, robust, automatic science and technology focused cyberinfrastructure and search engine creation and maintenance
– Large heterogeneous data and information systems
– Specialty science and technology search engines for knowledge discovery & integration: CiteSeerX (all scholarly documents, focus on computer science), ChemXSeer (e-chemistry portal), CollabSeer (collaboration search), CSSeer (expert finding)
Scalable intelligent tools/agents/methods/algorithms
– Information, knowledge and data integration
– Information and metadata extraction; entity recognition
– Chemical formulae & names, tables, and figures
– Unique search, knowledge discovery, information integration, data mining algorithms
– Expert and collaboration recommendation
– Research evaluation
http://clgiles.ist.psu.edu

4 What will be covered
What is information
– How much is there?
Properties of text
– Document models
Information retrieval (IR) systems and methods
– Query structures
– Evaluation and relevance
– Role of the user
– Vector models
– Inverted index
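
As an illustration of the last topic on this slide, here is a minimal sketch of an inverted index with boolean AND retrieval. It is not taken from the course materials: the function names, the whitespace tokenizer, and the three sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: naive whitespace tokenization, no stemming or stop words.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical toy collection used only for this sketch.
docs = {
    1: "search engines crawl the web",
    2: "information retrieval systems rank documents",
    3: "web search engines rank documents",
}
index = build_inverted_index(docs)
print(search(index, "search engines"))
print(search(index, "web rank"))
```

The index trades memory for speed: a query touches only the posting lists of its own terms instead of scanning every document.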

5 What will be covered
Search engines as IR systems and how they work
– Indexers
– Crawlers
– Ranking
– Evaluation
– SEO
Internet and Web
– Web structure
Semantic search
Google and link analysis
Social networks
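
The slide lists link analysis without naming an algorithm. One widely known example is PageRank, sketched below as a simple power iteration. This is an illustrative assumption rather than the course's own material: the function name, the damping factor of 0.85, the fixed iteration count, and the four-page link graph are all hypothetical choices.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to; every link
    target must also appear as a key of links.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the same baseline "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: C is linked to by A, B and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page with the most incoming links (C) ends up with the highest score, and the page with no incoming links (D) keeps only the baseline share.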

6 Approach
Readings and Lectures
– Exercises
– One exam
– Participation
Projects
– Build 2 specialty search engines for a customer (customer defines the project)
– Built with Nutch, YouSeer, Lucid Works (based on Solr/Lucene)
– Who uses Lucene
– Build a Google Custom Search Engine
» Comparison of these two
– Customer receives (reviews) search engine at the end of the semester
– Presentation on search engines built
– Report on search engine due at end of semester
Undergrads vs grads
Guest seminars

7 manyeyes visualization

9 Search gains on email
July 2008 Pew Internet Study
manyeyes visualization

10 Web Search Engine Use and Commerce Continue to Grow
Pew Internet & American Life Internet Project Survey: Sept, 2005
Search Engine News:
– Search engine advertising revenues exceed TV networks
– Walmart and other retailers express concern over Google
– FOG replaces FOM
http://www.pewinternet.org

11 Web Search Engine Use and Commerce Continue to Grow
Pew Internet & American Life Internet Project Survey: August, 2008
PewInternet

12 Web search engine use has new activities
Pew Internet & American Life Internet Project Survey: 2009
PewInternet

13 Web Search Engine Use and Commerce Continue to Grow
http://www.pewinternet.org

14 Search Engine Market Share http://www.pewinternet.org

15 Search Engine Market Share - US seoconsultants

16 2009 Search Engine Market Share - US seoconsultants

17 Search Engine Market Share - US

20 Search Engine Market Share

22 Dec 2012
2 billion internet users

23 Market share
Search engine market share seems to be debatable
ComScore global share

24 ComScore global share
Number of search engine queries - US
About 500M per day

26 2012

27 ComScore global share

28 ComScore global share
Number of search engine queries - US
About billion per day

29 Students who took this course
Google, Yahoo, Microsoft, Facebook, RIT, IBM, Tencent, Klout, eBay, Raytheon, Lockheed Martin, …
