Course Overview: An Introduction to Information Retrieval and Applications J. H. Wang Feb. 22, 2012.

1 Course Overview: An Introduction to Information Retrieval and Applications J. H. Wang Feb. 22, 2012

2 Instructor & TA (IR, Spring 2012, NTUT CSIE)
Instructor
–J. H. Wang ( 王正豪 )
–Assistant Professor, CSIE, NTUT
–Office: R1534, Technology Building
–E-mail: jhwang@csie.ntut.edu.tw
–Tel: ext. 4238
–Office Hours: 9:00 am-12:00 pm, every Tuesday and Wednesday
TA
–Mr. Liu ( 劉瀚之 )
–R1424, Technology Building

3 Course Description
Course Web Page
–http://www.ntut.edu.tw/~jhwang/IR/
Time: 9:10 am-12:00 pm, Thu.
Classroom: R1322, Technology Building
Textbook:
–Christopher D. Manning, Prabhakar Raghavan and Hinrich Schuetze, Introduction to Information Retrieval, Cambridge University Press, 2008.
–Available online
–International Student Edition, imported by Kai-Fa ( 開發 ) Publishing
Prerequisites:
–Basic knowledge of data structures and algorithms, linear algebra, and probability theory
–Programming experience is *required* for homework and projects

4 Additional References
References:
–Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, Addison-Wesley, 2011. This is the second edition of their 1999 book Modern Information Retrieval. ( 華通 )
–Stefan Buettcher, Charles L.A. Clarke, and Gordon V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.
–Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison-Wesley, 2010. ( 全華 )

5 More Books on IR
Gerard Salton, Automatic Information Organization and Retrieval, McGraw-Hill, 1968.
Gerard Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, 1983.
– Two classics, but out of print.
C. J. van Rijsbergen, Information Retrieval, Butterworths, 1979.
– The classic. More than 30 years old, but still worth reading.
K. Sparck Jones, P. Willett, Readings in Information Retrieval, Morgan Kaufmann, 1997.
– A collection of classical IR papers. (out of print)
I.H. Witten, A. Moffat, T.C. Bell, Managing Gigabytes, 2nd edition, Morgan Kaufmann, 1999.
– The authority on index construction and compression.

6 Grading Policy
Homework assignments and programming exercises: 40%
Mid-term exam: 25%
Term project: 35%
–Including the proposal and final report

7 Programming Exercises and Term Project
About 3 programming exercises
–Team-based (at most 2 persons per team)
–You can either write your own code or reuse existing open source code
The term project
–Either team-based system development (the same as programming exercises)
–Or academic paper presentation
  Only one person per team allowed
–A proposal is required before midterm (Apr. 12, 2012)

8 About the Term Project
The score you get depends on the difficulty and quality of your project
–For system development:
  System functions and correctness
–For academic paper presentation:
  Quality of the paper and of your presentation
  Major methods/experimental results *must* be presented
  Papers from top conferences are strongly suggested
  –E.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …
Proposals are *required* for each team, and will be counted in the score

9 Online Submission
Submission instructions
–Programs, project proposals, and project reports must be submitted as electronic files to the TA online at: http://140.124.183.39/ir/
–Before submission:
  User name: Your student ID
  Please change your default password at your first login

10 What This Course Is NOT About
This course will NOT tell you
–The tips and tricks of using search engines, although power users might have better ideas on how to improve them
  There are plenty of books and websites on that…
–How to find books in libraries, although it’s somewhat related to the basic IR concepts
–How to make money on the Web, although the currently largest search engine did it

11 What’s Information Retrieval?

12 On Wikipedia

13 On Google Images

14 On Google Video Search

15 On Google News (TW)

16 On Google News (US)

17 On Blogs

18 On Google Translate…

19 Or More Related Keywords
NBA
New York Knicks
Linsanity
…

20 What if We Search in Chinese

21 And More…
紐約尼克 (New York Knicks)
哈佛 (Harvard)
台裔球員 (player of Taiwanese descent)
…
And other languages…
And other search engines…
And social websites…

22 In Google Trends

23 And More…

24 And Other Keywords…

25 And Other Keywords…

26 Palanteer – TW Election

29 What Is Information Retrieval?
“Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)

30 Goal
Information retrieval (IR): a research field that aims at searching information in text and multimedia documents effectively and efficiently
In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications of IR
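
Retrieval effectiveness is commonly measured with precision and recall. The following is a minimal sketch of those two measures, not material from the slides; the function name `precision_recall` and the document IDs are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    retrieved: doc IDs returned by the system (hypothetical example data)
    relevant:  doc IDs judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data: 4 documents retrieved, 3 relevant, 2 in common.
p, r = precision_recall(retrieved=[1, 2, 5, 7], relevant=[2, 3, 5])
print(p, r)  # precision = 2/4, recall = 2/3
```

Precision asks how much of what was returned is useful; recall asks how much of what is useful was returned.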

31 A Big Picture

32 [Figure: IR system architecture. Components: User Interface, Text Operations, Query Expansion, Indexing, Retrieval, Ranking, Inverted Index, Document Collection. Flow labels: user need, text, query, user feedback, retrieved docs, ranked docs, doc representation (logical view), inverted file.]
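
The indexing and retrieval steps in this picture can be sketched in a few lines. This is a toy illustration under simplifying assumptions (whitespace tokenization, lowercasing only, AND-only queries); the function names and the three sample documents are hypothetical and do not come from the course materials.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of doc IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive whitespace tokenization
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def boolean_and(index, query):
    """Return the doc IDs that contain every term of the query."""
    postings = [set(index.get(term, [])) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical document collection.
docs = {
    1: "information retrieval and web search",
    2: "web search engines rank documents",
    3: "indexing documents for retrieval",
}
index = build_inverted_index(docs)
print(index["retrieval"])                         # [1, 3]
print(boolean_and(index, "web search"))           # [1, 2]
print(boolean_and(index, "retrieval documents"))  # [3]
```

A real system would add the text operations from the figure (stop-word removal, stemming) before indexing and a ranking step after retrieval.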

33 Topics
Text IR
–Indexing and searching
–Query languages and operations
Retrieval evaluation
Modeling
–Boolean model
–Vector space model
–Probabilistic model
Applications for IR
–Multimedia IR
–Web search
–Digital libraries
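
To give a feel for the vector space model listed above, here is a rough sketch that ranks documents by cosine similarity over tf-idf weights. It assumes raw term frequency and idf = log(N/df), which is only one of several common weighting schemes; all names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()  # naive whitespace tokenization


def tf_idf_vectors(docs):
    """Return sparse tf-idf vectors per document and the idf table."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for terms in tokenized.values() for term in set(terms))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = {
        doc_id: {term: tf * idf[term] for term, tf in Counter(terms).items()}
        for doc_id, terms in tokenized.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by decreasing similarity."""
    vectors, idf = tf_idf_vectors(docs)
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = [(doc_id, cosine(q_vec, vec)) for doc_id, vec in vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical document collection.
docs = {
    1: "information retrieval and web search",
    2: "web search engines rank documents",
    3: "indexing documents for retrieval",
}
ranking = rank("indexing retrieval", docs)
print([doc_id for doc_id, _ in ranking])  # [3, 1, 2]
```

Unlike the Boolean model, which only says whether a document matches, this produces a graded score, so partially matching documents such as document 1 still appear in the ranking.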

34 Organization of the Textbook
Basics in IR (focus)
–Inverted indexes for Boolean queries (Ch. 1-5)
–Term weighting and vector space model (Ch. 6-7)
–Evaluation in IR (Ch. 8)
Advanced Topics
–Relevance feedback (Ch. 9)
–XML retrieval (Ch. 10)
–Probabilistic IR (Ch. 11)
–Language models (Ch. 12)
Machine learning in IR (useful)
–Text classification (Ch. 13-15)
–Document clustering (Ch. 16-18)
Web Search
–Web crawling and indexes (Ch. 19-20)
–Link analysis (Ch. 21)

35 Pointers to Other Topics
Cross-language IR
Image, video, and multimedia IR
Speech retrieval
Music retrieval
User interfaces
Parallel, distributed, and P2P IR
Digital libraries
Information science perspective
Logic-based approaches to IR
Natural language processing techniques

36 Tentative Schedule
Before midterm
–Boolean retrieval (1 wk)
–Indexing (2 wks)
–Vector space model and evaluation (2 wks)
–Relevance feedback (1 wk)
–Probabilistic IR (2 wks)
After midterm
–Text classification (1-2 wks)
–Document clustering (1-2 wks)
–Web search (2 wks)
–Advanced topics: CLIR, IE, … (2 wks)
–Term Project Presentation (3 wks)

37 Generic Resources
Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval
Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html

38 Academic Resources
Journals
–ACM TOIS: Transactions on Information Systems
–JASIST: Journal of the American Society for Information Science and Technology
–IP&M: Information Processing and Management
–IEEE TKDE: Transactions on Knowledge and Data Engineering
Conferences
–ACM SIGIR: International ACM SIGIR Conference on Research and Development in Information Retrieval
–WWW: World Wide Web Conference
–ACM CIKM: Conference on Information and Knowledge Management
–JCDL: ACM/IEEE Joint Conference on Digital Libraries
–ACM WSDM: International Conference on Web Search and Data Mining
–TREC: Text Retrieval Conference

39 Thanks for Your Attention!

