Course Overview: An Introduction to Information Retrieval and Applications J. H. Wang Feb. 17, 2014.

1 Course Overview: An Introduction to Information Retrieval and Applications J. H. Wang Feb. 17, 2014

2 Instructor & TA (IR, Spring 2014, NTUT CSIE)
Instructor
–J. H. Wang
–Associate Professor, CSIE, NTUT
–Office: R1534, Technology Building
–E-mail: jhwang@csie.ntut.edu.tw
–Tel: ext. 4238
–Office hours: 9:00 am-12:00 pm, every Tuesday and Thursday
TA
–Mr. Huang (R1424, Technology Building)
–Available time: Mon. morning or Tue. afternoon
–E-mail: jsn900211 @ gmail.com

3 Course Description
Course web page: for the latest announcements and updates to the schedule, slides, and homework
–http://www.ntut.edu.tw/~jhwang/IR/
Time: 9:10 am-12:00 pm, Fri.
Classroom: R334, Technology Building
Textbook:
–Christopher D. Manning, Prabhakar Raghavan and Hinrich Schuetze, Introduction to Information Retrieval, Cambridge University Press, 2008.
–Available online
–International Student Edition, imported by Kai-Fa Publishing
Prerequisites:
–Basic knowledge of data structures and algorithms, linear algebra, and probability theory
–Programming experience is *required* for homework and projects

4 Target Audience
Seniors
Graduate students
IGPEECS (International Graduate Program in Electrical Engineering and Computer Science)

5 Additional References
References:
–Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, Addison-Wesley, 2011. This is the second edition of their 1999 book Modern Information Retrieval.
–Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison-Wesley, 2010.
–Stefan Buettcher, Charles L.A. Clarke, and Gordon V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.

6 More Books on IR
Gerard Salton, Automatic Information Organization and Retrieval, McGraw-Hill, 1968.
Gerard Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, 1983.
–Two classics, but out of print.
C. J. van Rijsbergen, Information Retrieval, Butterworths, 1979.
–The classic. More than 30 years old, but still worth reading.
K. Sparck Jones, P. Willett, Readings in Information Retrieval, Morgan Kaufmann, 1997.
–A collection of classical IR papers. (out of print)
I.H. Witten, A. Moffat, T.C. Bell, Managing Gigabytes, 2nd edition, Morgan Kaufmann, 1999.
–The authority on index construction and compression.

7 Grading Policy
Homework assignments and programming exercises: ~40%
Mid-term exam: ~25%
Term project: ~35%
–Including proposal, presentation, and final report

8 Programming Exercises and Term Project
About 3 programming exercises
–Team-based, with a maximum team size of 4 for undergraduates and 2 for graduate students
–You can either write your own code or reuse existing open source code
The term project
–Either team-based system development (same team rules as the programming exercises)
–Or an academic paper presentation (only one person per team allowed)
–A proposal is *required* before the midterm (Apr. 11, 2014)

9 About the Term Project
The score you get depends on the functions, difficulty, and quality of your project
–For system development: system functions and correctness
–For academic paper presentation: quality of the paper and your presentation of it
–Major methods and experimental results *must* be presented
–Papers from top conferences are strongly suggested, e.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …
Proposals are *required* for each team, and will be counted in the score

10 Online Submission
Programs, project proposals, and project reports must be submitted to the TA online as electronic files
Submission website: http://140.124.183.31/net2ftp
Submission instructions:
–FTP server: localhost
–User name & password: your student ID

11 What this Course is NOT about
This course will NOT tell you
–The tips and tricks of using search engines, although power users might have better ideas on how to improve them (there are plenty of books and websites on that…)
–How to find books in libraries, although it's somewhat related to the basic IR concepts
–How to make money on the Web, although the currently largest search engine did it

12 What's Information Retrieval?
Things that you have been doing all day!
–Searching for something interesting: Web, news, e-mail, image, video, …
–Asking for advice
–…
User interests are changing all the time…
–2011: New Zealand Earthquake
–2012: Jeremy Lin
–2013: Meteor Russia
–2014: ? (next slide)

13 What's Information Retrieval

14 In Google News

16 In Web Pages

17 In Wikipedia

18 In Google Images

19 Different keywords: Ukraine riots

21 More related keywords

23 What if We Search in Chinese

28 Related Keywords Ukraine Ukraine riots Ukraine crisis Kiev Protest Truce 2014 Hrushevskoho Street riots …

29 Related Keywords in Chinese
…
And this can go on:
–for other languages…
–and other search engines…
–and social websites…

30 In Google Trends

33 And Social Search…

34 How Do I Know What People Care About?

35 What Are People Searching for in Taiwan on That Day?

36 What Is Information Retrieval?
Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information. (Salton, 1968)

37 Goal
Information retrieval (IR): a research field that aims at searching for information in text and multimedia documents effectively and efficiently
In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications of IR

38 A Big Picture

39 Inverted Index
[Diagram: IR system architecture. Components: User Interface, Text Operations, Query Expansion, Indexing, Retrieval, Ranking, Document Collection. Flow labels: text, query, user need, user feedback, ranked docs, retrieved docs, doc representation, logical view, inverted file.]
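
The indexing and retrieval boxes of this diagram can be illustrated with a minimal sketch. It is not code from the course; the function names (`tokenize`, `build_inverted_index`, `boolean_and`) and the three toy documents are assumptions made for illustration, and the "text operation" here is just lowercasing and whitespace splitting.

```python
from collections import defaultdict


def tokenize(text):
    """Toy text operation: lowercase the text and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def boolean_and(index, query):
    """Return the IDs of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical three-document collection, for illustration only.
docs = {
    1: "Ukraine riots in Kiev",
    2: "Kiev protest truce",
    3: "New Zealand earthquake",
}
index = build_inverted_index(docs)
hits = boolean_and(index, "kiev ukraine")
```

A real system would add further text operations (stemming, stop-word removal) and a ranking step after retrieval; this sketch only returns the unranked set of matching documents.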

40 Topics
Text IR
–Indexing and searching
–Query languages and operations
Retrieval evaluation
Modeling
–Boolean model
–Vector space model
–Probabilistic model
Applications for IR
–Multimedia IR
–Web search
–Digital libraries
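
As a rough preview of the vector space model listed above, the sketch below ranks documents by cosine similarity between tf-idf vectors. It assumes one common weighting variant (raw term frequency times log(N/df)); the function names and the toy collection are hypothetical, not taken from the course materials.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build a sparse tf-idf vector (dict of term -> weight) per document."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))  # count each term once per document
    idf = {t: math.log(n_docs / count) for t, count in df.items()}
    vectors = {}
    for d, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[d] = {t: tf[t] * idf[t] for t in tf}
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by decreasing cosine similarity."""
    vectors, idf = tf_idf_vectors(docs)
    q_tf = Counter(query.lower().split())
    q_vec = {t: q_tf[t] * idf.get(t, 0.0) for t in q_tf}
    scores = {d: cosine(q_vec, vec) for d, vec in vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy collection, for illustration only.
docs = {
    1: "ukraine riots kiev",
    2: "kiev protest truce",
    3: "new zealand earthquake",
}
ranked = rank("ukraine riots", docs)
```

Unlike the Boolean model, which returns an unordered set of matches, this produces a graded ordering: only document 1 shares terms with the query, so it is ranked first and the others score zero.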

41 Organization of the Textbook
Basics in IR (focus)
–Inverted indexes for Boolean queries (Ch. 1-5)
–Term weighting and vector space model (Ch. 6-7)
–Evaluation in IR (Ch. 8)
Advanced Topics
–Relevance feedback (Ch. 9)
–XML retrieval (Ch. 10)
–Probabilistic IR (Ch. 11)
–Language models (Ch. 12)
Machine learning in IR (useful)
–Text classification (Ch. 13-15)
–Document clustering (Ch. 16-18)
Web Search
–Web crawling and indexes (Ch. 19-20)
–Link analysis (Ch. 21)

42 Some Overlap with Other Fields
Text mining, Information Extraction
Machine Learning
Natural Language Processing
Social Network Analysis
…

43 Pointers to Other Topics
Cross-language IR
Image, video, and multimedia IR
Speech retrieval
Music retrieval
User interfaces
Parallel, distributed, and P2P IR
Digital libraries
Information science perspective
Logic-based approaches to IR
Natural language processing techniques
…

44 Tentative Schedule
Before midterm
–Boolean retrieval (1 wk)
–Indexing (2 wks)
–Vector space model and evaluation (2 wks)
–Relevance feedback (1 wk)
–Probabilistic IR (2 wks)
After midterm
–Text classification (1-2 wks)
–Document clustering (1-2 wks)
–Web search (2 wks)
–Advanced topics: CLIR, IE, … (2 wks)
–Term project presentation (3 wks)

45 Generic Resources
Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval
Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html

46 Academic Resources
Journals
–ACM TOIS: Transactions on Information Systems
–JASIST: Journal of the American Society for Information Science and Technology
–IP&M: Information Processing and Management
–IEEE TKDE: Transactions on Knowledge and Data Engineering
Conferences
–ACM SIGIR: International Conference on Research and Development in Information Retrieval
–WWW: World Wide Web Conference
–ACM CIKM: Conference on Information and Knowledge Management
–JCDL: ACM/IEEE Joint Conference on Digital Libraries
–ACM WSDM: International Conference on Web Search and Data Mining
–TREC: Text Retrieval Conference

47 Teaching in English…
Slides and lectures will be offered mainly in English
To help domestic students follow along, important concepts will be briefly summarized in Chinese

48 Thanks for Your Attention!
Any questions or comments?
Please feel free to e-mail jhwang@csie.ntut.edu.tw or come and discuss with me at my office

