Course Overview: An Introduction to Information Retrieval and Applications J. H. Wang Feb. 17, 2014.


IR, Spring 2014, NTUT CSIE

Instructor & TA
Instructor
– J. H. Wang
– Associate Professor, CSIE, NTUT
– Office: R1534, Technology Building
– Tel: ext
– Office hours: 9:00 am-12:00 noon, every Tuesday and Thursday
TA
– Mr. Huang (R1424, Technology Building)
– Available time: Mon. morning or Tue. afternoon
– gmail.com

Course Description
Course Web Page: for the latest announcements and updates to the schedule, slides, and homework
Time: 9:10 am-12:00 noon, Fri.
Classroom: R334, Technology Building
Textbook:
– Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schuetze, Introduction to Information Retrieval, Cambridge University Press
– Available online
– International Student Edition, imported by Kai-Fa Publishing
Prerequisites:
– Basic knowledge of data structures and algorithms, linear algebra, and probability theory
– Programming experience is *required* for homework and projects

Target Audience
Seniors
Graduate students
IGPEECS (International Graduate Program in Electrical Engineering and Computer Science)

Additional References
References:
– Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, Addison-Wesley. This is the second edition of their book Modern Information Retrieval.
– Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison-Wesley.
– Stefan Buettcher, Charles L.A. Clarke, and Gordon V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.

More Books on IR
Gerald Salton, Automatic Information Organization and Retrieval, McGraw-Hill.
Gerald Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill.
– Two classics, but out of print.
C. J. van Rijsbergen, Information Retrieval, Butterworths, 1979.
– The classic. More than 30 years old, but still worth reading.
K. Sparck Jones and P. Willett, Readings in Information Retrieval, Morgan Kaufmann, 1997.
– A collection of classical IR papers (out of print).
I.H. Witten, A. Moffat, and T.C. Bell, Managing Gigabytes, 2nd edition, Morgan Kaufmann.
– The authority on index construction and compression.

Grading Policy
Homework assignments and programming exercises: ~40%
Mid-term exam: ~25%
Term project: ~35%
– Including proposal, presentation, and final report

Programming Exercises and Term Project
About 3 programming exercises
– Team-based, with a maximum number of students per team: 4 for undergraduates, 2 for graduate students
– You can either write your own code or reuse existing open source code
The term project
– Either team-based system development (same team rules as the programming exercises)
– Or academic paper presentation (only one person per team allowed)
– A proposal is *required* before the midterm (Apr. 11, 2014)

About the Term Project
The score you get depends on the functions, difficulty, and quality of your project
– For system development: system functions and correctness
– For academic paper presentation: quality of the paper and of your presentation
  Major methods and experimental results *must* be presented
  Papers from top conferences are strongly suggested, e.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …
Proposals are *required* for each team, and will be counted in the score

Online Submission
Submission instructions
– Programs, project proposals, and project reports must be submitted as electronic files to the TA online via the submissions website
– FTP server: localhost
– User name & password: your student ID

What this Course is NOT about
This course will NOT tell you
– The tips and tricks of using search engines, although power users might have better ideas on how to improve them
  There are plenty of books and websites on that…
– How to find books in libraries, although it's somewhat related to the basic IR concepts
– How to make money on the Web, although the currently largest search engine did it

What's Information Retrieval?
Things that you have been doing all day!
– Searching for something interesting: Web, news, e-mail, image, video, …
– Asking for advice
– …
User interests are changing all the time…
– 2011: New Zealand Earthquake
– 2012: Jeremy Lin
– 2013: Meteor Russia
– 2014: ? (next slide)

What's Information Retrieval

In Google News

In Web Pages

In Wikipedia

In Google Images

Different keywords: Ukraine riots

More related keywords

What if We Search in Chinese?

Related Keywords
Ukraine
Ukraine riots
Ukraine crisis
Kiev
Protest
Truce
2014 Hrushevskoho Street riots
…

Related Keywords in Chinese
…
And this can go on:
– for other languages…
– and other search engines…
– and social websites…

In Google Trends

And Social Search…

How do I Know What People Care about?

What are People Searching for in Taiwan on that day?

What Is Information Retrieval?
Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information. (Salton, 1968)

Goal
Information retrieval (IR): a research field that aims at searching information in text and multimedia documents effectively and efficiently
In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications of IR

A Big Picture

[Diagram. Components: User Interface, Text Operations, Query Expansion, Indexing, Retrieval, Ranking, Inverted Index, Document Collection. Flow labels: user need, text, query, user feedback, logical view, doc representation, inverted file, retrieved docs, ranked docs.]
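
The flow named in the diagram (text operations, indexing into an inverted file, then retrieval against a query) can be sketched in a few lines. This is only a minimal illustration, not the course's reference implementation: the function names `tokenize`, `build_inverted_index`, and `boolean_and` are hypothetical, the three documents are made up from keywords that appear elsewhere in these slides, and ranking and query expansion are left out.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Text operations: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Indexing: map each term to the sorted list of doc IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def boolean_and(index, query):
    """Retrieval: return the doc IDs that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Toy document collection (assumed for illustration only).
docs = {
    1: "Ukraine riots in Kiev",
    2: "Kiev protest and truce",
    3: "Jeremy Lin basketball news",
}
index = build_inverted_index(docs)
hits = boolean_and(index, "Kiev riots")
print(hits)
```

Storing postings as sorted lists keeps the sketch close to the usual description of an inverted file; a real system would intersect the sorted lists directly instead of converting them to sets.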

Topics
Text IR
– Indexing and searching
– Query languages and operations
Retrieval evaluation
Modeling
– Boolean model
– Vector space model
– Probabilistic model
Applications for IR
– Multimedia IR
– Web search
– Digital libraries
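
To give a feel for the vector space model listed under Modeling, here is a small sketch that ranks documents by cosine similarity between TF-IDF vectors. It assumes one common weighting (raw term frequency times log(N/df)); other variants exist and the slides do not say which one the course uses. The names `tf_idf_vectors`, `cosine`, and `rank`, and the toy collection, are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = {
        doc_id: {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for doc_id, tokens in tokenized.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_id, score) pairs, best score first, ties broken by doc ID."""
    vectors, idf = tf_idf_vectors(docs)
    query_tf = Counter(query.lower().split())
    # Query terms unseen in the collection get weight 0.
    query_vec = {term: tf * idf.get(term, 0.0) for term, tf in query_tf.items()}
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in vectors.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy collection (assumed for illustration only).
docs = {
    1: "ukraine riots kiev",
    2: "kiev protest truce",
    3: "meteor russia",
}
ranking = rank("kiev riots", docs)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Unlike the Boolean model, every document gets a graded score, so a document that matches only some query terms can still be returned, just lower in the list.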

Organization of the Textbook
Basics in IR (focus)
– Inverted indexes for Boolean queries (Ch. 1-5)
– Term weighting and vector space model (Ch. 6-7)
– Evaluation in IR (Ch. 8)
Advanced Topics
– Relevance feedback (Ch. 9)
– XML retrieval (Ch. 10)
– Probabilistic IR (Ch. 11)
– Language models (Ch. 12)
Machine learning in IR (useful)
– Text classification (Ch )
– Document clustering (Ch )
Web Search
– Web crawling and indexes (Ch )
– Link analysis (Ch. 21)
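
As a pointer to what evaluation in IR involves, the sketch below computes set-based precision, recall, and F1 for a single query. The slides do not name specific measures, so treating these three as representative is an assumption; the function names and the example judgments are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example: the system returned docs 1-4, and docs 2, 4, 5 are relevant.
p, r = precision_recall([1, 2, 3, 4], [2, 4, 5])
print(p, r, f1_score(p, r))
```

Here two of the four returned documents are relevant (precision 2/4) and two of the three relevant documents were found (recall 2/3).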

Some Overlap with Other Fields
Text mining, Information Extraction
Machine Learning
Natural Language Processing
Social Network Analysis
…

Pointers to Other Topics
Cross-language IR
Image, video, and multimedia IR
Speech retrieval
Music retrieval
User interfaces
Parallel, distributed, and P2P IR
Digital libraries
Information science perspective
Logic-based approaches to IR
Natural language processing techniques
…

Tentative Schedule
Before midterm
– Boolean retrieval (1 wk)
– Indexing (2 wks)
– Vector space model and evaluation (2 wks)
– Relevance feedback (1 wk)
– Probabilistic IR (2 wks)
After midterm
– Text classification (1-2 wks)
– Document clustering (1-2 wks)
– Web search (2 wks)
– Advanced topics: CLIR, IE, … (2 wks)
– Term Project Presentation (3 wks)

Generic Resources
Wikipedia page on Information Retrieval: n_retrieval
Information Retrieval Resources: csli.stanford.edu/~hinrich/information-retrieval.html

Academic Resources
Journals
– ACM TOIS: Transactions on Information Systems
– JASIST: Journal of the American Society for Information Science and Technology
– IP&M: Information Processing and Management
– IEEE TKDE: Transactions on Knowledge and Data Engineering
Conferences
– ACM SIGIR: International ACM SIGIR Conference on Research and Development in Information Retrieval
– WWW: World Wide Web Conference
– ACM CIKM: Conference on Information and Knowledge Management
– JCDL: ACM/IEEE Joint Conference on Digital Libraries
– ACM WSDM: International Conference on Web Search and Data Mining
– TREC: Text Retrieval Conference

Teaching in English…
Slides and lectures will be offered mainly in English
To help domestic students follow, important concepts will be briefly summarized in Chinese

Thanks for Your Attention!
Any questions or comments?
Please feel free to send me e-mail or discuss with me at my office