Modern Information Retrieval Chapter 4 Query Languages.

Presentation transcript:

1 Modern Information Retrieval Chapter 4 Query Languages

2 The type of query the user might formulate is largely dependent on the underlying information retrieval model

3 Keyword-based querying: single-word queries
- text documents are treated as long sequences of words
- results are ranked by term frequency and inverse document frequency
- the exact positions where the query word appears may need to be output
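The ranking idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the function names `build_index` and `single_word_query`, the whitespace tokenizer, and the plain `tf * log(N / df)` weighting are all assumptions made for the example.

```python
import math
from collections import defaultdict

def build_index(docs):
    """Map each word to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        # Assumption: lowercase, whitespace-separated tokens.
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def single_word_query(index, n_docs, word):
    """Return (doc_id, score, positions) tuples, best score first."""
    postings = index.get(word.lower(), {})
    if not postings:
        return []
    # idf: words that occur in fewer documents get a larger weight.
    idf = math.log(n_docs / len(postings))
    results = [(doc_id, len(positions) * idf, positions)
               for doc_id, positions in postings.items()]
    return sorted(results, key=lambda r: (-r[1], r[0]))

docs = ["query languages for text retrieval",
        "text text retrieval",
        "boolean query"]
index = build_index(docs)
results = single_word_query(index, len(docs), "text")
```

For a single-word query the idf factor is the same for every matching document, so the order is decided by term frequency alone; idf starts to matter once several query words are combined.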

4 Context queries: searching for words near other words
- phrase query: a sequence of single-word queries whose words must appear consecutively; enhances retrieval
- proximity query: a sequence of single words with a maximum allowed distance between them; enhances the power of retrieval
- the words may or may not be required to appear in the same order as in the query
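A phrase query can be seen as a proximity query with maximum distance 1 and the order enforced. The sketch below illustrates that relationship under some assumptions: distance is counted in word positions, the text is split on whitespace, and the names `proximity_match` and `phrase_match` are hypothetical.

```python
def word_positions(tokens, word):
    """All positions at which `word` occurs in the token list."""
    return [i for i, t in enumerate(tokens) if t == word]

def proximity_match(text, words, max_distance=1, ordered=True):
    """True if each consecutive pair of query words occurs at most
    `max_distance` positions apart (in query order when `ordered`)."""
    tokens = text.lower().split()
    words = [w.lower() for w in words]
    if not words:
        return False

    def extend(prev_pos, remaining):
        # Try to place the next query word close enough to the previous one.
        if not remaining:
            return True
        for p in word_positions(tokens, remaining[0]):
            gap = p - prev_pos
            if not ordered:
                gap = abs(gap)
            if 1 <= gap <= max_distance and extend(p, remaining[1:]):
                return True
        return False

    return any(extend(p, words[1:]) for p in word_positions(tokens, words[0]))

def phrase_match(text, words):
    """A phrase is the special case: adjacent words, same order."""
    return proximity_match(text, words, max_distance=1, ordered=True)

text = "enhance the power of retrieval"
```

With this text, `phrase_match(text, ["power", "of"])` succeeds, while `["power", "retrieval"]` only matches once `max_distance` is raised to 2.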

5 Boolean queries
- e1 BUT e2: documents that satisfy e1 but not e2
- NOT e2: documents that do not satisfy e2
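If each subexpression is evaluated to a set of document identifiers, the two operators shown here reduce to set operations. The following is a rough sketch under that assumption; `matching_docs`, `but`, and `not_` are hypothetical helper names, not part of any library.

```python
def matching_docs(docs, term):
    """Set of ids of the documents that contain `term` as a whole word."""
    return {i for i, text in enumerate(docs)
            if term.lower() in text.lower().split()}

def but(e1, e2):
    """e1 BUT e2: documents satisfying e1 and not e2 (set difference)."""
    return e1 - e2

def not_(all_docs, e2):
    """NOT e2: complement of e2 with respect to the whole collection."""
    return all_docs - e2

docs = ["information retrieval", "information theory", "query languages"]
all_docs = set(range(len(docs)))
e1 = matching_docs(docs, "information")
e2 = matching_docs(docs, "retrieval")
```

Note the difference in cost: `but` only touches documents already selected by `e1`, whereas `not_` needs the full collection to build the complement.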

6 Pattern matching: data retrieval capabilities as enhanced tools for IR
types of patterns
- word: computer
- prefix of a word: comput
- suffix of a word: ter
- substring of a word: ute
- range formed by two words in lexicographical order: communication and computer
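The pattern types listed above map directly onto simple string tests against a vocabulary. The sketch below is illustrative only: the function `match_pattern`, the sample vocabulary, and the choice of inclusive range bounds are assumptions.

```python
def match_pattern(vocabulary, kind, *args):
    """Return the vocabulary words matching a pattern of the given kind."""
    if kind == "word":
        return [w for w in vocabulary if w == args[0]]
    if kind == "prefix":
        return [w for w in vocabulary if w.startswith(args[0])]
    if kind == "suffix":
        return [w for w in vocabulary if w.endswith(args[0])]
    if kind == "substring":
        return [w for w in vocabulary if args[0] in w]
    if kind == "range":
        low, high = args
        # Assumption: both bounds are inclusive; Python compares
        # strings lexicographically by code point.
        return [w for w in vocabulary if low <= w <= high]
    raise ValueError(f"unknown pattern kind: {kind}")

vocabulary = ["communication", "compiler", "computer", "commuter", "database"]
```

For example, `match_pattern(vocabulary, "range", "communication", "computer")` keeps every word that sorts between the two bounds, which here excludes only `database`.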

7 word with an error threshold (example: computeers)
- edit distance: the minimum number of character insertions, deletions, and replacements needed to make the query and the target equal
- application area: computational biology
- unit cost edit distance: $w(a \to b) = 1$ for $a \neq b$ (replacement); $w(a \to \varepsilon) = w(\varepsilon \to b) = 1$ (deletion and insertion)
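Searching for a word with an error threshold can be sketched as filtering a vocabulary by unit cost edit distance. The code below is a minimal illustration: the names `unit_cost_distance` and `words_within_threshold`, the sample vocabulary, and the linear scan over all words are assumptions, and a real system would likely use an index instead of a scan.

```python
def unit_cost_distance(a, b):
    """Edit distance where insertion, deletion, and replacement cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # replace or keep
        prev = curr
    return prev[-1]

def words_within_threshold(query, vocabulary, k):
    """Vocabulary words whose distance to the query is at most k."""
    return [w for w in vocabulary if unit_cost_distance(query, w) <= k]

vocabulary = ["computeers", "computer", "commuters", "compiler"]
hits = words_within_threshold("computers", vocabulary, 1)
```

With threshold 1, the query `computers` retrieves the misspelling `computeers` (one insertion) along with `computer` and `commuters`, but not `compiler`.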

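8 Example: given the two strings S1 = abac and S2 = aaccb, compute the edit distance by the dynamic programming method
- the table goes from x = S1 (columns) to y = S2 (rows); moves: H = delete, V = insert, C = replace
- the edit distance is 3 (bottom-right entry)

          a  b  a  c
       0  1  2  3  4
    a  1  0  1  2  3
    a  2  1  1  1  2
    c  3  2  2  2  1
    c  4  3  3  3  2
    b  5  4  3  4  3

- one optimal sequence (1 deletion, 2 insertions): abac, aac (delete b), aacc (insert c), aaccb (insert b)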
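The table on this slide can be reproduced with a short dynamic programming routine. This is a sketch that assumes the same layout as the slide (columns follow x, rows follow y) and unit costs; the function name `edit_table` is hypothetical.

```python
def edit_table(x, y):
    """Full dynamic programming table for transforming x into y.

    table[i][j] is the edit distance between x[:j] and y[:i].
    """
    rows, cols = len(y) + 1, len(x) + 1
    table = [[0] * cols for _ in range(rows)]
    for j in range(cols):
        table[0][j] = j  # delete the first j characters of x
    for i in range(rows):
        table[i][0] = i  # insert the first i characters of y
    for i in range(1, rows):
        for j in range(1, cols):
            delete = table[i][j - 1] + 1    # horizontal move (H)
            insert = table[i - 1][j] + 1    # vertical move (V)
            replace = table[i - 1][j - 1] + (x[j - 1] != y[i - 1])  # diagonal
            table[i][j] = min(delete, insert, replace)
    return table

table = edit_table("abac", "aaccb")
for row in table:
    print(" ".join(str(cell) for cell in row))
```

The bottom-right cell, `table[-1][-1]`, holds the edit distance between the two complete strings.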

9 regular expression
- a regular expression is a pattern built up from simple strings and the union, concatenation, and repetition operators
- example: pro(blem|tein)(s|ε)(0|1|2)*
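The example pattern can be tried out with Python's standard `re` module. In this sketch the ε alternative is written as an empty branch, `(s|)`, and `re.fullmatch` is used so that the whole word must match; the helper name `matches` and the test words are assumptions chosen for illustration.

```python
import re

# pro(blem|tein)(s|ε)(0|1|2)*  with ε written as an empty alternative
PATTERN = re.compile(r"pro(blem|tein)(s|)(0|1|2)*")

def matches(word):
    """True if the entire word is generated by the regular expression."""
    return PATTERN.fullmatch(word) is not None

for word in ["problem", "proteins", "problem02", "protein3", "pro"]:
    print(word, matches(word))
```

The first three words match; `protein3` fails because 3 is not among the allowed digits, and `pro` fails because one of the two union branches is mandatory.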

