Published by Erick Reed. Modified over 9 years ago.
1
CQL: “Common Query Language”. Ray Denenberg, March 2005
2
CQL’s Goals: Combine the simplicity and intuitiveness of Google searching with the expressive power of XQuery. Support very simple queries, and arbitrarily complex expressions as necessary. Example: search on “cat”.
3
cat
4
(That’s it. The whole query.)
5
Simple CQL Queries: cat; cat and dog; title = cat
6
Simple CQL Queries: cat (simplest); cat and dog (simple boolean); title = cat (index)
7
Simple CQL Queries: cat (simplest); cat and dog (simple boolean); title = cat (index); dc.title = cat (index qualified)
8
Boolean: cat and dog; cat or dog; cat not dog
9
Boolean: cat and dog; cat or dog; cat not dog; cat not dog and fish or frog
10
Boolean: cat not dog and fish or frog evaluates to: (((cat not dog) and fish) or frog)
11
Boolean: cat not dog and fish or frog evaluates to: (((cat not dog) and fish) or frog), not to: (cat not dog) and (fish or frog)
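The grouping on this slide follows from evaluating the booleans strictly left to right, with no operator precedence. A minimal Python sketch of that rule follows. It assumes single-word terms separated by spaces and no parentheses in the input; `parenthesize` is a hypothetical helper name, not part of CQL.

```python
BOOLEANS = {"and", "or", "not", "prox"}  # CQL reserved words


def parenthesize(query):
    """Group a flat boolean query left to right (no operator precedence)."""
    tokens = query.split()
    grouped = tokens[0]
    # The remaining tokens come in (operator, operand) pairs.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        if op.lower() not in BOOLEANS:
            raise ValueError(f"expected a boolean operator, got {op!r}")
        grouped = f"({grouped} {op} {operand})"
    return grouped


print(parenthesize("cat not dog and fish or frog"))
# prints: (((cat not dog) and fish) or frog)
```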
12
Index Search: title = cat
13
Qualified index: title = cat; dc.title = cat; bib.title = cat; Bath.keyTitle; Bath.
14
Fielded/Index Search: dc.title = cat; bib.title = cat
15
dc.title: A name given to the resource. bib.title (fictitious): A word, phrase, character, or group of characters, normally appearing in an item, that names the item or the work contained in it.
16
Zthes Indexes: zthes.nt=sauropod and zthes.bt=macronaria means narrower than sauropod but broader than macronaria.
17
Relations
18
Search Clause: The triple index, relation, search term is called a search clause (e.g. title = cat)
19
Relations
20
Simple Relations: title = "the complete dinosaur"; title all "complete dinosaur"; title any "dinosaur bird reptile"; title exact "the complete dinosaur"
21
The = relation: title = "the complete dinosaur" (find these three words, adjacent and in this order)
22
title = "the complete dinosaur" matches “a day in the life of the complete dinosaur” and “the complete dinosaur goes to Paris”
23
=: title = "the complete dinosaur" matches “a day in the life of the complete dinosaur” and “the complete dinosaur goes to Paris” but not “the complete and unabridged dinosaur”
24
All: title all "complete dinosaur" matches “the complete and unabridged dinosaur”; does not match “the unabridged dinosaur”
25
title all "dinosaur bird reptile" does not match “the complete dinosaur”
26
Any: title any "dinosaur bird reptile" does match “the complete dinosaur” and “the unabridged dinosaur”
27
Exact: title exact "the complete dinosaur" matches “the complete dinosaur”
28
Exact: title exact "the complete dinosaur" matches “the complete dinosaur”. Does not match: “a day in the life of the complete dinosaur” or “the complete dinosaur goes to Paris” or “the complete and unabridged dinosaur”
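The four relations above can be sketched as small Python predicates over word lists. This is only an illustration: it assumes case-insensitive, whitespace-delimited words, and the function names are hypothetical. A real server decides for itself what counts as a word and how "exact" compares strings.

```python
def words(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def rel_adjacent(field, term):
    """'=' with a multi-word term: all words, adjacent and in order."""
    f, t = words(field), words(term)
    return any(f[i:i + len(t)] == t for i in range(len(f) - len(t) + 1))


def rel_all(field, term):
    """'all': every word of the term occurs somewhere in the field."""
    return set(words(term)) <= set(words(field))


def rel_any(field, term):
    """'any': at least one word of the term occurs in the field."""
    return bool(set(words(term)) & set(words(field)))


def rel_exact(field, term):
    """'exact': the whole field equals the term (after normalization)."""
    return words(field) == words(term)


print(rel_adjacent("the complete and unabridged dinosaur", "the complete dinosaur"))  # False
print(rel_all("the complete and unabridged dinosaur", "complete dinosaur"))           # True
```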
29
Relations … observations
30
Observation 1: Shorthand
31
title all "old man sea" is the same as title="old" and title="man" and title="sea"
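The shorthand can be expanded mechanically. A minimal sketch, assuming whitespace-separated words; `expand_shorthand` is a hypothetical helper, and the "any" case (joined with "or") is an assumption drawn from the definition of "any" on the earlier slide.

```python
def expand_shorthand(index, relation, term):
    """Rewrite an 'all' or 'any' clause as explicit boolean clauses."""
    joiner = {"all": "and", "any": "or"}[relation]
    return f" {joiner} ".join(f'{index}="{word}"' for word in term.split())


print(expand_shorthand("title", "all", "old man sea"))
# prints: title="old" and title="man" and title="sea"
```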
32
Relations … observations. Observation 2: Anchoring. ^ is the anchor character.
33
Recall: title = "the complete dinosaur" matches “a day in the life of the complete dinosaur”
34
Anchoring: title="^the complete dinosaur" would not match “a day in the life of the complete dinosaur”
35
Anchoring: title="^the complete dinosaur" would not match “a day in the life of the complete dinosaur”; title="the complete dinosaur^" would not match “the complete dinosaur goes to Paris”
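A rough Python sketch of anchored phrase matching, under the same simplifying assumptions as before (case-insensitive, whitespace-delimited words). It only handles an anchor at the very start or very end of the term and ignores escaped carets; `matches_phrase` is a hypothetical name.

```python
def matches_phrase(field, term):
    """'=' relation on a phrase, honoring a leading and/or trailing ^ anchor."""
    term = term.strip()
    at_start = term.startswith("^")
    at_end = term.endswith("^")
    t = term.strip("^").lower().split()
    f = field.lower().split()
    # Every position where the phrase occurs as adjacent words.
    hits = [i for i in range(len(f) - len(t) + 1) if f[i:i + len(t)] == t]
    if at_start:
        hits = [i for i in hits if i == 0]
    if at_end:
        hits = [i for i in hits if i + len(t) == len(f)]
    return bool(hits)


print(matches_phrase("a day in the life of the complete dinosaur", "^the complete dinosaur"))  # False
print(matches_phrase("the complete dinosaur goes to Paris", "the complete dinosaur^"))         # False
```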
36
Relations … observations. Observation 3: Index and relation go together
37
Index and relation go together: cat; title = cat
38
Index and relation go together. Valid: cat; title = cat. Not valid: title cat; = cat
40
BNF: searchClause ::= '(' cqlQuery ')' | index relation searchTerm | searchTerm
41
Basic Relations … summary: title = "the complete dinosaur"; title all "complete dinosaur"; title any "dinosaur bird reptile"; title exact "the complete dinosaur"
42
A few more relations: < (less); > (greater); <= (less or equal); >= (greater or equal); = (see next); <> (not equal)
43
The = relation: = means word adjacency when the term is a list of words; equality otherwise.
44
Relation Modifiers: stem, relevant, fuzzy, phonetic
45
Stemming: title =/stem "these completed dinosaurs" matches “The Complete Dinosaur”.
46
Relevance: subject any/relevant "fish frog" would find records whose subject field included words like shark, tuna, coelacanth, toad, amphibian, etc.
47
Relation Modifiers: stem, relevant, fuzzy, phonetic
48
Fuzzy: Fuzzy means “be liberal in what you count as a match … details left to the server. Might include permutations of character order, off-by-one for numerical terms.” title =/fuzzy "sharlot simmins" might match “I am Charlotte Simmons”; telephoneNumber exact/fuzzy "303 441 1319"
49
Relation Modifiers: stem, relevant, fuzzy, phonetic
50
Phonetic: match words that sound the same, e.g. “hostel” might match “hostile”
51
Booleans: and, or, not
52
Booleans: and, or, not, proximity
53
And: cat and dog. Or: cat or dog. Not: cat not dog. Proximity: cat prox dog
54
And: cat and dog. Or: cat or dog. Not: cat not dog. Proximity: cat prox dog (roughly: “find cat near dog”)
55
Proximity: (chestnut prox "Cryphonectria parasitica") prox ("dutch elm" prox Ceratocystisulmi)
56
Proximity parameters: relation, distance, unit, ordering
57
Proximity parameters: relation, distance, unit, ordering. E.g. “Find cat in the same sentence as dog”: relation: less or equal; distance: 0; unit: sentence; ordering: unordered
58
relation ("<", ">", "<=", ">=", "=", "<>"; default "<="), distance (integer; default: 1 for word, zero otherwise), unit ("word", "sentence", "paragraph", or "element"; default "word"), ordering ("ordered" or "unordered"; default "unordered")
59
“Find cat in the same sentence as dog”: cat prox///sentence dog
60
“Find cat in the same sentence as dog”: cat prox///sentence dog, same as: cat prox/<=/0/sentence/unordered dog
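How the defaults fill in can be sketched in Python. The sketch assumes the four parameters are strictly positional (relation, distance, unit, ordering), which is what the fully spelled-out form and the later `prox/>/2//ordered` example suggest; under that reading the unit sits in the third slot, so a bare unit is written `prox///sentence`. `expand_prox` is a hypothetical helper.

```python
def expand_prox(op):
    """Spell out a prox operator, filling omitted parameters with defaults."""
    name, *given = op.split("/")
    if name.lower() != "prox":
        raise ValueError(f"not a prox operator: {op!r}")
    if len(given) > 4:
        raise ValueError("prox takes at most four parameters")
    # Pad missing trailing parameters with empty strings.
    relation, distance, unit, ordering = given + [""] * (4 - len(given))
    unit = unit or "word"
    relation = relation or "<="
    distance = distance or ("1" if unit == "word" else "0")
    ordering = ordering or "unordered"
    return "/".join([name, relation, distance, unit, ordering])


print(expand_prox("prox///sentence"))    # prints: prox/<=/0/sentence/unordered
print(expand_prox("prox/>/2//ordered"))  # prints: prox/>/2/word/ordered
```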
61
(chestnut prox///sentence "Cryphonectria parasitica") prox///paragraph ("dutch elm" prox///sentence Ceratocystisulmi)
62
(chestnut prox///sentence "Cryphonectria parasitica") prox///paragraph ("dutch elm" prox///sentence Ceratocystisulmi) (find chestnut in the same sentence as “Cryphonectria parasitica”, and “dutch elm” in the same sentence as Ceratocystisulmi, and both sentences in the same paragraph.)
63
(chestnut prox///paragraph "Cryphonectria parasitica") and ("dutch elm" prox///paragraph Ceratocystisulmi)
64
(chestnut prox///paragraph "Cryphonectria parasitica") and ("dutch elm" prox///paragraph Ceratocystisulmi) (find chestnut in the same paragraph as “Cryphonectria parasitica”, and “dutch elm” in the same paragraph as Ceratocystisulmi.)
65
cat prox/>/2//ordered hat retrieves “cat in the hat” but not “cat in hat” nor “hat on the cat”
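A small Python sketch of word-level proximity that reproduces this slide's example. It assumes distance is the difference between word positions (adjacent words are 1 apart), which is consistent with the three cases above but is still an assumption; `prox_words` and its parameters are hypothetical names.

```python
import operator

RELATIONS = {
    "<": operator.lt, ">": operator.gt, "<=": operator.le,
    ">=": operator.ge, "=": operator.eq, "<>": operator.ne,
}


def prox_words(text, left, right, relation="<=", distance=1, ordered=False):
    """True if some occurrence of `left` and `right` satisfies the prox test."""
    tokens = text.lower().split()
    compare = RELATIONS[relation]
    for i, a in enumerate(tokens):
        if a != left:
            continue
        for j, b in enumerate(tokens):
            if b != right or i == j:
                continue
            gap = j - i
            if ordered and gap < 0:
                continue  # `right` must come after `left`
            if compare(abs(gap), distance):
                return True
    return False


# cat prox/>/2//ordered hat
print(prox_words("cat in the hat", "cat", "hat", ">", 2, ordered=True))  # True
print(prox_words("cat in hat", "cat", "hat", ">", 2, ordered=True))      # False
print(prox_words("hat on the cat", "cat", "hat", ">", 2, ordered=True))  # False
```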
66
Pattern Matching: ? matches any single character; * matches any sequence of zero or more characters; ^ word-anchoring
67
Pattern Matching: ? matches any single character: c?t matches cat, cot, cut, but not coat or ct; c??t matches cart, but not cat or crypt. * matches any sequence of zero or more characters: c*t matches cat, coat, crypt and counterargument. ^ word-anchoring
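One way to try the masking rules is to translate a masked word into a regular expression. The following Python sketch handles only ? and * (plus backslash escapes) for a single word; anchoring with ^ is left out, and case-insensitive matching is an assumption. `mask_to_regex` and `mask_matches` are hypothetical names.

```python
import re


def mask_to_regex(term):
    """Translate a masked word (? and *) into a compiled regex."""
    out = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "\\" and i + 1 < len(term):
            out.append(re.escape(term[i + 1]))  # escaped char is literal
            i += 2
            continue
        if ch == "?":
            out.append(".")    # exactly one character
        elif ch == "*":
            out.append(".*")   # zero or more characters
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out), re.IGNORECASE)


def mask_matches(term, word):
    """True if the whole word matches the masked term."""
    return mask_to_regex(term).fullmatch(word) is not None


print(mask_matches("c?t", "cot"))    # True
print(mask_matches("c?t", "coat"))   # False
print(mask_matches("c*t", "crypt"))  # True
```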
68
Word Anchoring: title="^the complete dinosaur" matches “the complete dinosaur meets godzilla” but not “a day in the life of the complete dinosaur”; title="the complete dinosaur^" matches “a day in the life of the complete dinosaur” but not “the complete dinosaur meets godzilla”
69
Word Anchoring (any): title any "^cat ^dog rat" means title with cat at the beginning, or with dog at the beginning, or with rat anywhere.
70
Word Anchoring (any): title any "^cat ^dog rat" means title with cat at the beginning, or with dog at the beginning, or with rat anywhere. Matches ‘cat eats dog’, ‘dog eats hat’, ‘hat eats rat’ but not ‘hat eats dog’.
71
CQL Syntax: Reserved words: and, or, not, prox. Special characters: space ( ) = < > " /
72
Tokens: A token is a string that has no special characters, or any string at all enclosed by double quotes (except that the string cannot include a double quote unless it is escaped).
73
Escape Character: Backslash (\) escapes '*', '?', '"' and '^', as well as itself. "\"why not\?\" she said" results in the following token: "why not?" she said
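The quoting and escaping rule can be illustrated with a short Python sketch that turns a quoted token into its value. It simply drops each backslash and keeps the character after it, which reproduces the slide's example; a real parser would also need to remember which characters were escaped so that a literal ? is not later treated as a mask. `unquote` is a hypothetical name.

```python
def unquote(token):
    """Return the value of a double-quoted token, resolving backslash escapes."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("token must be enclosed in double quotes")
    body = token[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            out.append(body[i + 1])  # keep the escaped character as-is
            i += 2
        elif ch == '"':
            raise ValueError("unescaped double quote inside token")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(unquote(r'"\"why not\?\" she said"'))
# prints: "why not?" she said
```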
74
Context sets
75
Indexes, relations, relation modifiers, boolean modifiers
76
subject any/relevant "fish frog"
77
index; relation; relation modifier; search term
78
subject any/relevant "fish frog": index (subject), relation (any), relation modifier (relevant), search term ("fish frog"). Subject to context qualification.
79
dc.subject any/relevant "fish frog" (dc: context set)
80
dc.subject any/relevant "fish frog"
81
dc.subject any/rel.lr "fish frog"
82
lr: a specific relevance algorithm; rel: context set
83
dc.subject cql.any/rel.lr "fish frog" Context set
84
Example, fictitious relation “only”: depicts only "cat". Matching images would depict only a cat and nothing else. The same cat with a person would not match. (depicts: index; only: relation)
85
image.depicts image.only "cat" (the first image: context set for the index; the second image: context set for the relation)
86
Go back to: subject any/relevant "fish frog"
87
subject any/relevant "fish frog" or title any/relevant "cat dog"
88
subject any/relevant "fish frog" or/rel.mean title any/relevant "cat dog"
89
subject any/relevant "fish frog" or/rel.mean title any/relevant "cat dog" (mean: boolean modifier; rel: context set)
90
Defaults: Consider the query: cat. The server needs to turn that into a search clause, i.e. an index, relation, and search term. As it is, there’s only a search term.
91
cat becomes cql.serverChoice cql.scr cat. cql.serverChoice: default index. cql.scr: default context set and relation (scr: “server choice relation”).
92
Next, consider the query: title = cat
93
Next, consider the query: title = cat. The server needs to assign a context set to the index (title) and a context set to the relation (=).
94
Next, consider the query: title = cat. The server needs to assign a context set to the index (title) and a context set to the relation (=). Or to make it even more complicated…
95
Add a relation modifier: title =/relevant cat. The server needs to assign a context set to the index (title), a context set to the relation (=), and a context set to the relation modifier.
96
Default Context Sets: <>.title cql.=/cql.relevant cat. Default context set for the index is selected by the server. Default context set for the relation is ‘cql’. Default context set for the relation modifier is ‘cql’.
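The defaulting steps from the last few slides can be sketched in Python as follows. The server's default context set for indexes is shown as "dc" purely for illustration (the slides leave that choice to the server), and `with_set` and `qualify_clause` are hypothetical helpers.

```python
def with_set(name, default_set):
    """Prefix a context set unless the name already carries one."""
    return name if "." in name else f"{default_set}.{name}"


def qualify_clause(term, index=None, relation=None, modifiers=(), index_set="dc"):
    """Return the search clause with every context set made explicit.

    index_set is the server's default context set for indexes; "dc" is
    only an assumed example value.
    """
    if index is None and relation is None:
        # Bare term: the server supplies both index and relation.
        index, relation, index_set = "serverChoice", "scr", "cql"
    elif index is None or relation is None:
        raise ValueError("index and relation go together")
    # Relations and relation modifiers default to the 'cql' context set.
    parts = [with_set(relation, "cql")] + [with_set(m, "cql") for m in modifiers]
    return f"{with_set(index, index_set)} {'/'.join(parts)} {term}"


print(qualify_clause("cat"))                             # cql.serverChoice cql.scr cat
print(qualify_clause("cat", "title", "=", ["relevant"])) # dc.title cql.=/cql.relevant cat
```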
97
Additional relation modifiers: word: the term should be broken into words (according to the server's definition of a ‘word’). string: the term is a single item, and should not be broken up. isoDate: each item within the term conforms to ISO 8601. number: each item within the term is a number. uri: each item within the term is a URI. masked: (default modifier).
98
title any "cat dog" is the same as title any/word "cat dog"
99
title any "cat dog" is the same as title any/word "cat dog"; title exact "cat in the hat" is the same as title exact/string "cat in the hat"
100
title any "cat dog" is the same as title any/word "cat dog"; title exact "cat in the hat" is the same as title exact/string "cat in the hat"; title = "cat * hat" is the same as title =/masked "cat * hat"