CQL “Common Query Language” Ray Denenberg March 2005.


1 CQL “Common Query Language” Ray Denenberg March 2005

2 CQL’s Goals: Combine the simplicity and intuitiveness of Google searching with the expressive power of XQuery. Support very simple queries, and arbitrarily complex expressions as necessary. Example: search on “cat”

3 cat

4 (That’s it. The whole query.)

5 Simple CQL Queries: cat; cat and dog; title = cat

6 Simple CQL Queries: cat (simplest); cat and dog (simple boolean); title = cat (index)

7 Simple CQL Queries: cat (simplest); cat and dog (simple boolean); title = cat (index); dc.title = cat (index qualified)

8 Boolean: cat and dog; cat or dog; cat not dog

9 Boolean: cat and dog; cat or dog; cat not dog; cat not dog and fish or frog

10 Boolean: cat not dog and fish or frog evaluates to: (((cat not dog) and fish) or frog)

11 Boolean: cat not dog and fish or frog evaluates to: (((cat not dog) and fish) or frog), not: (cat not dog) and (fish or frog)
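
The left-to-right evaluation shown on slides 10 and 11 can be sketched in a few lines of Python. This is only an illustration, assuming a flat query of single-word terms separated by boolean operators; the function name group_left_to_right is hypothetical and not part of CQL.

```python
BOOLEANS = {"and", "or", "not", "prox"}

def group_left_to_right(query: str) -> str:
    """Parenthesize a flat boolean query, nesting to the left with no precedence."""
    tokens = query.split()
    if len(tokens) % 2 == 0:
        raise ValueError("expected: term (operator term)*")
    grouped = tokens[0]
    # Consume (operator, operand) pairs; each pair wraps everything seen so far.
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        if op.lower() not in BOOLEANS:
            raise ValueError(f"expected a boolean operator, got {op!r}")
        grouped = f"({grouped} {op} {operand})"
    return grouped

print(group_left_to_right("cat not dog and fish or frog"))
# (((cat not dog) and fish) or frog)
```

To get the other grouping, the query has to carry explicit parentheses, as in (cat not dog) and (fish or frog).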

12 Index Search: title = cat

13 Qualified index title = cat dc.title = cat bib.title = cat Bath.keyTitle Bath.

14 Fielded/index Search: dc.title = cat; bib.title = cat

15 dc.title: A name given to the resource. bib.title (fictitious): A word, phrase, character, or group of characters, normally appearing in an item, that names the item or the work contained in it.

16 Zthes Indexes: zthes.nt=sauropod and zthes.bt=macronaria finds terms narrower than sauropod but broader than macronaria.

17 Relations

18 Search Clause: The triple (index, relation, search term) is called a search clause (e.g. title = cat)

19 Relations

20 Simple Relations: title = "the complete dinosaur"; title all "complete dinosaur"; title any "dinosaur bird reptile"; title exact "the complete dinosaur"

21 The = relation: title = "the complete dinosaur" (find these three words, adjacent and in this order)

22 title = "the complete dinosaur" matches “a day in the life of the complete dinosaur” and “the complete dinosaur goes to Paris”

23 =: title = "the complete dinosaur" matches “a day in the life of the complete dinosaur” and “the complete dinosaur goes to Paris” but not “the complete and unabridged dinosaur”

24 All: title all "complete dinosaur" matches “the complete and unabridged dinosaur”; does not match “the unabridged dinosaur”

25 title all "dinosaur bird reptile" does not match “the complete dinosaur”

26 Any: title any "dinosaur bird reptile" does match “the complete dinosaur” and “the unabridged dinosaur”

27 Exact: title exact "the complete dinosaur" matches “the complete dinosaur”

28 Exact: title exact "the complete dinosaur" matches “the complete dinosaur”; does not match “a day in the life of the complete dinosaur” or “the complete dinosaur goes to Paris” or “the complete and unabridged dinosaur”
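
The four relations on slides 20 to 28 can be mimicked with a rough Python sketch. It assumes case-insensitive matching on whitespace-separated words, which the slides do not specify; the function names are hypothetical, and a real server would apply its own tokenization rules.

```python
def words(text: str) -> list[str]:
    """Lowercase and split on whitespace (an assumed definition of 'word')."""
    return text.lower().split()

def matches_adjacent(term: str, field: str) -> bool:
    """'=' relation: all words of the term, adjacent and in order."""
    t, f = words(term), words(field)
    return any(f[i:i + len(t)] == t for i in range(len(f) - len(t) + 1))

def matches_all(term: str, field: str) -> bool:
    """'all' relation: every word of the term occurs somewhere in the field."""
    return set(words(term)) <= set(words(field))

def matches_any(term: str, field: str) -> bool:
    """'any' relation: at least one word of the term occurs in the field."""
    return bool(set(words(term)) & set(words(field)))

def matches_exact(term: str, field: str) -> bool:
    """'exact' relation: the field consists of the term and nothing else."""
    return words(term) == words(field)

title = "the complete and unabridged dinosaur"
print(matches_adjacent("the complete dinosaur", title))  # False
print(matches_all("complete dinosaur", title))           # True
```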

29 Relations …. observations

30 Observation 1: Shorthand

31 title all "old man sea" same as title="old" and title="man" and title="sea"
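
As a small illustration of this shorthand, the sketch below rewrites an all clause into the equivalent chain of = clauses joined by and. The helper name expand_shorthand is hypothetical. The any branch (joined by or) is added by analogy and is not shown on the slide.

```python
def expand_shorthand(index: str, relation: str, term: str) -> str:
    """Rewrite 'index all "a b c"' as 'index="a" and index="b" and index="c"'."""
    # 'all' -> 'and' is what the slide shows; 'any' -> 'or' is an assumption by analogy.
    boolean = {"all": "and", "any": "or"}[relation]
    clauses = [f'{index}="{word}"' for word in term.split()]
    return f" {boolean} ".join(clauses)

print(expand_shorthand("title", "all", "old man sea"))
# title="old" and title="man" and title="sea"
```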

32 Relations …. observations Observation 2: Anchoring ^ The anchor character

33 Recall: title = "the complete dinosaur" matches “a day in the life of the complete dinosaur”

34 Anchoring: title="^the complete dinosaur" would not match “a day in the life of the complete dinosaur”

35 Anchoring: title="^the complete dinosaur" would not match “a day in the life of the complete dinosaur”; title="the complete dinosaur^" would not match “the complete dinosaur goes to Paris”
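
A minimal sketch of how the anchor character could constrain the = relation, assuming case-insensitive, whitespace-separated words and an anchor only at the very start or very end of the term. The function name matches_anchored is hypothetical.

```python
def matches_anchored(term: str, field: str) -> bool:
    """'=' relation with optional '^' at the start and/or end of the term."""
    at_start = term.strip().startswith("^")
    at_end = term.strip().endswith("^")
    t = term.strip("^ ").lower().split()
    f = field.lower().split()
    for i in range(len(f) - len(t) + 1):
        if f[i:i + len(t)] != t:
            continue                      # words not adjacent here
        if at_start and i != 0:
            continue                      # must begin the field
        if at_end and i + len(t) != len(f):
            continue                      # must end the field
        return True
    return False

print(matches_anchored("^the complete dinosaur", "a day in the life of the complete dinosaur"))  # False
print(matches_anchored("the complete dinosaur^", "the complete dinosaur goes to Paris"))         # False
```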

36 Relations …. observations Observation 3: Index and Relation go together

37 Index and Relation go together: cat; title = cat

38 Index and Relation go together: cat; title = cat; title cat; = cat

39 Index and Relation go together. Valid: cat; title = cat. Not valid: title cat (index without a relation); = cat (relation without an index)

40 BNF: searchClause ::= '(' cqlQuery ')' | index relation searchTerm | searchTerm

41 Basic Relations …. summary: title = "the complete dinosaur"; title all "complete dinosaur"; title any "dinosaur bird reptile"; title exact "the complete dinosaur"

42 A few more relations …: < (less); > (greater); <= (less or equal); >= (greater or equal); = (see next); <> (not equal)

43 = relation: = means word adjacency when the term is a list of words; equality otherwise.

44 Relation Modifiers: stem, relevant, fuzzy, phonetic

45 Stemming: title =/stem "these completed dinosaurs" matches “The Complete Dinosaur”.

46 Relevance: subject any/relevant "fish frog" would find records whose subject field included words like shark, tuna, coelacanth, toad, amphibian, etc.

47 Relation Modifiers: stem, relevant, fuzzy, phonetic

48 Fuzzy: fuzzy means “be liberal in what you count as a match … details left to the server. Might include permutations of character order, off-by-one for numerical terms.” title =/fuzzy "sharlot simmins" might match “I am Charlotte Simmons”; telephoneNumber exact/fuzzy "303 441 1319"

49 Relation Modifiers: stem, relevant, fuzzy, phonetic

50 Phonetic: match words that sound the same, e.g. hostel might match “hostile”

51 Booleans: and, or, not

52 Booleans: and, or, not, proximity

53 And: cat and dog. Or: cat or dog. Not: cat not dog. Proximity: cat prox dog

54 And: cat and dog. Or: cat or dog. Not: cat not dog. Proximity: cat prox dog (roughly: “find cat near dog”)

55 Proximity: (chestnut prox “Cryphonectria parasitica”) prox (“dutch elm” prox Ceratocystisulmi)

56 Proximity parameters: relation, distance, unit, ordering

57 Proximity parameters: relation, distance, unit, ordering. E.g. “Find cat in the same sentence as dog”: Relation: less or equal; Distance: 0; Unit: sentence; Ordering: unordered

58 relation ("<", ">", "<=", ">=", "=", "<>"; default "<="), distance (integer; default: 1 for word, zero otherwise), unit ("word", "sentence", "paragraph", or "element"; default "word"), ordering ("ordered" or "unordered"; default "unordered")

59 “Find cat in the same sentence as dog” cat prox//sentence dog

60 “Find cat in the same sentence as dog” cat prox//sentence dog same as: cat prox/<=/0/sentence/unordered dog
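
The defaulting rules can be sketched as a small function that expands an abbreviated prox operator into its full form. The function name expand_prox is hypothetical. As an assumption made for this sketch, each non-empty slot is recognized by its value rather than its position, so that the abbreviated forms used in these slides (prox//sentence and prox/>/2//ordered) both expand as the slides describe.

```python
RELATIONS = {"<", ">", "<=", ">=", "=", "<>"}
UNITS = {"word", "sentence", "paragraph", "element"}
ORDERINGS = {"ordered", "unordered"}

def expand_prox(operator: str) -> str:
    """Fill in the default relation, distance, unit and ordering of a prox operator."""
    name, *slots = operator.split("/")
    if name != "prox":
        raise ValueError("not a prox operator")
    relation = distance = unit = ordering = None
    for slot in slots:
        if slot == "":
            continue                      # omitted parameter, default applies
        if slot in RELATIONS:
            relation = slot
        elif slot.isdigit():
            distance = slot
        elif slot in UNITS:
            unit = slot
        elif slot in ORDERINGS:
            ordering = slot
        else:
            raise ValueError(f"unrecognized prox parameter: {slot!r}")
    unit = unit or "word"
    relation = relation or "<="
    if distance is None:
        distance = "1" if unit == "word" else "0"   # 1 for word, zero otherwise
    ordering = ordering or "unordered"
    return "/".join(["prox", relation, distance, unit, ordering])

print(expand_prox("prox//sentence"))  # prox/<=/0/sentence/unordered
```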

61 (chestnut prox//sentence “Cryphonectria parasitica”) prox//paragraph (“dutch elm” prox//sentence Ceratocystisulmi)

62 (chestnut prox//sentence “Cryphonectria parasitica”) prox//paragraph (“dutch elm” prox//sentence Ceratocystisulmi) (find chestnut in the same sentence as “Cryphonectria parasitica”, and “dutch elm” in the same sentence as Ceratocystisulmi, and both sentences in the same paragraph.)

63 (chestnut prox//paragraph “Cryphonectria parasitica”) and (“dutch elm” prox//paragraph Ceratocystisulmi)

64 (chestnut prox//paragraph “Cryphonectria parasitica”) and (“dutch elm” prox//paragraph Ceratocystisulmi) (find chestnut in the same paragraph as “Cryphonectria parasitica”, and “dutch elm” in the same paragraph as Ceratocystisulmi.)

65 cat prox/>/2//ordered hat retrieves “cat in the hat” but not “cat in hat” nor “hat on the cat”
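
A word-level proximity test along these lines might look like the sketch below. It assumes the distance between two words is the difference of their positions in the text, which is consistent with the three examples on this slide but is not spelled out there. The function name prox_words is hypothetical, and only the "word" unit is handled.

```python
import operator

COMPARE = {
    "<": operator.lt, ">": operator.gt, "<=": operator.le,
    ">=": operator.ge, "=": operator.eq, "<>": operator.ne,
}

def prox_words(left: str, right: str, text: str,
               relation: str = "<=", distance: int = 1,
               ordered: bool = False) -> bool:
    """True if some occurrence of left and right satisfies the proximity test."""
    tokens = text.lower().split()
    lefts = [i for i, w in enumerate(tokens) if w == left]
    rights = [j for j, w in enumerate(tokens) if w == right]
    for i in lefts:
        for j in rights:
            if ordered and j < i:
                continue                  # right term must follow left term
            if COMPARE[relation](abs(j - i), distance):
                return True
    return False

# cat prox/>/2//ordered hat
for title in ("cat in the hat", "cat in hat", "hat on the cat"):
    print(title, prox_words("cat", "hat", title, ">", 2, ordered=True))
```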

66 Pattern Matching: ? matches any single character; * matches any sequence of zero or more characters; ^ word-anchoring

67 Pattern Matching: ? matches any single character: c?t matches cat, cot, cut, but not coat or ct; c??t matches cart, but not cat or crypt. * matches any sequence of zero or more characters: c*t matches cat, coat, crypt and counterargument. ^ word-anchoring
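
One way to experiment with the ? and * masking characters is to translate a masked term into a regular expression. The sketch below uses Python's standard re module; the helper names are hypothetical, and backslash escapes and the ^ anchor are deliberately not handled here.

```python
import re

def mask_to_regex(mask: str):
    """Translate CQL-style '?' and '*' masking into a compiled regular expression."""
    parts = []
    for ch in mask:
        if ch == "?":
            parts.append(".")             # exactly one character
        elif ch == "*":
            parts.append(".*")            # zero or more characters
        else:
            parts.append(re.escape(ch))   # everything else is literal
    return re.compile("".join(parts))

def mask_matches(mask: str, word: str) -> bool:
    """True if the whole word matches the masked term."""
    return mask_to_regex(mask).fullmatch(word) is not None

print([w for w in ("cat", "cot", "cut", "coat", "ct") if mask_matches("c?t", w)])
# ['cat', 'cot', 'cut']
```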

68 Word Anchoring: title="^the complete dinosaur" matches “the complete dinosaur meets godzilla” but not “a day in the life of the complete dinosaur”; title="the complete dinosaur^" matches “a day in the life of the complete dinosaur” but not “the complete dinosaur meets godzilla”

69 Word Anchoring - any: title any "^cat ^dog rat" means title with cat at the beginning, or with dog at the beginning, or with rat anywhere.

70 Word Anchoring - any: title any "cat ^dog rat" means title with cat anywhere, or with rat anywhere, or with dog at the beginning. Matches ‘cat eats dog’, ‘dog eats hat’, ‘hat eats rat’ but not ‘hat eats dog’
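
Anchoring inside an any term applies word by word. The following sketch assumes case-insensitive, whitespace-separated words and an anchor written directly against the word it applies to; the function name matches_any_anchored is hypothetical. It is shown with a term in which cat and rat may appear anywhere but dog must come first.

```python
def matches_any_anchored(term: str, field: str) -> bool:
    """'any' relation where each word may carry its own '^' anchor."""
    f = field.lower().split()
    if not f:
        return False
    for word in term.lower().split():
        if word.startswith("^"):
            if f[0] == word[1:]:
                return True               # anchored word begins the field
        elif word.endswith("^"):
            if f[-1] == word[:-1]:
                return True               # anchored word ends the field
        elif word in f:
            return True                   # unanchored word anywhere
    return False

for title in ("cat eats dog", "dog eats hat", "hat eats rat", "hat eats dog"):
    print(title, matches_any_anchored("cat ^dog rat", title))
```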

71 CQL Syntax. Reserved words: and, or, not, prox. Special characters: space ( ) = " /

72 Tokens A string that has no special characters; or Any string at all enclosed by double quotes. (Except the string cannot include a double quote, unless escaped.)

73 Escape Character: backslash (\) escapes '*', '?', '"' and '^', as well as itself. "\"why not\?\" she said" results in the following token: "why not?" she said
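
The escaping rule can be illustrated with a short unescaping routine, sketched here under the assumption that the input is a single double-quoted token. The function name unescape_token is hypothetical.

```python
def unescape_token(quoted: str) -> str:
    """Strip the surrounding double quotes and resolve backslash escapes."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("expected a double-quoted token")
    body = quoted[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])       # keep the escaped character literally
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

print(unescape_token(r'"\"why not\?\" she said"'))
# "why not?" she said
```

A real parser would also need to remember which characters had been escaped, so that an escaped ? is not later treated as a masking character.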

74 Context sets

75 Indexes, relations, relation modifiers, boolean modifiers

76 subject any/relevant "fish frog"

77 index, relation, relation modifier, search term

78 subject any/relevant "fish frog": index, relation, relation modifier, search term. Subject to context qualification

79 dc.subject any/relevant "fish frog" Context set

80 dc.subject any/relevant "fish frog"

81 dc.subject any/rel.lr "fish frog"

82 A specific Relevance algorithm Context set

83 dc.subject cql.any/rel.lr "fish frog" Context set
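
To make the role of the context-set prefix concrete, the sketch below splits a single search clause into its index, relation, relation modifiers and term, separating each name from its context set. It assumes the clause has exactly the shape index relation[/modifier...] term with single spaces; the function names are hypothetical.

```python
def split_qualified(name: str) -> tuple:
    """Split 'dc.subject' into ('dc', 'subject'); unqualified names get None."""
    prefix, dot, base = name.partition(".")
    return (prefix, base) if dot else (None, name)

def parse_search_clause(clause: str) -> dict:
    """Break 'index relation/modifier term' into context-qualified parts."""
    index, relation_part, term = clause.split(" ", 2)
    relation, *modifiers = relation_part.split("/")
    return {
        "index": split_qualified(index),
        "relation": split_qualified(relation),
        "modifiers": [split_qualified(m) for m in modifiers],
        "term": term.strip('"'),
    }

print(parse_search_clause('dc.subject cql.any/rel.lr "fish frog"'))
```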

84 Example, fictitious relation “only”: depicts only "cat". Matching images would depict only a cat and nothing else. The same cat with a person would not match. (depicts is the index; only is the relation.)

85 image.depicts image.only "cat": “image” is the context for the index and the context for the relation

86 Go back to: subject any/relevant "fish frog"

87 subject any/relevant "fish frog" or title any/relevant "cat dog"

88 subject any/relevant "fish frog" or/rel.mean title any/relevant "cat dog"

89 subject any/relevant "fish frog" or/rel.mean title any/relevant "cat dog" (mean: boolean modifier; rel: context set)

90 Defaults: Consider the query: cat. The server needs to turn that into a search clause, i.e. an index, relation, and search term. As it is, there’s only a search term.

91 cat cql.scr (default context set and relation) scr: “server choice relation” cql.serverChoice (default index)

92 Next, consider the query: title = cat

93 Next, consider the query: title = cat The server needs to assign a context set to the index (title) and a context set to the relation (=)

94 Next, consider the query: title = cat The server needs to assign a context set to the index (title) and a context set to the relation (=) Or to make it even more complicated….

95 Add a relation modifier: title =/relevant cat. The server needs to assign a context set to the index (title) and a context set to the relation (=), and a context set to the relation modifier.

96 Default Context Sets: <>.title cql.=/cql.relevant cat. Default index selected by server. Default context set for relation is ‘cql’. Default context set for relation modifier is ‘cql’.
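
The defaulting described on slides 90 to 96 can be sketched as a function that fills in what the query leaves out. It handles a single search clause only, writes the relation modifier after the relation as in the earlier examples (subject any/relevant "fish frog"), and leaves the index's context set untouched because the slides say the server chooses it. The function names are hypothetical.

```python
def qualify(name: str, default_set: str = "cql") -> str:
    """Prefix a relation or modifier with the default context set if it has none."""
    return name if "." in name else f"{default_set}.{name}"

def fill_defaults(query: str) -> str:
    """Make the defaults of a single search clause explicit."""
    parts = query.split(" ", 2)
    if len(parts) < 3:
        # Only a search term: default index and default relation apply.
        return f"cql.serverChoice cql.scr {query}"
    index, relation_part, term = parts
    relation, *modifiers = relation_part.split("/")
    qualified = "/".join([qualify(relation)] + [qualify(m) for m in modifiers])
    return f"{index} {qualified} {term}"

print(fill_defaults("cat"))                   # cql.serverChoice cql.scr cat
print(fill_defaults("title =/relevant cat"))  # title cql.=/cql.relevant cat
```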

97 Additional relation modifiers: word: the term should be broken into words (according to the server's definition of a 'word'); string: the term is a single item, and should not be broken up; isoDate: each item within the term conforms to ISO 8601; number: each item within the term is a number; uri: each item within the term is a URI; masked: (default modifier)

98 Title any “cat dog” same as Title any/word “cat dog”

99 Title any “cat dog” same as Title any/word “cat dog” Title exact “cat in the hat” same as title exact/string “cat in the hat”

100 Title any “cat dog” same as Title any/word “cat dog” Title exact “cat in the hat” same as title exact/string “cat in the hat” Title = “cat * hat” same as Title =/masked “cat * hat”

