Advanced CQL and Profiling
Mike Taylor
1. Esoteric CQL features:
 – Word anchoring
 – Proximity
 – Relation modifiers
 – Boolean modifiers
2. Profiling
3. Prefix mapping
4. Defining relations
CQL features: esoterica
"You are not expected to understand this." – comment in the Unix Version 6 source code.
The point is that new users are not required to understand these features, and may happily use CQL for many years – perhaps forever – without needing to.
CQL esoterica: word anchoring
A word beginning with ^ must occur at the start of its field.
A word ending with ^ must occur at the end of its field.
dinosaur – matches "the complete dinosaur"
dinosaur^ – also matches
^dinosaur – does not match
the – matches "the complete dinosaur"
^the – also matches
the^ – does not match
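The anchoring rules above can be sketched in a few lines of Python. This is only an illustration, assuming that a field is split into words on whitespace and compared case-insensitively; the function name matches_anchored is hypothetical and not part of any CQL toolkit.

```python
def matches_anchored(term: str, field: str) -> bool:
    """Return True if a single, possibly anchored, word matches the field.

    Assumption: the field is tokenised on whitespace and compared
    case-insensitively. Real servers may tokenise differently.
    """
    words = field.lower().split()
    at_start = term.startswith("^")   # ^word: must be the first word
    at_end = term.endswith("^")       # word^: must be the last word
    word = term.strip("^").lower()
    if not words or not word:
        return False
    if at_start and words[0] != word:
        return False
    if at_end and words[-1] != word:
        return False
    return word in words


field = "the complete dinosaur"
for term in ["dinosaur", "dinosaur^", "^dinosaur", "the", "^the", "the^"]:
    print(f"{term:10} {matches_anchored(term, field)}")
```

The loop reproduces the six cases on the slide: the two unanchored words and the two correctly anchored words match, while ^dinosaur and the^ do not.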
CQL esoterica: proximity
The prox boolean, by default, requires its operands to be next to each other, in either order:
cervical prox vertebra
 – equivalent to "cervical vertebra" or "vertebra cervical"
(cervical or dorsal) prox vertebra
 – equivalent to "cervical vertebra" or "dorsal vertebra" or "vertebra cervical" or "vertebra dorsal"
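One way to read the default behaviour is as an expansion into adjacent-word phrases. The sketch below assumes each operand has already been reduced to a list of alternative single words; expand_default_prox is a hypothetical helper written for this illustration, not something a CQL server is required to do internally.

```python
from itertools import product


def expand_default_prox(left, right):
    """List the two-word phrases that default prox is equivalent to.

    `left` and `right` are lists of alternative words (the members of an
    OR group, or a single word). Both word orders are generated because
    default proximity is unordered.
    """
    forward = [f"{a} {b}" for a, b in product(left, right)]
    backward = [f"{b} {a}" for a, b in product(left, right)]
    return forward + backward


print(expand_default_prox(["cervical"], ["vertebra"]))
print(expand_default_prox(["cervical", "dorsal"], ["vertebra"]))
```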
CQL esoterica: proximity II
Modifiers can generalise the semantics of proximity:
cervical prox/distance<=5 vertebrae
 – within five words of each other
cervical prox/distance=0/unit=sentence vertebrae
 – within the same sentence
cervical prox/distance>0/unit=paragraph vertebrae
 – in different paragraphs
cervical prox/ordered vertebrae
 – in the specified order: exactly equivalent to "cervical vertebrae"
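The distance and ordered modifiers can be illustrated with a small word-position check. This sketch covers only the word unit and the <= comparison, and it assumes that adjacent words are at distance 1; sentence and paragraph units would need a segmenter and are left out. The function name prox_words and its parameters are hypothetical.

```python
def prox_words(words, left, right, max_distance=1, ordered=False):
    """True if `left` and `right` occur within `max_distance` words.

    Assumptions: `words` is the tokenised field, adjacent words are at
    distance 1, and only the 'distance<=N' form with unit=word is modelled.
    With ordered=True, `left` must come before `right`.
    """
    left_pos = [i for i, w in enumerate(words) if w == left]
    right_pos = [i for i, w in enumerate(words) if w == right]
    for i in left_pos:
        for j in right_pos:
            gap = j - i
            if ordered and gap <= 0:
                continue  # wrong order (or the same position)
            if 0 < abs(gap) <= max_distance:
                return True
    return False


text = "the cervical and dorsal vertebrae of sauropods".split()
print(prox_words(text, "cervical", "vertebrae"))                  # default: adjacent only
print(prox_words(text, "cervical", "vertebrae", max_distance=5))  # prox/distance<=5
print(prox_words(text, "vertebrae", "cervical", max_distance=5, ordered=True))
```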
CQL esoterica: relation modifiers
Modifiers can refine the semantics of relations:
title =/stem dig
 – finds dig, digging, dug, etc.
title any/relevant "dinosaur bird reptile"
 – finds sauropods, avian, crocodile, snake, etc.
author =/fuzzy tailor
 – finds Mike Taylor
phoneNumber exact/fuzzy " "
 – finds
CQL esoterica: relation modifiers II
Relation modifiers can be overloaded to specify extra information about the term that the relation joins to the index:
createdDate >/isoDate " :45:00"
 – the term is in ISO 8601 format.
location within/geom.polygon "(12,46) (15,52)"
 – the term indicates a polygon of two points (i.e. a straight line) rather than the corners of a rectangle.
CQL esoterica: boolean modifiers
Modifiers can refine the semantics of boolean operators. We've already seen some examples of this in proximity.
cervical prox/distance<=5 vertebrae
 – within five words of each other
cervical or/exclusive vertebrae
 – one or the other, but not both.
denenberg or/rel.mean "information retrieval"
denenberg or/rel.sum "information retrieval"
denenberg or/rel.max "information retrieval"
 – average, total or maximum relevance of operands
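The effect of these modifiers on result sets can be sketched as follows. The representation is an assumption made for illustration: each operand's result set is a dict from record identifier to relevance score, and a record found by only one operand is scored from that operand alone (the slides do not say how servers handle that case). The names or_with_relevance and or_exclusive are hypothetical.

```python
from statistics import mean

# Map each relevance modifier named on the slide to a combining function.
COMBINERS = {"rel.mean": mean, "rel.sum": sum, "rel.max": max}


def or_with_relevance(left, right, modifier):
    """OR two scored result sets, combining scores as the modifier asks.

    Assumption: a record matched by one operand only keeps that single
    score as input to the combiner.
    """
    combine = COMBINERS[modifier]
    merged = {}
    for record in left.keys() | right.keys():
        scores = [side[record] for side in (left, right) if record in side]
        merged[record] = combine(scores)
    return merged


def or_exclusive(left_ids, right_ids):
    """or/exclusive: records matched by one operand but not both."""
    return set(left_ids) ^ set(right_ids)


denenberg = {"r1": 0.5, "r2": 0.25}
info_retrieval = {"r2": 0.75, "r3": 0.5}
for modifier in ("rel.mean", "rel.sum", "rel.max"):
    print(modifier, sorted(or_with_relevance(denenberg, info_retrieval, modifier).items()))
print(sorted(or_exclusive(denenberg, info_retrieval)))
```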
Profiling CQL
For simple searching, it suffices to use common indexes. Semantic interoperability requires more precisely specified behaviour. This lesson was learned in the Z39.50 world, and resulted in the invention of profiles: specifications of the subset of the full standard that is needed to support an application. The classic example in Z39.50 is the Bath Profile for bibliographic searching. Similarly, we define a Bath Profile for CQL searching.
Profiles and context sets
A profile is not the same thing as a context set!
A context set is merely a bag of indexes (and relation modifiers and boolean modifiers) that may be used in any application.
A profile provides a palette of indexes drawn from several context sets.
The distinction is similar to that between XML namespaces and XML Schemas. Schemas depend on namespaces, and may use several. CQL profiles depend on context sets, and may use several.
Example: the Bath Profile
Bath searches may use any of the following indexes:
dc.creator                   bath.personalName
dc.title                     bath.corporateName
dc.subject                   bath.conferenceName
cql.anywhere                 bath.uniformTitle
dc.identifier                bath.issn
dc.date                      rec.id
bath.keyTitle                bath.geographicName
dc.format                    bath.notes
dc.language                  bath.topicalSubject
bath.possessingInstitution   bath.genreForm
bath.name
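The idea of a profile as a palette of indexes drawn from several context sets can be made concrete with a small check. The set below is copied from the index list on this slide; the helper names are hypothetical, and a real profile specifies more than a list of index names.

```python
# Index names as listed on the slide for the Bath Profile.
BATH_PROFILE_INDEXES = frozenset({
    "dc.creator", "dc.title", "dc.subject", "cql.anywhere", "dc.identifier",
    "dc.date", "bath.keyTitle", "dc.format", "dc.language",
    "bath.possessingInstitution", "bath.name", "bath.personalName",
    "bath.corporateName", "bath.conferenceName", "bath.uniformTitle",
    "bath.issn", "rec.id", "bath.geographicName", "bath.notes",
    "bath.topicalSubject", "bath.genreForm",
})


def context_sets_used(profile_indexes):
    """Return the context-set prefixes that a profile draws on."""
    return sorted({name.split(".")[0] for name in profile_indexes})


def indexes_outside_profile(index_names):
    """Return the index names in a query that the profile does not offer."""
    return sorted(set(index_names) - BATH_PROFILE_INDEXES)


print(context_sets_used(BATH_PROFILE_INDEXES))
print(indexes_outside_profile(["dc.title", "bath.issn", "dc.custardDepth"]))
```

The first call shows one profile using four context sets at once, which is the sense in which a profile is more than a context set.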
Existing and possible profiles
Explicit CQL profiles have been created for some applications:
 – Bath Profile for bibliographic data
 – Zthes profile for hierarchical thesaurus navigation
Profiles are in development (or unwritten) for others:
 – Google-like structureless searching
 – Simple metadata searching with the Dublin Core
 – CCG for collectable card games
 – Music: musicalKey, arranger, duration, etc.
 – GILS (Global Information Locator Service)
 – ... your application goes here!
CQL esoterica: prefix mapping
So far, we have been free and easy with index prefixes such as dc. But how do we know what they mean?
Why should dc mean Dublin Core rather than Deep Custard?
dc.custardDepth <= 20
Why should bath mean the Bath Profile for bibliographic searching instead of plumbing supplies?
bath.capacityInGallons > 45
CQL esoterica: prefix mapping II
Prefixes are just convenient, easy-to-type abbreviations. The real identifier of a context set is its URI. For example, the Dublin Core context set is
info:srw/cql-context-set/1/dc-v1.1
but we map that URI to a prefix for convenience.
This is exactly like XML namespaces: they are identified by URIs, but the URIs do not appear in the names of elements or attributes: short prefixes are used instead.
CQL esoterica: prefix mapping III
In XML, a prefix is associated with a namespace using:
xmlns:prefix="http://example.org/xyz/"
In CQL, a prefix is associated with a context set using:
>prefix=http://example.org/xyz/
and the rest of the query follows. The following queries are exactly equivalent:
>dc=info:srw/cql-context-set/1/dc-v1.1 dc.title=fish
>yx=info:srw/cql-context-set/1/dc-v1.1 yx.title=fish
Most applications will have established default mappings.
CQL esoterica: prefix mapping IV
It is possible to establish the context set from which indexes with no explicit prefix are taken by omitting the prefix= part from the mapping:
>http://example.org/heraldry/ title=baron and side=sinister
So the following queries are exactly equivalent:
>info:srw/cql-context-set/1/dc-v1.1 title=fish
>yx=info:srw/cql-context-set/1/dc-v1.1 yx.title=fish
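The claim that these queries are equivalent follows from resolving each index name to a pair of context-set URI and local name. The sketch below assumes the mappings are already available as a dict from prefix to URI, with the key None standing for the default (prefix-less) mapping; resolve_index is a hypothetical helper.

```python
DC_URI = "info:srw/cql-context-set/1/dc-v1.1"


def resolve_index(index, mappings):
    """Resolve 'prefix.name' or 'name' to (context-set URI, local name).

    Assumption: `mappings` maps prefix -> URI, and the key None holds the
    default context set used for indexes with no explicit prefix.
    """
    prefix, dot, local = index.partition(".")
    if not dot:                      # no prefix: use the default mapping
        prefix, local = None, index
    return mappings[prefix], local


# The three spellings from the slides all name the same index.
print(resolve_index("dc.title", {"dc": DC_URI}))
print(resolve_index("yx.title", {"yx": DC_URI}))
print(resolve_index("title", {None: DC_URI}))
```

Because only the resolved pair matters, the choice of prefix (dc, yx, or none at all) makes no difference to what the query means.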
CQL esoterica: prefix mapping V
Finally... Finally! :-)
Prefix mappings can be stacked up:
>dc = info:srw/cql-context-set/1/dc-v1.1
>bath=http://zing.z3950.org/cql/bath/2.0/
>rec=info:srw/cql-context-set/2/rec-1.0
rec.created < and dc.title=ecology and bath.conferenceName=dinosaur
(Yes, this is all one query.)
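Stacked mappings can be peeled off the front of a query one at a time. The sketch below is a simplified reading of the syntax shown on these slides, assuming unquoted URIs that contain no whitespace; it is not a full CQL parser, and split_prefix_mappings is a hypothetical name.

```python
import re

# '>' then an optional 'prefix =' then a URI with no whitespace in it.
MAPPING = re.compile(r"\s*>\s*(?:([A-Za-z_]\w*)\s*=\s*)?(\S+)")


def split_prefix_mappings(query):
    """Split leading '>prefix=uri' declarations from the rest of a query.

    Returns (mappings, rest). A mapping with no prefix is stored under
    the key None, i.e. the default context set.
    """
    mappings = {}
    pos = 0
    while True:
        match = MAPPING.match(query, pos)
        if match is None:
            break
        prefix, uri = match.groups()
        mappings[prefix] = uri
        pos = match.end()
    return mappings, query[pos:].strip()


query = (
    ">dc = info:srw/cql-context-set/1/dc-v1.1 "
    ">bath=http://zing.z3950.org/cql/bath/2.0/ "
    "dc.title=ecology and bath.conferenceName=dinosaur"
)
mappings, rest = split_prefix_mappings(query)
print(mappings)
print(rest)
```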
CQL esoterica: prefix mapping VI
Don't try this at home.
Defining relations
CQL has a feature where any word can act as a relation. For example, the query:
foo bar baz
is interpreted as index-name foo, relation bar, term baz – even though there is no relation bar.
This is a misfeature. It prevents the obvious interpretation of this query as a phrase search or AND search.
If your profile needs a new relation, consider defining it as a relation modifier on one of the existing relations instead.
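The interpretation described above can be shown with a toy reading of a three-word clause. This sketch assumes whitespace-separated tokens and is not a CQL parser; the set of known relations is limited to those that appear in these slides and is not an exhaustive list, and the function name is hypothetical.

```python
# Relations that appear in these slides; illustrative, not exhaustive.
RELATIONS_SEEN_IN_SLIDES = {"=", "<", ">", "<=", "any", "exact", "within"}


def read_three_word_clause(clause):
    """Read 'index relation term' from exactly three whitespace-separated words.

    Raises ValueError if the clause does not contain exactly three words.
    The 'known_relation' flag shows whether the middle word is one of the
    relations used elsewhere in these slides.
    """
    index, relation, term = clause.split()
    return {
        "index": index,
        "relation": relation,
        "term": term,
        "known_relation": relation in RELATIONS_SEEN_IN_SLIDES,
    }


print(read_three_word_clause("foo bar baz"))
print(read_three_word_clause("title any fish"))
```

The first call shows the problem: bar is accepted in the relation position even though nothing defines it, so the query cannot be read as a three-word phrase.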
Thanks for listening!