Published by Benjamin Walton. Modified over 7 years ago.
1
Querying for Metadata
13th November 2013
Andy Hind, Alfresco
2
What’s in the box?
3
Where is the box?
Context matters
Stuff may be added to your query
4
The bottom of the box …
Java Search API
Search sub-system
Query languages: AFTS / CMIS QL
Abstract form
Index Engine: SOLR/Lucene/Nothing
Lucene/Xpath/…
IN: Query + Context
OUT: List of nodes you can see
Abstract form can be used for other query engines
Concepts: logical operations, exact match, term match, order, type
4.1 Alfresco/SOLR
5
The new tool in the box …
Java Search API
Search sub-system
Query languages: AFTS / CMIS QL
Abstract form
DB Engine: DB
Index Engine: SOLR/Lucene/Nothing
Lucene/Xpath/…
4.2
6
A choice …
Java Search API
Search sub-system
Query languages: AFTS / CMIS QL
Abstract form
1: DB Engine, DB (DB Specific QL)
2: Index Engine, SOLR/Lucene/Nothing (Index Specific QL)
Lucene/Xpath/…
Switching
7
Consistency: DB, SOLR
DB: transactional, immediate consistency
Lucene one node
Lucene in a cluster
Special SOLR index
DB engine
Could replace canned queries
Existing schema
8
CMIS QL
Clauses: SELECT, FROM, JOIN, WHERE, ORDER BY
Key use case
Easy to define what is supported
9
CMIS QL: SOLR vs DB
SOLR: Full text, OR, decimal, boolean, IN_TREE()
DB
Main restrictions: index size, performance, pain/difficulty
10
Virtual tables
cmis:document …
cmis:folder
cmis:secondary
Not cmis:item, cmis:policy, cmis:relationship
11
Columns/data types
Supported: String <= 1024, integer, id, datetime
Unsupported: boolean, decimal, uri, html, String > 1024
DB: Boolean, long, float, double, string, BLOB
Things not in the node properties table: UUID/ID, mimetype, content length
Properties in general
Properties on cmis:document and cmis:folder
Mimetype
Size
String length: data dependent
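The supported and unsupported lists on this slide can be read as a simple rule. The sketch below only restates that rule in Java; `DbQueryTypes`, `DataType` and `supportedByDb` are hypothetical names invented for illustration and are not part of the Alfresco API.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical helper (not an Alfresco class): restates the slide's supported/unsupported lists. */
final class DbQueryTypes {

    /** The data types named on the slide. */
    enum DataType { STRING, INTEGER, ID, DATETIME, BOOLEAN, DECIMAL, URI, HTML }

    /** Longest string the slide lists as supported by the DB engine. */
    static final int MAX_DB_STRING_LENGTH = 1024;

    /** Types the slide lists under "Supported". */
    private static final Set<DataType> SUPPORTED =
            EnumSet.of(DataType.STRING, DataType.INTEGER, DataType.ID, DataType.DATETIME);

    /**
     * Returns true if a value of the given type could be queried on the DB
     * according to the slide. valueLength is only consulted for STRING.
     */
    static boolean supportedByDb(DataType type, int valueLength) {
        if (!SUPPORTED.contains(type)) {
            return false;
        }
        if (type == DataType.STRING) {
            return valueLength <= MAX_DB_STRING_LENGTH;
        }
        return true;
    }
}
```

Because string support depends on length, the answer for a string property is data dependent, which is the point the slide notes make.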
12
Logical Operators
Good: AND
Bad: NOT, ANY, NOT ANY
Ugly: OR
AND: generally selective
NOT: unselective
ANY: implies multi-valued, could match multiple rows
NOT ANY: will most likely match multiple rows
OR: excluded, optimisation is difficult (consider all rows), SOLR good
Semi-join (reduces the row count but cannot reuse the join)
Select and order
13
Predicates
Comparison: = <> < <= > >=
.. ANY
All that apply to the type as in the spec
14
Predicates
ANY … (NOT) IN
IN_FOLDER()
IS (NOT) NULL
LIKE
(NOT) IN
IN_TREE()
SCORE()
CONTAINS()
Clarity
Caution: NOT, IS NULL, LIKE (leading wildcards)
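As a rough illustration of the restrictions discussed so far, the sketch below scans a CMIS QL statement for the constructs the deck attributes to SOLR only: OR, IN_TREE(), and full text, which in CMIS QL appears as CONTAINS() and SCORE(). It is a naive keyword check over the query text, not Alfresco's parser, and `DbQueryCheck` and `solrOnlyConstructs` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an Alfresco class): naive text scan, not a real CMIS QL parser. */
final class DbQueryCheck {

    /** Single-quoted literal, allowing backslash escapes such as \' inside it. */
    private static final Pattern STRING_LITERAL = Pattern.compile("'(?:[^'\\\\]|\\\\.)*'");

    /** The OR keyword as a whole word, so ORDER BY does not match. */
    private static final Pattern OR_KEYWORD =
            Pattern.compile("\\bOR\\b", Pattern.CASE_INSENSITIVE);

    /** Function calls the deck associates with SOLR (tree and full-text queries). */
    private static final Pattern SOLR_FUNCTION =
            Pattern.compile("\\b(IN_TREE|CONTAINS|SCORE)\\s*\\(", Pattern.CASE_INSENSITIVE);

    /** Returns the SOLR-only constructs found, in order of first appearance after OR. */
    static List<String> solrOnlyConstructs(String cmisQuery) {
        // Blank out string literals so words inside quoted values are ignored.
        String stripped = STRING_LITERAL.matcher(cmisQuery).replaceAll("''");
        List<String> found = new ArrayList<>();
        if (OR_KEYWORD.matcher(stripped).find()) {
            found.add("OR");
        }
        Matcher m = SOLR_FUNCTION.matcher(stripped);
        while (m.find()) {
            String name = m.group(1).toUpperCase(Locale.ROOT) + "()";
            if (!found.contains(name)) {
                found.add(name);
            }
        }
        return found;
    }
}
```

An empty result only means none of these keywords appeared; unsupported data types and other restrictions are not covered by this check.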
15
Ordering
DB variation
Beware large result sets
Don’t order IDs on the DB at the moment
16
An Example …
select *
from cmis:document
where cmis:name like '%e%'
  and cmis:createdBy in ('System', 'admin')
  and cmis:creationDate < TIMESTAMP ' T00:00:00.000Z'
  and cmis:lastModifiedBy not in ('me')
  and cmis:lastModificationDate > TIMESTAMP ' T00:00:00.000Z'
  and cmis:contentStreamLength > 2
  and cmis:contentStreamFileName LIKE '_%'
order by cmis:contentStreamLength DESC, cmis:creationDate ASC, cmis:name DESC
Virtual Folders
Be selective: specific type, PARENT, JOINs TO ASPECT, =
17
CMIS QL: SOLR vs DB
Now we understand this a bit better
SOLR: Full text, OR, decimal, boolean, IN_TREE()
DB
18
The two are not the same …
SOLR
DB
If both can run the same query, the answer may not be the same
Default DB order, index order, score order
Not a big effect, but short matches are better than long ones
19
Permissions
In Query
HIDDEN
After Query
20
Localisation
Localised order
Case sensitivity
d:mltext
DB Collation
Case sensitivity: collation
d:mltext: ignore locale
21
Why I get up in the morning …
Impatient, occasional, technical, new to ECM, too busy for the training
Fire: Google Docs broken out into Alfresco
Semantic search in the future
Generic query: Content: scoring AND Created:2012
22
Alfresco FTS + DB?
=name
TYPE
ASPECT
PARENT
AND
PATH
OR
Implicit OR
Go through each, then:
UI queries will not go to the DB
Context: adds a PATH constraint
Beware the implicit OR, even if you put + in front of everything
AFTS: = for exact match
IN: Term, phrase, prefix
23
Is it for you?
SOLR …: OR, Eventual, FTS
DB ….: Restricted, Now
24
system.metadata-query-indexes.ignored
Upgrade:
  ignored = true: No MDQ
  ignored = false: Optional patch, DB +25%, MDQ
New Install: MDQ
Repeat to emphasise
25
Upgrade 4.0.2 -> 4.2
10M InnoDB
1 Hour, patch is 10 minutes
Extra indexes + 25% (21G – 25 )
BM: Performance impact of extra indexes minimal, may be a few %
26
Configuration
solr.query.fts.queryConsistency
solr.query.cmis.queryConsistency
EVENTUAL
TRANSACTIONAL
TRANSACTIONAL_IF_POSSIBLE
Java API
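One way to picture the three consistency values is as a switch between the DB engine and the index engine. The sketch below is an interpretation based on the value names, assuming EVENTUAL always uses the index, TRANSACTIONAL insists on the DB, and TRANSACTIONAL_IF_POSSIBLE falls back to the index when the DB cannot run the query. `QueryRouter`, `Engine` and `choose` are hypothetical names, and the enum here is a local stand-in rather than the Alfresco type.

```java
/** Hypothetical sketch (not Alfresco code): how a consistency setting might pick an engine. */
final class QueryRouter {

    /** Local stand-in for the three values listed on the slide. */
    enum QueryConsistency { EVENTUAL, TRANSACTIONAL, TRANSACTIONAL_IF_POSSIBLE }

    /** The two engines from the architecture slides. */
    enum Engine { DB, INDEX }

    /**
     * Chooses an engine for a query.
     *
     * @param consistency   the configured consistency value
     * @param dbCanRunQuery whether the query stays within the DB engine's restrictions
     */
    static Engine choose(QueryConsistency consistency, boolean dbCanRunQuery) {
        switch (consistency) {
            case EVENTUAL:
                // Assumption: eventual consistency always goes to the index.
                return Engine.INDEX;
            case TRANSACTIONAL:
                // Assumption: no fallback, so an unsupported query is an error.
                if (!dbCanRunQuery) {
                    throw new IllegalStateException("Query cannot be run transactionally on the DB");
                }
                return Engine.DB;
            case TRANSACTIONAL_IF_POSSIBLE:
                // Assumption: prefer the DB, fall back to the index.
                return dbCanRunQuery ? Engine.DB : Engine.INDEX;
            default:
                throw new IllegalArgumentException("Unknown consistency: " + consistency);
        }
    }
}
```

Under this reading, TRANSACTIONAL_IF_POSSIBLE never fails a query, at the cost of not knowing in advance which engine, and therefore which consistency, a given query gets.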
28
Is this a box of worms?
Performance
Large result sets (~100k)
Left outer join ORDER BY
DB ≈ SOLR
Transfer
Permission
Ordering text
disk – RAM disk
Subtle differences
29
Do Share queries use the DB?
NO
Context +PATH
Implicit OR
Node browser
Java API
Unadulterated search
Chemistry/OpenCMIS Workbench
30
The mystery box …
UPPER() LOWER()
name:woof
FTS Syntax
db-cmis
Admin console
32
Summary
SOLR: Full text, OR, Float/double, Boolean, String > 1024
Structure
DB
33
Future
IN_TREE()
SOLR 4+
Permission evaluation
Performance
More DBs
Simple OR
Date math
Schema
Hybrid
I am not making any promises
Structure
SOLR