IBM Research

Full-text search is an inadequate paradigm for navigating wikis and other socially tagged content

- Too many results, which leads to unwanted foraging: search, review results, search, review results, search…
- A single-document-results orientation. In many cases users are looking for a collection of instance data to process later.
- Searches are initiated and refined by text strings (source: the user's brain, not an available set of classifications).
- Very little context for the data: users have to navigate to the results screen to determine whether they have found what they need, and the results aren't sorted in any meaningful way.
- System feedback is limited to the number of hits and the hits themselves.
- Task completion is much lower than with faceted logic: less than 50% versus over 90%.
- User satisfaction is also much lower.
- Large delta between experts and novices.
- Ranking/sorting is not tied to users' immediate needs.
Faceted logic is a data access paradigm for ontologies

- Ontologies require data access paradigms; without such paradigms, ontologies tend to fail.
- Fully articulated ontologies tend to generate nulls.
- Paring down results sacrifices semantics.
- Parametric queries create many more metadata management issues.
- How to arrange hierarchies for different perspectives and user communities?
Faceted logic provides an efficient way to classify, access, update, and act on a wide variety of instance data: documents, collections, data records, and even facets themselves.

- Facets are abstractions …
  - Sectors and industries
  - Geography
  - Topics
  - Vendor type
  - Service type
  - Language
  - Role
  - Type of information
- … until they are made concrete via a root and optional descendant nodes. Example:
  - Root: Global
  - Child: Europe
  - Descendants: France, UK, Italy
- Unlike traditional taxonomic classification, businesses or stakeholder groups may choose to implement only the facet elements that are relevant to their business models, stakeholder interests, or work situations.
- Facets give users the ability to look at the world from multiple angles: by the attributes of any given set of instances.
- Once instantiated, facets are hierarchical or flat collections of attributes: one root node; an arbitrary number of descendant nodes for any parent; ragged structure is acceptable; a descendant node typically has one and only one parent, though it may have multiple parents (DAGs).
- Facets work especially well with containment queries such as full-text search.
- The key to faceted logic is the one-at-a-time principle: one selection at a time eliminates nulls and allows meaningful, semantically rich feedback to be presented to end users.
- Taxonomy management, collaborative technologies, workflow, and business intelligence can be built on top of faceted metadata.
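The facet structure and the one-at-a-time principle described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the system the slides describe. The names `FACETS`, `ITEMS`, `matches`, and `refine`, the sample items, and the "All languages" root are hypothetical. The sketch also assumes each node has a single parent, whereas the slide notes that multiple parents (DAGs) are possible.

```python
from collections import Counter

# Hypothetical facets, each stored as child -> parent (the root maps to None).
# Geography mirrors the slide's example: Global > Europe > France, UK, Italy.
FACETS = {
    "Geography": {
        "Global": None,
        "Europe": "Global",
        "France": "Europe",
        "UK": "Europe",
        "Italy": "Europe",
    },
    # A flat facet: one root with leaf children only (root name is assumed).
    "Language": {
        "All languages": None,
        "French": "All languages",
        "English": "All languages",
    },
}

# Hypothetical instance data: each item carries one node per facet.
ITEMS = [
    {"id": 1, "Geography": "France", "Language": "French"},
    {"id": 2, "Geography": "UK", "Language": "English"},
    {"id": 3, "Geography": "UK", "Language": "French"},
    {"id": 4, "Geography": "Global", "Language": "English"},
]


def ancestors_and_self(facet, node):
    """Yield the node itself, then each ancestor up to the root."""
    while node is not None:
        yield node
        node = facet[node]


def matches(item, selections):
    """True if, for every selected facet, the item's tag is the selected
    node or one of its descendants."""
    return all(
        selected in ancestors_and_self(FACETS[name], item[name])
        for name, selected in selections.items()
    )


def refine(items, selections):
    """Return the matching items and, per facet, hit counts for every node
    that still has at least one hit. Nodes with no hits are never offered,
    so the next single selection cannot produce an empty (null) result."""
    hits = [item for item in items if matches(item, selections)]
    feedback = {}
    for name, facet in FACETS.items():
        counts = Counter()
        for item in hits:
            for node in ancestors_and_self(facet, item[name]):
                counts[node] += 1
        feedback[name] = dict(counts)
    return hits, feedback


# One selection at a time: pick Europe first, then narrow by language.
hits, feedback = refine(ITEMS, {"Geography": "Europe"})
print([item["id"] for item in hits], feedback)

hits, feedback = refine(ITEMS, {"Geography": "Europe", "Language": "French"})
print([item["id"] for item in hits], feedback)
```

After the first selection, "Italy" does not appear in the Geography feedback because no sample item carries it, so a user is never offered a choice that leads to zero results. The counts returned alongside the hits stand in for the "meaningful, semantically rich feedback" the slide mentions.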
Some Considerations on Information Quality in a Wiki/Social Tagging Context

- Some classes of metadata are more difficult to gather, apply, and maintain relative to specific cases. Example:
  - "Works with" relations
  - Configuration elements, unless this can be automated
  - Local language and hierarchy
  - Polysemy
- How many results are enough, or too many? Does this depend on context?
- How, methodologically, can we quantify the quality and end-user performance impact of classification data?
- Chains of dependency: collection and validation of dependencies and limitations relative to other metadata, software levels, etc.
- What is the scale of the assets that practitioners search: tens of assets, hundreds, thousands, hundreds of thousands?