Requirements Engineering for Semantic CMS

Lecturer Organization Date of presentation Copyright IKS Consortium

Part I: Foundations
(1) Introduction of Content Management
(2) Foundations of Semantic Web Technologies

Part II: Semantic Content Management
(3) Knowledge Interaction and Presentation
(4) Knowledge Representation and Reasoning
(5) Semantic Lifting
(6) Storing and Accessing Semantic Data

Part III: Methodologies
(7) Requirements Engineering for Semantic CMS
(8) Designing Semantic CMS
(9) Semantifying your CMS
(10) Designing Interactive Ubiquitous IS

What is this Lecture about?
We have seen the existing technologies of the Semantic Web and how these technologies can be used for semantic content management.

What is missing? Methodologies for the development of semantic CMS. First, the requirements for a semantic CMS need to be specified.

Part III: Methodologies
(7) Requirements Engineering for Semantic CMS
(8) Designing Semantic CMS
(9) Semantifying your CMS
(10) Designing Interactive Ubiquitous IS

Outline

What is the course about?
Methodology
Understand industry needs and expectations
Analysis of traditional CMSs
Identify business scenarios
Identification of High Level Requirements (HLRs)
High Level Requirements
Use cases
Resulting requirements
Summary

What is the course about?

This course details the domain-independent process of eliciting requirements for the semantic enhancement of any content management system.

It presents the steps taken to elicit the requirements of a framework that enhances traditional content management systems with semantic capabilities. It first covers the steps carried out jointly by the system designers (the group delivering the work) and the target group (CMS vendors in this case) to determine the actual needs. After reviewing the agreements reached between the two groups, it focuses on the elicited high level requirements and explains their importance from a semantic perspective.

Methodology

Bilateral meetings with CMS vendors: workshops, interviews, brainstorming sessions
Gathered requirements
Categorization under major topics
High Level Requirements
Use cases
Validate the resulting use cases against the requirements of different CMS vendors

In the first phases of the requirement elicitation process, bilateral meetings are held with the target group. These meetings can be workshops, interviews, brainstorming sessions, etc. Apart from bilateral meetings, questionnaires and surveys can also be used in the requirement analysis phase to obtain and understand the needs of the target groups. Because the subject here is content management systems, and the goal is to provide services on top of existing systems, it would be unrealistic not to examine the existing systems. The better the CMS vendors' needs are understood, the more smoothly the later design and implementation phases of the project proceed.

As the needs and requirements of the target groups are collected, the material is categorized into topics. For example, the statement “Extract RDF statements from XML or HTML document” goes into the topic “Enrichment of Content”, and the statement “Possibility to create pages by queries” goes into the topic “Support for content creation”. The identified topics lead to high level requirements and, through the refinement process, to use cases. To ensure that the needs of different CMS providers are considered, the resulting use cases are cross-validated against the requirements of the CMS providers.
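The categorization step can be illustrated with a small sketch. The two statements and topic names are taken from the notes above; the keyword rules are purely illustrative assumptions, since in the process described here the sorting is done by the analysts, not by a program.

```python
# Topic names come from the lecture; the keyword lists are hypothetical
# stand-ins for the judgement an analyst applies when sorting statements.
TOPIC_KEYWORDS = {
    "Enrichment of Content": ("extract", "rdf", "annotat"),
    "Support for content creation": ("create", "creation", "authoring"),
}


def categorize(statement, rules=TOPIC_KEYWORDS):
    """Return every topic whose keywords occur in the vendor statement."""
    text = statement.lower()
    topics = [
        topic
        for topic, keywords in rules.items()
        if any(keyword in text for keyword in keywords)
    ]
    # Statements that match no rule are kept visible for manual review.
    return topics or ["Uncategorized"]


statements = [
    "Extract RDF statements from XML or HTML document",
    "Possibility to create pages by queries",
]
by_statement = {s: categorize(s) for s in statements}
```

Keeping an explicit "Uncategorized" bucket means no vendor statement is silently dropped before the topics are turned into high level requirements.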

Results

Requirements engineering process: refine HLRs into specific software requirements using scenario and use case descriptors
Actors model: all requirements are based on use cases which use a common actor's model for CMS
Integration of semantic services into existing CMSs: easy to use and technology-independent mechanisms, RESTful HTTP services
All features are expressed in terms of services: applicable to and accessible by “any” CMS
Mash-up to create new higher-order services

The elicited high level requirements are refined into use cases. All requirements are based on use cases that use a common actor's model for CMS, because this model is the basis for communication within the consortium, which contains producer and consumer groups, about the different use cases and the actors involved.

Easy adoption and technology independence are major concerns of the CMS industry, because CMS providers want to spend the minimum effort on integrating semantic services into their existing systems. Accessing semantic services through RESTful services over HTTP, independent of the underlying technology, is a widely accepted way to provide easy integration. The functionality provided by the system should therefore be accessible to adopters through RESTful services. These services should be applicable to any domain, e.g. tourism or health care.

The services can be reused to create new higher-order services, the system can be extended with new services for semantic features, and services can be replaced. This setup allows modular development of the IKS and leaves enough flexibility to experiment with different implementations of semantic services. Additionally, each service is required to define further extension points to allow fine-grained customization of all semantic features.
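As a rough illustration of why RESTful HTTP access keeps integration effort low, the sketch below builds (but does not send) a request that a CMS could issue to a semantic service using only the Python standard library. The base URL, the `/enhancer` path, and the choice of RDF/XML as response format are assumptions made for this example; the lecture does not specify concrete endpoints.

```python
import urllib.request


def build_enhancement_request(base_url, text):
    """Build an HTTP POST that submits plain text to a semantic service.

    The "/enhancer" path is hypothetical; a real service defines its own URIs.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/enhancer",
        data=text.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for a standardized, text-based result format.
            "Accept": "application/rdf+xml",
        },
    )


# Placeholder host; nothing is sent over the network in this sketch.
request = build_enhancement_request(
    "http://cms.example.org/semantic", "Paris is the capital of France."
)
```

Any CMS that can issue an HTTP request can use such a service, whatever language or platform the CMS itself is built on.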

Analysis of Traditional CMSs
GOAL: Identify the common parts that all CMSs have
INPUT: Product descriptions, expectations from industry, product web sites, the running CMS itself

Because the semantic functionality should be usable by any kind of content management system, the common parts of existing content management systems should be identified. Different kinds of input are considered, namely product descriptions, expectations from industry, and product web sites. Furthermore, existing CMSs can be analyzed by running them directly and investigating their features.

Analysis of Traditional CMSs
Analysis of:
Content Types
Content Workflow
Content Services
Architectural Styles

CMS – Content Types

Documents
Web sites / web applications
Multi-media files (audio, image, video)
Postings + Comments (blogs, forums)
Short messages (sms, twitter)
Topics (wiki)
Correspondence ( , newsletter)
Feeds (rss)
Individuals (social networking)

These are the content types that content management systems may manage. Not every system supports all of them, but a system that aims to be generic across existing systems should be able to support these content types.

CMS – Content Workflow

Main innovations take place in the phases Enrichment, Storage, and Publishing.

The figure in the slide shows the generic content workflow within a content management system. In terms of semantic enhancement, the important steps of this workflow are the last three. The following slides list the topics identified for these steps.

CMS – Content Workflow Semantic Enrichment
Automatic classification and routing
Faceted classification
Use of predefined taxonomies
Automatic semantic tagging
Automatic ranking
Semi-automatic annotations
Annotation with Microformats
Ontology extraction
Extraction of concepts, people, places, etc.
Relationship extraction (isA, partOf)
Cross-source correlations
Document models from ontologies
Knowledge representation

CMS – Content Workflow Persistence
Workflow states
Relations
Directories
Audit
User preferences
Converted content
Synchronization of Content Repository Semantics with Semantic Persistence Stores

CMS – Content Workflow Publishing
Semantic workflows
Collaborative content management
Semantic & social techniques
Personalization of UI
Knowledge view

Figure labels: Semantic Framework, Administration, Content, CMS Component

CMS – Content Workflow Search
External semantic search
Pluggable
Ambiguity resolution
Similarity searches
Semantic based and multilingual
Relationship recognition
Keyword search
Natural language queries
Ranked search results
Faceted search

CMS – Existing Content Services
Creation / Ingestion
Ingestion schedules
Transcoding
Indexing
Metadata extraction
Storing
Versioning
Audit / Archive
Workflow Management
Publishing
Notification
Search and Query
Rendering
User profiles
Security
Ad service

Traditional CMS Architecture
The figure and explanation are adapted from the study: Fabian Christ, Benjamin Nagel: A Reference Architecture for Semantic Content Management Systems.

At the top of the figure, the CMS User Interface presents the content and offers editorial features to create, modify, and manage content within its lifecycle. Access to the content itself is provided by the Content Access layer. The User Interface uses this layer to reach the content and the content management features of the CMS. The Content Access layer can also be used by third-party software that integrates the CMS into other applications.

The core management features are implemented in the Content Management layer, which provides the functionality for defining the domain- or application-specific Content Data Model. The Content Data Model layer sits conceptually below the Content Management layer, which has the features needed to manipulate the model. The Content Data Model is the application-specific model based on the underlying Content Repository. The Content Repository defines the fundamental concepts and persistence mechanisms for any Content Data Model defined on top of it. The Content Management features are closely tied to the Content Administration layer, which administers the CMS stack.

The question was how the new functionality provided by the semantic services could be integrated into this architecture. The idea is to offer a set of semantic services that can be used easily through a standardized communication protocol. The CMS vendors agreed with and support this approach; they would like to see simple RESTful interfaces to these semantic services. The new situation is depicted on the next slide.

Semantic CMS Architecture
The figure and explanation are adapted from the study: Fabian Christ, Benjamin Nagel: A Reference Architecture for Semantic Content Management Systems.

The figure shows the architecture that lets traditional CMSs add semantic capabilities without a major change to the existing system. The adaptation can be examined in the four layers defined by an SCMS: Presentation & Interaction, Semantic Lifting, Knowledge Representation and Reasoning, and Persistence.

In a traditional CMS, the user is able to edit and consume content through a user interface. When dealing with knowledge in a Semantic CMS (SCMS), we need an additional layer at the user interface level that allows a user to interact with content, called Semantic User Interaction. For example, a user writes an article and the SCMS recognizes the name of a person in that article. The SCMS then includes a reference to an object representing that person, not only the person's name. The user can interact with the person object and see, e.g., its birthday.

In the Semantic Lifting layer, the SCMS provides algorithms that extract semantic metadata from the stored content, a capability that traditional content management systems lack. After the content has been lifted to a semantic level, the extracted information may be used as input for reasoning techniques in the Reasoning layer. To handle knowledge within the system we use Knowledge (Representation) Models that define the semantic metadata used to express knowledge. These metadata are often defined along some ontology that specifies so-called concepts and their semantic relations.

In the Persistence layer, triple stores are used to store knowledge represented as triples (subject, predicate, object), each indicating a relation between subject and object. To give a triple a semantic meaning, there should be Knowledge Models on top of the knowledge repository that specify the semantic meaning of each predicate.
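The triple idea from the Persistence layer can be sketched in a few lines. This is a minimal in-memory illustration, not how a production triple store works; the identifiers (`article:42`, `person:jane-doe`), the predicate names, and the birthday value are hypothetical.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")


class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the given pattern; None is a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.object == obj)
        )


# Hypothetical data: an article that references a person object,
# not just the person's name.
store = TripleStore()
store.add("article:42", "mentions", "person:jane-doe")
store.add("person:jane-doe", "name", "Jane Doe")
store.add("person:jane-doe", "birthday", "1970-01-01")

# Follow the reference from the article to the person and read a property.
person = store.match(subject="article:42", predicate="mentions")[0].object
birthday = store.match(subject=person, predicate="birthday")[0].object
```

The store itself attaches no meaning to `mentions` or `birthday`; that is the role of the Knowledge Models placed on top of the repository.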

Merge All Inputs

Workshops and brainstorming sessions: collected list of statements from CMS vendors, representing their view on a semantic CMS
e.g. legacy data: how to semantify them?
e.g. tagging: different for each person, rules for personalized tagging
Examination of existing systems: focus on industrial needs rather than theoretical thinking
Merge all input and come up with High Level Requirements

High Level Requirements
HLR-1: Common Vocabulary
HLR-2: Architecture and integration
HLR-3: Semantic lifting & tagging
HLR-4: Semantic search & semantic query
HLR-5: Reasoning on content items
HLR-6: Links/relations among content items
HLR-7: Workflows
HLR-8: Change management, versions and audit
HLR-9: Multilingualism
HLR-10: Security

The refinement process
Starts with HLRs and ends with testable software requirements.

After the high level requirements (HLRs) are defined, each HLR is refined using the following refinement process. The process starts with the HLR, produces use cases (UC), and results in lists of testable software requirements (R) for the system to be developed. The figure in the slide depicts the refinement graph, a directed acyclic graph (DAG) that emerges from this process.

The refinement process
The requirements refinement process iterates over all HLRs. For each HLR two refinement steps are performed. The process is depicted in the next figure.

The first refinement step is to specify scenarios and to extract and consolidate use cases from these scenarios. The result is a set of scenarios and use cases for each HLR. Consolidating the use cases is important for identifying relationships between them and keeping them consistent with each other.

The second refinement step is to extract and consolidate the resulting requirements. The software requirements result from the use cases, so that each use case relates to one or more software requirements. A key characteristic of these requirements is their testability. For this purpose the requirements are formulated as simple statements like "The system shall be able to...". This formulation is keyword based according to [RFC2119] (see section 4).

The refinement process is implemented as an open participation process that supports constant input from the involved target groups. The research partners coordinated the process and consolidated the input; they also proposed requirements based on the input of the industrial partners. To achieve this in a distributed setup of partners, the documented results were available online at all times, so that the partners could add comments and make further suggestions.
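The refinement graph and the keyword-based formulation can be sketched as follows. The node identifiers (`UC-A`, `UC-B`, `R-1`, `R-2`) and the edges are hypothetical; the two requirement statements are taken from the HLR 1 slide later in this lecture. The keyword check only looks for the presence of an RFC 2119 keyword, which is a precondition for the formulation described above, not a proof of testability.

```python
import re

# Keywords defined by RFC 2119, matched case-insensitively as whole words
# (so the compound forms such as "SHALL NOT" are covered too).
RFC2119 = re.compile(
    r"\b(must|shall|should|may|required|recommended|optional)\b",
    re.IGNORECASE,
)

# Hypothetical refinement graph: HLR -> use cases -> requirements.
# A requirement may be reached from several use cases, hence a DAG.
REFINES = {
    "HLR-1": ["UC-A", "UC-B"],
    "UC-A": ["R-1"],
    "UC-B": ["R-1", "R-2"],
}

REQUIREMENT_TEXT = {
    "R-1": "The Vocabulary shall be navigable.",
    "R-2": "Vocabularies shall always be accessible.",
}


def leaf_requirements(node, graph=REFINES):
    """Collect the requirements reachable from an HLR or use case."""
    children = graph.get(node, [])
    if not children:
        return {node}
    found = set()
    for child in children:
        found |= leaf_requirements(child, graph)
    return found


def missing_keyword(requirements=REQUIREMENT_TEXT):
    """List requirement ids whose text contains no RFC 2119 keyword."""
    return sorted(
        rid for rid, text in requirements.items() if not RFC2119.search(text)
    )
```

Walking the graph from an HLR down to its leaves gives the traceability that the consolidation steps rely on: every requirement can be traced back to at least one use case and one HLR.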

HLR 1 Common Vocabulary

For a common understanding for users
Relating a content item with clear and precise vocabulary items
Services and engineering of external ontologies, taxonomies, thesauri
4 scenarios upon the collected information, e.g. statements from CMS vendors:
“Agree on a set of categories and relations, attributes as the default set”
“Help in finding good vocabularies”

To support semantic services on top of the CMS, there needs to be support for common vocabularies, which establish a common understanding for users by relating a content item to clear and precise vocabulary items. These vocabularies can be external ontologies, taxonomies, or thesauri, and they can provide horizontal or domain knowledge. Services for engineering such vocabularies within the system are therefore a key requirement. The system services will use these vocabularies to provide semantic capabilities.

HLR 1 Common Vocabulary Use Cases
The figure on this slide shows the use cases refined from a high level requirement. This is the first step in the refinement process.

HLR 1 Common Vocabulary Resulting Requirements
Functional requirements: The Vocabulary shall be navigable …
Data requirements: Vocabulary shall be in one of standard format which …
Integration requirements: Vocabulary shall be in an accepted standard format
Interface requirements: an interface shall be implemented for presenting the list of Vocabularies
Non-functional requirements: Vocabularies shall always be accessible

In the second step of the refinement process, detailed requirements of different kinds are extracted from the use cases and scenarios that were consolidated in the first step.

HLR 2 Architecture and integration
Easy integration of the services to be developed into different heterogeneous system environments
RESTful service interfaces
The implementation should be as technology independent as possible
Should also provide technology-specific access to the services for best performance results

To allow easy integration of system functionality into different heterogeneous system environments, all provided functions should be accessible through RESTful service interfaces. The architecture should therefore be based on a service approach. The implementation should be as technology independent as possible, while also providing technology-specific access to the services for the best performance.

HLR 2 Architecture and integration
Everything should be accessible by a URI
Linked Data approach
The communication should be based on standardized text-based data formats, e.g. XML

The mantra behind providing each piece of functionality through RESTful services is that everything (data, functions, etc.) inside the system stack can be accessed by a URI. The system services need access to information that is inside the data repository of the CMS. Therefore, the system defines data access interfaces that must be supported by the CMS that integrates the system. The communication is based on standardized text-based data formats, e.g. XML.

HLR 3 Semantic lifting & tagging
Semantic tagging on content items: ontological classes, RDF properties, microformats
Extract semantics from structured and unstructured data automatically or semi-automatically
Make suggestions about annotations
Navigate on the content items in a semantic fashion

The system to be developed should provide services that enable semantic tagging of content items using semantic technologies such as ontological classes, RDF properties, and microformats. The system attaches importance to providing horizontal services that extract semantics from structured and unstructured data automatically or semi-automatically, make suggestions about annotations, and allow navigating the content items in a semantic fashion.

HLR 4 Semantic search & semantic query
Faceted search mechanisms on top of semantic query language support
Statements from the industry:
Similarity search, similarity detection
User-friendly RDF query
Support for disambiguation of search

One of the key outcomes of semantic enhancement of CMSs can be observed in the semantic query and search functionality of the system. Faceted search mechanisms on top of semantic query language support form the key requirements from this perspective. Semantic information about content should be used to improve the search capabilities. With semantic data, the system should extend the traditional search functionality to allow new ways of formulating search criteria and to provide "better" search results.
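Faceted search can be illustrated with a small sketch: semantic metadata attached to content items provides the facets, and each selected facet value narrows the result set. The items, facet names, and values below are hypothetical, and a real system would run this on top of a semantic query language, which this sketch does not attempt.

```python
from collections import Counter

# Hypothetical content items with semantic metadata usable as facets.
ITEMS = [
    {"id": "doc-1", "type": "Article", "place": "Paris", "language": "en"},
    {"id": "doc-2", "type": "Article", "place": "Berlin", "language": "de"},
    {"id": "doc-3", "type": "Image", "place": "Paris", "language": "en"},
]


def filter_items(items, **selected):
    """Keep the items whose metadata equals every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)


# Offer the "place" facet, then narrow down step by step.
places = facet_counts(ITEMS, "place")
paris_articles = filter_items(ITEMS, place="Paris", type="Article")
```

The counts are what a user interface would show next to each facet value, so the user can see how a selection will narrow the results before making it.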

HLR 5 Reasoning on content items
Extracting implicit information from the explicit information residing in the content repositories
“Semantic consistency check in CMSs”

Extracting implicit data from the explicit information residing in the content repositories is a key requirement for the horizontal services of the system to be developed. Reasoning on content managed by CMSs may reveal implicit relations and similarities between different content items that may be of interest to users. Furthermore, reasoning can be used in processes such as consistency checking and automatic categorization.
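A very simple form of such reasoning is to derive implicit relations from a transitive predicate. The sketch below assumes that `partOf` (mentioned earlier under relationship extraction) is treated as transitive and uses a naive fixed-point loop; real reasoners are driven by an ontology and are far more capable.

```python
def infer_transitive(triples, predicate):
    """Return the triples implied by treating `predicate` as transitive."""
    explicit = set(triples)
    known = set(explicit)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        edges = [(s, o) for s, p, o in known if p == predicate]
        for s1, o1 in edges:
            for s2, o2 in edges:
                if o1 == s2 and (s1, predicate, o2) not in known:
                    known.add((s1, predicate, o2))
                    changed = True
    return known - explicit


# Explicit statements, as they might result from semantic lifting.
explicit_facts = [
    ("Montmartre", "partOf", "Paris"),
    ("Paris", "partOf", "France"),
    ("France", "partOf", "Europe"),
]
implicit_facts = infer_transitive(explicit_facts, "partOf")
```

With these inferred statements, a search for content about France can also return an item that was only annotated with Montmartre.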

HLR 6 Links/relations among content items
Along with the semantic annotations of the content items, semantic relations among them should also be considered
“Instance linking, linked data cloud, whenever we create something link it with something existing”

Besides tagging in combination with ontological means, content entities can be (statically) linked. This process can be automated by algorithms that reason on the provided tags and ontologies. Content items are linked to each other during their lifecycles with the help of the relevant services inside a CMS. These links and relations need to be handled by the semantic services of the system. Because linking is already a standard technique in CMSs, the system to be developed should focus on automatic link creation by exploiting semantic algorithms and data.
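One simple way to automate link creation from tags is to suggest a link whenever two content items share enough tags. The sketch below shows only this heuristic; the threshold, item ids, and tags are hypothetical, and it does not use an ontology as a fuller solution would.

```python
from itertools import combinations


def suggest_links(tags_by_item, min_shared=2):
    """Suggest a link for every pair of items sharing enough tags."""
    links = []
    # Sorting by item id keeps the output order deterministic.
    for (a, tags_a), (b, tags_b) in combinations(sorted(tags_by_item.items()), 2):
        shared = set(tags_a) & set(tags_b)
        if len(shared) >= min_shared:
            links.append((a, b, sorted(shared)))
    return links


# Hypothetical semantic tags attached to three content items.
tags = {
    "doc-1": {"Paris", "Louvre", "art"},
    "doc-2": {"Paris", "Louvre"},
    "doc-3": {"Berlin"},
}
suggested = suggest_links(tags)
```

Returning suggestions, rather than creating links directly, leaves the final decision with an editor or with a later workflow step.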

HLR 7 Workflows

Control flow/lifecycle of the content
Workflows for semantic actions similar to workflows for content
“Intelligent content workflows, configured based on organization, hierarchy”

Most CMSs have their own workflow management system to control the flow and lifecycle of content. The system should offer services that can be used to implement or extend workflow management as part of the CMS. Additionally, the system should provide workflows for semantic actions, similar to workflows for content. In this way the user can describe a workflow that defines which semantic reasoning and semantic extraction algorithms are applied to a new content entity.
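A workflow for semantic actions can be pictured as an ordered list of steps applied to each new content entity. In the sketch below, both steps are deliberately trivial stand-ins (a capitalized-word "extractor" and a one-rule "categorizer") invented for this example; only the idea of a user-defined sequence of semantic steps comes from the notes above.

```python
def extract_capitalized_words(entity):
    """Toy extraction step: treat capitalized words as candidate entities."""
    words = {
        word.strip(".,")
        for word in entity["text"].split()
        if word[:1].isupper()
    }
    return {**entity, "entities": sorted(words)}


def categorize_by_entities(entity):
    """Toy reasoning step: derive a category from the extracted entities."""
    category = "travel" if "Paris" in entity.get("entities", []) else "general"
    return {**entity, "category": category}


def run_workflow(entity, steps):
    """Apply each semantic step in the configured order."""
    for step in steps:
        entity = step(entity)
    return entity


# The workflow definition is just data, so a user could reconfigure it.
semantic_workflow = [extract_capitalized_words, categorize_by_entities]
result = run_workflow({"text": "Visit the Louvre in Paris."}, semantic_workflow)
```

Because each step takes an entity and returns an enriched copy, steps can be reordered, replaced, or added without touching the others.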

HLR 8 Change management, versions and audit
The system should also be aware of changing content and provide solutions to invalidate semantic data
Previously extracted semantic information might become invalid as the content changes
Content evolution
Semantic data evolution

Just as traditional CMSs provide content versioning and audit, the system must provide these concepts for semantic information. All services provided by the system should log their actions in a way that is comprehensible to a user (transparency), and each service should make it possible to undo an action. The system should also be aware of changing content and provide solutions to invalidate semantic data; for example, previously extracted semantic information might become invalid as the content changes. The problem of content evolution thus becomes a problem of semantic data evolution. These functions are not specific to the application domain of a CMS. Therefore, these services should be provided horizontally.
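One possible way to detect that extracted semantic data has become stale is to remember a digest of the content it was extracted from. The sketch below assumes this hash-based approach and a simple action log; the class and method names are hypothetical, and undo support is left out for brevity.

```python
import hashlib


class SemanticCache:
    """Keeps extracted annotations together with a digest of their source."""

    def __init__(self):
        self._entries = {}  # item id -> (content digest, annotations)
        self.log = []       # human-readable audit trail of actions

    @staticmethod
    def _digest(content):
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def store(self, item_id, content, annotations):
        self._entries[item_id] = (self._digest(content), annotations)
        self.log.append(("store", item_id))

    def lookup(self, item_id, content):
        """Return annotations, or None if missing or extracted from old content."""
        entry = self._entries.get(item_id)
        if entry is None:
            return None
        digest, annotations = entry
        if digest != self._digest(content):
            # The content changed, so the earlier extraction is invalidated.
            del self._entries[item_id]
            self.log.append(("invalidate", item_id))
            return None
        return annotations


cache = SemanticCache()
cache.store("doc-1", "Paris is in France.", ["Paris", "France"])
fresh = cache.lookup("doc-1", "Paris is in France.")
stale = cache.lookup("doc-1", "Berlin is in Germany.")
```

A `None` result tells the caller to run the extraction again on the current content, and the log records why the old annotations disappeared.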

HLR 9 Multilingualism

Services to be provided should be aware of content in different languages
Enabling a variety of users of different nationalities
Language support independent of the CMS application domain

The semantic services provided by the system should be aware of content in different languages and provide functions to reason about information even if it is in different languages. Furthermore, the services need to support multilingualism so that users of different nationalities can use the system. Multilingualism is a requirement of the horizontal services of the system, because language support is independent of the CMS application domain unless the CMS is designed for a specific language.

HLR 10 Security

The system must consider existing access control restrictions in CMSs
New kinds of restrictions which reflect the semantic data access, e.g. for algorithms that reason on existing data
Integration of permission, role and group models

In a CMS, content access can be configured using more or less fine-grained access controls. When using semantic algorithms, the system must respect these existing access control restrictions. Additionally, the services may consider new kinds of restrictions that reflect semantic data access, e.g. for algorithms that reason on existing data. The system needs a concept for integrating the permission, role and group models that normally exist as part of a CMS.
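The requirement that semantic services respect existing access controls can be sketched as a filter over semantic statements. The group-based ACL below is a hypothetical, simplified model of what a CMS might expose, and treating nodes without an ACL entry as public is an assumption of this example only.

```python
def visible_triples(triples, user_groups, acl):
    """Keep only statements whose content items the user may read.

    `acl` maps a content item id to the set of groups allowed to read it.
    Nodes without an ACL entry (e.g. plain concepts) are treated as public.
    """

    def readable(node):
        return node not in acl or bool(acl[node] & user_groups)

    # A statement is hidden if either end refers to an unreadable item,
    # so that a link does not reveal the existence of restricted content.
    return [t for t in triples if readable(t[0]) and readable(t[2])]


# Hypothetical access control list taken over from the CMS.
acl = {"doc-1": {"editors", "public"}, "doc-2": {"editors"}}

triples = [
    ("doc-1", "mentions", "Paris"),
    ("doc-2", "mentions", "Berlin"),
    ("doc-1", "relatedTo", "doc-2"),
]
for_public = visible_triples(triples, {"public"}, acl)
for_editors = visible_triples(triples, {"editors"}, acl)
```

Checking both the subject and the object illustrates the "new kinds of restrictions" mentioned above: a relation produced by reasoning can leak information about an item even when the item itself is never displayed.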

Summary

The requirements evolved from a systematic requirements engineering approach
Started with the analysis of current CMSs and their similarities
Collection of the needs of CMS vendors in the field of semantic enhancement of their systems: workshops, brainstorming sessions, interviews
From the High Level Requirements (HLRs): necessary actors are defined, scenarios are constructed
From the scenarios for each HLR: use cases are extracted
From the use cases, the resulting requirements are refined into the following types: functional, data, integration, interface, non-functional