Customising OASIS CIQ Specifications V3.0 to meet end user requirements – A Case Study. Ram Kumar, Chairman, OASIS CIQ Technical Committee.


1 Customising OASIS CIQ Specifications V3.0 to meet end user requirements – A Case Study
Ram Kumar, Chairman, OASIS CIQ Technical Committee
http://www.oasis-open.org/committees/ciq
September 2007

2 Agenda
- Why this case study?
- Code List
  - What, Why, Standard
- OASIS Code List Representation TC
- Methodology: Schematron-based Value Validation using Genericode (from OASIS Code List TC)
- OASIS CIQ TC Implementation of OASIS Code List Specifications and Methodology – A Case Study

3 Why this Case Study?

4 Why this case study?
- Demonstrate how OASIS CIQ Specifications v3.0 can be customised to meet end user requirements
  - Without the customisation breaking conformance to the specifications
  - Improve interoperability of data defined/represented using CIQ Specifications
  - Define specific business rules using open industry standards to customise CIQ specifications
  - Define code lists of CIQ specifications using open industry standards

5 Code List

6 What is a Code List?
Also known as enumerations, controlled vocabularies, or classification schemes and classification values.
- A set of values to choose from, which represent an agreed-upon semantic concept
- Days of a week = {“Mon”, “Tue”, “Wed”, “Thu”, “Fri”, “Sat”, “Sun”}
- Code List = List Name + Values
  - List Name = Days of a week
  - Values = {“Mon”, “Tue”, “Wed”, “Thu”, “Fri”, “Sat”, “Sun”}

7 Why are Code Lists important?
- It is not just element and attribute names in XML that need to be semantically unambiguous and aligned for interoperability
- The lexical form of element and attribute text content also needs to be aligned, i.e. simple data items need to be represented the same way
- This is more important for applications
- For data-oriented XML in particular (e.g. CIM), code lists are as important as elements and attributes: they form part of the complete vocabulary of the document

8 Standard for Code List
- If code lists were really so simple and obvious, there would be a single, well-known and accepted way of handling them in XML
- There is no agreed solution, though
- The problem is that while code lists are a well-understood concept, people do not actually agree on exactly what code lists are, and how they should be used

9 The code list is in the eyes of the beholder
- The XML schema may require only 3-letter codes to represent the code list
- The database may require a set of numeric codes, plus display labels (possibly in different languages)
- The application may need to know which 3-letter code corresponds to which numeric code, so that it can process the XML and update the database
- All of this code list information needs to be stored together in a single representation of the code list, so that all usages of the code list can be generated from the same source information
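The idea of one source representation feeding several usages can be sketched in Python. This is only an illustration: the `CountryEntry` type, the helper names, and the French labels are assumptions made for the example and do not come from the presentation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CountryEntry:
    alpha3: str    # 3-letter code used in the XML
    numeric: int   # numeric code used in the database
    label_en: str  # display label, English
    label_fr: str  # display label, French

# Single source of truth for the code list (a small excerpt).
COUNTRIES = [
    CountryEntry("AUS", 36, "Australia", "Australie"),
    CountryEntry("NZL", 554, "New Zealand", "Nouvelle-Zélande"),
    CountryEntry("SGP", 702, "Singapore", "Singapour"),
]

def xml_codes(entries):
    """Values the XML layer accepts."""
    return {e.alpha3 for e in entries}

def database_rows(entries):
    """Rows the database layer stores: numeric code plus labels."""
    return [(e.numeric, e.label_en, e.label_fr) for e in entries]

def alpha3_to_numeric(entries):
    """Mapping the application needs to update the database from XML."""
    return {e.alpha3: e.numeric for e in entries}
```

All three views are derived from `COUNTRIES`, so a change to the list is made in one place only.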

10 The only constant is change
- Code lists change
- For a code list model to be useful, it has to account for the fact that code lists will change over time
- There is little use in having a code list model that works only for a code list that is frozen in time
- The code list model has to support changes between versions of a code list

11 The only constant is change (continued)
- Not all changes to a code list are version changes, however
- Some changes may be local changes to a distributed code list
- The ISO 3-letter currency code list contains GBP for British Pounds. However, prices on the London Stock Exchange are normally quoted in pence
- This has led to the practice of adding an extra code to the standard ISO list (e.g. GBp, GBX) in order to support pence as well as pounds
- This kind of customisation is far from uncommon
- The utility of any code list model is greatly reduced if it does not cater for local modifications of code lists

12 OASIS Code List Representation Technical Committee
- The OASIS Code List Representation format, “genericode”, is a single model and XML format (with a W3C XML Schema) that can encode a broad range of code list information
- The XML format is not designed for run-time or real-time use. Rather, it is a standardized interchange format that is expected to be transformed into an optimized representation for such use
- 27 of the 40 requirements gathered are implemented in v1.0 of the specifications

13 Genericode Model
- Has a tabular structure for code list information
- Each row in the table represents a single distinct entry in the code list, i.e. a single uniquely identifiable item in the code list
- Each column in the table represents a metadata value that can be defined for each distinct entry in the code list. Each column is either required or optional. A required column does not allow any row to have an undefined (nil or null) value. An optional column allows undefined values
- A genericode key is a set of one or more required columns that together uniquely identify each distinct entry in the code list. Optional columns cannot be used for keys. Each code list must have at least one key
- Genericode keys are equivalent to what people usually mean when they talk about the “codes” in a code list. However, genericode allows multiple keys for each code list, and there is no single preferred key
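The row, column and key rules above can be expressed as a small consistency check. The sketch below is a Python model of those rules only; it does not read genericode XML, and the function name, the data layout, and the sample list are assumptions made for illustration.

```python
def check_code_list(columns, keys, rows):
    """Check a tabular code list against the rules described above.

    columns: dict mapping column name -> True if required, False if optional
    keys:    list of tuples of column names
    rows:    list of dicts mapping column name -> value (None = undefined)
    Returns a list of problem descriptions (empty when consistent).
    """
    problems = []
    if not keys:
        problems.append("code list must have at least one key")
    for key in keys:
        for col in key:
            if not columns.get(col, False):
                problems.append(f"key column {col!r} must be a required column")
    for i, row in enumerate(rows):
        for col, required in columns.items():
            if required and row.get(col) is None:
                problems.append(f"row {i}: required column {col!r} is undefined")
    for key in keys:
        seen = set()
        for i, row in enumerate(rows):
            value = tuple(row.get(col) for col in key)
            if value in seen:
                problems.append(f"row {i}: duplicate value {value!r} for key {key!r}")
            seen.add(value)
    return problems

# Hypothetical list with two keys: the short code and the full name.
DAY_COLUMNS = {"code": True, "name": True, "note": False}
DAY_KEYS = [("code",), ("name",)]
DAY_ROWS = [
    {"code": "Mon", "name": "Monday"},
    {"code": "Tue", "name": "Tuesday", "note": None},
]
```

Declaring both `code` and `name` as keys mirrors the point that a code list may have several keys with no single preferred one.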

14 Concept
- Keep code lists (aka enumerations) out of the core XML schema by using “schemes”
- The idea is that the code list from which an element value is taken is indicated via a “scheme” attribute containing a URI which represents the scheme (code list)
- This is the same way that URIs are used to represent XML namespaces
- This is done so that a newer version of the core XML schema need not be released just because an externally controlled enumeration that it uses has changed (e.g. country code)
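A minimal Python sketch of the scheme idea follows. The `scheme` attribute name follows the slide, but the element name, the URIs, and the in-memory registry are hypothetical; in practice the lists would be loaded from external files.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry of externally maintained code lists, keyed by scheme URI.
CODE_LISTS = {
    "urn:example:codelist:country:1": {"AUS", "NZL"},
    "urn:example:codelist:country:2": {"AUS", "NZL", "SGP"},
}

def value_is_valid(element):
    """True when the element text belongs to the list named by its scheme URI."""
    allowed = CODE_LISTS.get(element.get("scheme"))
    if allowed is None:
        return False  # unknown or missing scheme
    return (element.text or "").strip() in allowed

newer = ET.fromstring('<Country scheme="urn:example:codelist:country:2">SGP</Country>')
older = ET.fromstring('<Country scheme="urn:example:codelist:country:1">SGP</Country>')
```

Adding a country only changes the registry entry for a new scheme URI; the schema that declares `Country` stays the same.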

15 Methodology: Schematron-based Value Validation using Genericode

16 XML Instance Document Validation
XML instance documents can be validated against the applicable XML Schema.
[Figure: graphical schema view and text view of an XML instance (XMLinstanceXXX.xml) in namespace urn:oasis:names:tc:ciq:xNL:3; sample values shown: 5.0, 1967-08-13, 123456, John Smith...]

17 Background (Glossary)
- XML Data Content: in an XML instance document, any values
  - between XML angle brackets ‘>’ and ‘<’, and
  - between the quotes of an attribute
  are message data content
Examples: 1960-06-09, AUS, Australia

18 Background (Glossary), continued
Types of XML data content:
- Code values (example: AUS)
- Other values, i.e. non-code values (example: 1960-06-09)

19 W3C XML Schema Limitations
W3C XML Schema is mostly about data structures, but it does some data content validation:
- It has good support for
  - data type conformity
  - min/max values
  - length, patterns …
- It has limited support for
  - enumerations
- It has no support for
  - complex business rules
  - versioned changes of validation (without affecting the Schema’s version)

20 Business Rules Examples
- Date Arithmetic: BirthDate < CurrentDate – 6 Years
- Attribute Value Restriction:
  - The code list value “First Name” cannot occur more than once
  - The code list value “Last Name” cannot occur more than once
- Element Use Restriction: the Country element is optional, but cannot occur more than once
- Zero-length string:

21 Business Rules Examples, continued
- Code Lists: the code list (+version) used by CountryCode must be an accepted code list (example value: AUS)
- Code Value: CountryCode ‘XYZ’ must be valid in that Country code list version (example value: AUS)
- Co-occurrence: if Status=‘Closed’ then ClosureReason must also be present (example values: Closed, Obsolete)
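Two of the rules above, the date arithmetic and the co-occurrence rule, can be written out as plain checks. The Python sketch below is only an illustration of the logic; the record layout, the function names, and the error wording are assumptions, and the methodology itself expresses such rules in Schematron rather than Python.

```python
from datetime import date

def years_before(day, years):
    """Return the date `years` years before `day` (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)

def check_record(record, today):
    """Apply the two business rules to a dict-based record and list the failures."""
    errors = []
    # Date arithmetic: BirthDate < CurrentDate - 6 years
    if not record["BirthDate"] < years_before(today, 6):
        errors.append("BirthDate must be more than 6 years before the current date")
    # Co-occurrence: Status='Closed' requires a ClosureReason
    if record.get("Status") == "Closed" and not record.get("ClosureReason"):
        errors.append("ClosureReason must be present when Status is 'Closed'")
    return errors

ok = {"BirthDate": date(1967, 8, 13), "Status": "Closed", "ClosureReason": "Obsolete"}
bad = {"BirthDate": date(2005, 1, 1), "Status": "Closed"}
```

Neither rule can be stated as a datatype, length or pattern constraint, which is why they sit outside what the schema validates.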

22 Data Content Validation Conclusion
- XML Schema does not cover all data content validation requirements
- Embedding content validation in XML Schema has undesired consequences for re-use and Schema versioning
- Business rules vary more frequently than schema constraints, and the business rules between different partners vary where the schema constraints remain the same
- By layering value constraints on top of structural/lexical constraints, the schemas can remain unchanged while being adapted to different partners through different value constraints
- Is data content validation required?
- How can data content be validated in XML instances?

23 Without Data Content Validation in XML
Content validation at A and at B:
- Program code
- Database constraints
Interoperability issues:
- Is validation at A equivalent to validation at B?
- Data quality of message is difficult to control
- Communication of data quality issues between A & B
- Relies on trust in the sender
- Hard to ascertain equal interpretation of codes
[Diagram labels: XML file; W3C XML Document Schema; Schema Validation; Design; Implementation; Data Exchange Partner Agreement]

24 With Data Content Validation in XML
Sender’s and receiver's data content validation must be
- electronic
- portable
- of shared logic and error output
- platform-independent
- versioned
[Diagram labels: XML file; W3C XML Document Schema; 1. Schema Validation; XML Content Validation; 2. Content Validation; Design; Implementation; Data Exchange Partner Agreement]

25 With Data Content Validation in XML (continued)
Sender’s and receiver's data content validation must be
- electronic
- portable
- of equivalent logic and error output
- platform-independent
- versioned
[Diagram labels: XML file; W3C XML Document Schema; 1. Schema Validation; Methodology; 2. Content Validation; Design; Implementation; Data Exchange Partner Agreement]

26 Methodology - Features
- Code Value Validation
  Example: CountryCode must be a valid CountryCode
- Code List Metadata Validation
  Example: CountryCode must belong to an agreed, named Country Code list (+version), e.g. urn:oasis:names:tc:ciq:xNL:3:codelist:gc:Country-1
- Complex Rules Validation
  Examples:
  - BirthDate < CurrentDate
  - StatusCode ‘Closed’ requires a ClosureReason

27 Methodology - Features, continued
- Completely separate from W3C XML Schema
- Platform-independent ISO/IEC 19757-3 Schematron (implemented using W3C XSLT stylesheets), an open industry standard
- Completely independent of any XML Naming and Design Rules (NDRs)
- Versioning in isolation from the XML Schema

28 Methodology - Process Overview
Schematron-based Value Validation using Genericode
[Diagram labels: Data Exchange Partner Agreement; Data Content Validation Requirements; Validation Coding; transform / generate; W3C XML Validation Stylesheet]

29 Methodology - Involved Roles
Schematron-based Value Validation using Genericode
[Diagram labels: Data Content Validation Requirements; Validation Coding; transform / generate; W3C XML Validation Stylesheet. Roles shown: Business Analysts & Testers; Users (Developers); (Data Architects); Value Validation Service Staff; Run-time Operator; Specialist; Documentation; Developers & Testers; Users]

30 Methodology Run-Time Components
[Diagram labels: XML file; W3C XML Document Schema(s); W3C XML Validation Stylesheet]

31 Methodology - Value Validation
- The validation process involves the use of the Schematron language and XSLT stylesheets
- Schematron is a rule-based XML schema language, developed by Rick Jelliffe and internationally standardized as ISO/IEC 19757-3, which uses XPath expressions to describe validation rules
- Schematron is used to confirm the success or failure of a set of assertions made about XML document instances
- Schematron can be used as an adjunct to DTDs, RELAX NG or XML Schemas. It allows co-occurrence constraints, non-regular constraints, and inter-document constraints
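The assertion model described above (a rule selects context nodes, then each assertion is tested against every selected node) can be imitated in Python for illustration. This is an analogue, not Schematron: real rules are written in XML with XPath tests and typically compiled to XSLT. The `Party` and `Country` element names, the rule table, and the message text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Each rule: (context path, test applied to each context node, failure message).
RULES = [
    (
        ".//Party",
        lambda node: len(node.findall("Country")) <= 1,
        "Country is optional but cannot occur more than once",
    ),
]

def run_assertions(root, rules):
    """Return the message of every assertion that fails, one per context node."""
    failures = []
    for context, test, message in rules:
        for node in root.findall(context):
            if not test(node):
                failures.append(message)
    return failures

DOC = ET.fromstring(
    "<Parties>"
    "<Party><Country>AUS</Country></Party>"
    "<Party><Country>AUS</Country><Country>NZL</Country></Party>"
    "<Party/>"
    "</Parties>"
)
```

Because the rules live in a table separate from the document structure, they can be versioned and swapped without touching the schema, which is the property the methodology relies on.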

32 Methodology - Overview

33 Methodology Data Flow Diagram

34 Methodology - Process
[Diagram labels: Default Code List (gc) with values A B C D E F; Customised Code List (gc) with values B C G H; References; CVA; sch; XSL; XSD; Methodology; XML; XML structure validation; Code list validation; Validated; Application A]

35 Application of the Process in an Enterprise
[Diagram labels: Enterprise Code Lists; Enterprise XML Schemas; Methodology; Application A: customised enterprise code lists, business rules; Application B: customised enterprise code lists, business rules]

36 Methodology - Status
- OASIS Code List TC draft standard 0.1 (was version 0.8 under OASIS UBL TC)
- No known platform-independent alternative
- Plug-and-play run-time component
- Methodology can evolve without impacting run-time requirements
[Diagram labels: W3C XML Document Schema; W3C XML Validation Stylesheet]

37 Methodology - Benefits
- Verify that the instance document is valid as per the Data Exchange Partner Agreement (DEPA)
- Validate data content platform-independently
- Sender and receiver get the same validation result
- Strong candidate to become a global industry standard (UN/CEFACT is taking an interest)
- W3C Stylesheet and Schema are industry standards
- Simple run-time requirement (XSLT, Python, or any other ISO Schematron implementation)

38 Methodology - Benefits, continued
- Supports versioned validation in isolation from the schema version
- Documentation is in sync with implementation
- Validation can be switched on/off as required (by the message server or the application)
- Simplifies application coding
- Simple run-time requirement allows for evolution of the methodology
- Details of the methodology are transparent to operations

39 Methodology - Risks
- An OASIS draft standard
- Methodology not widely used yet
- Methodology may change or evolve
- Requires Schematron and XPath expertise
- Affects the XML instance document processing (extra steps)
- Affects the testing of XML Schema/XSLT release packages

40 OASIS CIQ TC Case Study – Using the “Schematron-based Value Validation using Genericode” Methodology to customise OASIS CIQ Specifications v3.0

41 OASIS CIQ Technical Committee
- Open industry specifications for defining party-centric data from a global (international) perspective
- Party: Person or Organisation
  - Name (241+ countries in over 36 formats)
  - Address/Location (241+ countries in over 130 formats)
  - Party Centric Attributes
  - Party Relationships
Delivering royalty-free, open, international, industry- and application-neutral XML specifications for representing, interoperating, and managing party (person/organization) centric information

42 Why Genericode and the Methodology for CIQ TC?
- Keeps code lists and values outside of the core CIQ XML Schemas
- Provides users with the ability to define the semantics for the data represented in the CIQ structure
- Provides users with the ability to customize the CIQ XML Schemas without modifying the CIQ XML Schemas
- Provides users with the ability to write business rules to constrain the structure of the CIQ XML Schemas without modifying the XML schemas

43 OASIS CIQ Specifications
- Party Name Schema: xNL.xsd
  - Supporting enumeration list (13): xNL-types.xsd
- Party Address Schema: xAL.xsd
  - Supporting enumeration list (32): xAL-types.xsd
- Party Information Schema: xPIL.xsd
  - Supporting enumeration list (60): xPIL-types.xsd

44 CIQ Specifications without Genericode Approach Code Lists defined in these 4 files

45 Use Party Name as Case Study

46 Code Lists defined in an XML Schema (xNL-types.xsd) that is “included” in xNL.xsd

47 Enumeration List referenced from xNL-types.xsd

48 xNL Enumeration List
- Users are given the choice to modify the code lists to meet their specific requirements
- Basic default values are provided, but it is up to the users to use them as is or to customise them

49 xNL Enumeration List - Drawbacks
- Each application has to have its own enumeration list
- Point-to-point negotiations between applications
- No standard enumeration list file that remains untouched
- A change in the enumeration list will result in a change to application code generation
- The Name schema might be used in multiple locations in an organisation (e.g. billing, marketing, sales, customer identification) and hence, customising the enumeration list is not straightforward
- It might be an overhead for an application to use a large code list when it requires only 3 values

50 Objective of this case study
- Move away from embedding code lists as XML schemas that are “included” or “imported” into base XML schemas
- Investigate the use of the genericode approach and UMCLVV in CIQ Specifications
- Implement the genericode approach in CIQ Specifications as an optional feature
- Customise the genericode-based default code lists to specific requirements without modifying the default code lists
- Apply business rule constraints on the core CIQ XML schemas without modifying the XML schemas

51 Case Study - Scenarios
- Add a new code list value to the default name code list (“NativePlaceName”)
- Restrict the default name code list to allow no more than one first and one last name (“FirstName”, “LastName”)
- Restrict the default code list to allow only “FirstName”, “LastName”, and “NativePlaceName” as code values
- Apply business rule constraints on the XML Schema
Customising the default xNL code list to cater for the above requirements, without changing it, is impossible

52 Preparing xNL Schema with Genericode Approach to Handle Code Lists

53 Step 1 - Create default .gc files
- Identify and decide on the list-level and instance-level metadata to be included
- Create a .gc file for each enumeration list in xNL-types.xsd
- Ensure that each .gc file is structurally valid against the genericode-code-list.xsd file

54 .GC file - Example Code Value

55 List Level Metadata

56 Instance Level Metadata
- In the absence of metadata properties for values in the instance being validated, only the values found in the associated external list representation can be used. There being no qualification of the values in the instance, all values in the external file are in play as valid values for validation
- If the instance being validated does have metadata properties specified for a given value, then that value is asserted to be a value from a particular version or identified list of values
- Instance level metadata allows an instance to disambiguate a coded value that might be the same value from two different lists

57 Step 2: Modify xNL.xsd
- Remove references to enumeration lists defined as XML schemas
- Include distinct instance level metadata for all elements/attributes that use code list values
- Instance level metadata used:
  - Ref == genericode ShortName
  - Ver == genericode Version
  - URI == genericode CanonicalUri
  - VerURI == genericode CanonicalVersionUri
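How instance level metadata narrows the set of lists in play can be sketched as follows. The Ref/Ver/URI/VerURI names and their genericode counterparts come from the slide above; the two sample lists, their URIs, and the function names are hypothetical and chosen only to show the matching logic.

```python
# Instance-level attribute -> genericode list-level metadata item (per the slide).
ATTR_TO_META = {
    "Ref": "ShortName",
    "Ver": "Version",
    "URI": "CanonicalUri",
    "VerURI": "CanonicalVersionUri",
}

# Hypothetical lists: two versions of the same name element code list.
KNOWN_LISTS = [
    {
        "ShortName": "NameElementList",
        "Version": "1.0",
        "CanonicalUri": "urn:example:codelist:name-element",
        "CanonicalVersionUri": "urn:example:codelist:name-element:1.0",
        "values": {"FirstName", "LastName"},
    },
    {
        "ShortName": "NameElementList",
        "Version": "2.0",
        "CanonicalUri": "urn:example:codelist:name-element",
        "CanonicalVersionUri": "urn:example:codelist:name-element:2.0",
        "values": {"FirstName", "LastName", "NativePlaceName"},
    },
]

def candidate_lists(instance_meta, lists):
    """Lists whose metadata agrees with every property given in the instance."""
    return [
        code_list for code_list in lists
        if all(code_list[ATTR_TO_META[attr]] == value
               for attr, value in instance_meta.items())
    ]

def is_valid(value, instance_meta, lists):
    """With no instance metadata, every associated list is in play."""
    return any(value in code_list["values"]
               for code_list in candidate_lists(instance_meta, lists))
```

With an empty `instance_meta` the value only has to appear in some associated list; adding `Ver` or `VerURI` pins it to one version.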

58 Instance Level Metadata
[Figure: instance level metadata for the “ElementType” attribute (xs:string)]

59 Step 3: Prepare Context/Value Association (CVA) File
- Every element and attribute information item below the document element of an XML document is in a document context described by its hierarchical ancestry of elements. A fully qualified document context specifies the information item’s precise location in the document
- Define all the default document contexts with pointers to the default genericode files produced from xNL-types.xsd
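The association between a document context and a code list can be illustrated with a small Python table. This is a sketch of the concept, not the CVA file format: the xNL namespace is the one shown on the XML instance slide and “ElementType” appears on the instance level metadata slide, while the `PersonName`/`NameElement` instance, the unqualified attribute, and the code list excerpt are simplifying assumptions.

```python
import xml.etree.ElementTree as ET

XNL = "urn:oasis:names:tc:ciq:xNL:3"

# Hypothetical excerpt of a default code list for name element types.
NAME_ELEMENT_TYPES = {"FirstName", "MiddleName", "LastName"}

# Each association: (document context, attribute carrying the code, allowed values).
ASSOCIATIONS = [
    (f".//{{{XNL}}}NameElement", "ElementType", NAME_ELEMENT_TYPES),
]

def invalid_codes(root, associations):
    """Return (attribute, value) pairs that are not in the associated list."""
    found = []
    for context, attribute, allowed in associations:
        for node in root.findall(context):
            value = node.get(attribute)
            if value is not None and value not in allowed:
                found.append((attribute, value))
    return found

SAMPLE = ET.fromstring(
    f'<PersonName xmlns="{XNL}">'
    '<NameElement ElementType="FirstName">John</NameElement>'
    '<NameElement ElementType="Nickname">Johnny</NameElement>'
    '</PersonName>'
)
```

Pointing the same context at a different set of values changes what is accepted without any change to the schema or to the instance.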

60 CVA File

61 Step 4 - Prepare files for Value Validation
- Run the supplied batch/shell files as part of the Methodology process to create the necessary files for code list value validation

62 Applying Constraints to Default Code Lists

63 Default Schema and Code List Values
- Add a new code value “NativePlaceName”
- Restrict the code values to have only “FirstName” and “LastName”

64 Step 1 – Add a new code list value
- Add a new code list value “NativePlaceName”
- Create a .gc file with this code value

65 Step 2 – Restrict the default code list
- Restrict the code values to only “FirstName” and “LastName”
- Create a .gc file with this restriction
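The effect of steps 1 and 2 can be pictured as set operations over code lists. In this Python sketch each set stands in for one .gc file; the contents of the default list are a hypothetical excerpt, and the function name is an assumption.

```python
# Stand-in for the untouched default list (hypothetical excerpt).
DEFAULT_NAME_ELEMENTS = frozenset({"Title", "FirstName", "MiddleName", "LastName"})

# Stand-in for the .gc file created in step 1 (the added value).
ADDED_VALUES = frozenset({"NativePlaceName"})

# Stand-in for the .gc file created in step 2 (the restriction).
RESTRICTED_VALUES = frozenset({"FirstName", "LastName"})

def effective_values(default, restricted, added):
    """Values accepted after customisation; the default list is never modified."""
    if not restricted <= default:
        raise ValueError("a restriction may only keep values from the default list")
    return restricted | added

ALLOWED = effective_values(DEFAULT_NAME_ELEMENTS, RESTRICTED_VALUES, ADDED_VALUES)
```

The customised result matches the scenario of allowing only “FirstName”, “LastName” and “NativePlaceName”, while the default list file stays as published.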

66 Step 3 – Create Restriction CVA File

67 Applying Business Rules to Constrain Default XML Schemas

68 Step 4 – Define Business Rules to include constraints to default schema
Restrict the schema to accept only one First Name and one Last Name
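The rule on this slide amounts to a count over the name elements of a person name. The Python sketch below shows that logic only; in the methodology the rule is written in Schematron. The `PersonName`/`NameElement` instance and the unqualified `ElementType` attribute are simplifying assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = "{urn:oasis:names:tc:ciq:xNL:3}"

def name_rule_violations(person_name):
    """Report code values that occur more than once among the name elements."""
    counts = Counter(
        element.get("ElementType")
        for element in person_name.iter(NS + "NameElement")
    )
    return [
        f"{value} occurs {counts[value]} times; at most one is allowed"
        for value in ("FirstName", "LastName")
        if counts[value] > 1
    ]

TWO_FIRST_NAMES = ET.fromstring(
    '<PersonName xmlns="urn:oasis:names:tc:ciq:xNL:3">'
    '<NameElement ElementType="FirstName">John</NameElement>'
    '<NameElement ElementType="FirstName">Jack</NameElement>'
    '<NameElement ElementType="LastName">Smith</NameElement>'
    '</PersonName>'
)
```

The schema still permits repeated name elements; only this extra layer rejects the second “FirstName”.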

69 Business Rules to define constraint No changes to xNL Schema

70 Step 4 - Prepare files for Value Validation
- Run the supplied demonstration batch/shell files as part of the Methodology process to create the necessary files for value validation

71 CIQ Global Address Specification (xAL)
Can be customized to a specific country's address structure using the Methodology, while keeping the customized structure in compliance with the xAL default structure

72 Example 1: Customizing xAL for Singapore
- Let us assume that a Singapore address does not require the following xAL elements:
  - Administrative Area
  - Rural Delivery
  - Post Office
  - Location Coordinates
  - Free Text Address
  - Country
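A restriction like this one can be checked by looking for the excluded elements in an address instance. The Python sketch below assumes XML element names derived from the labels above (for example `AdministrativeArea`), an xAL namespace of `urn:oasis:names:tc:ciq:xAL:3`, and a simplified sample address; none of these are given in the slides and they should be checked against xAL.xsd.

```python
import xml.etree.ElementTree as ET

# Assumed element names for the xAL parts a Singapore address does not need.
EXCLUDED_FOR_SINGAPORE = {
    "AdministrativeArea",
    "RuralDelivery",
    "PostOffice",
    "LocationByCoordinates",
    "FreeTextAddress",
    "Country",
}

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]

def excluded_elements(address):
    """Sorted names of excluded elements that appear anywhere in the address."""
    return sorted(
        {local_name(e.tag) for e in address.iter()}
        & EXCLUDED_FOR_SINGAPORE
    )

ADDRESS = ET.fromstring(
    '<Address xmlns="urn:oasis:names:tc:ciq:xAL:3">'
    '<Country>Singapore</Country>'
    '<Thoroughfare>Orchard Road</Thoroughfare>'
    '<PostCode>238801</PostCode>'
    '</Address>'
)
```

An address that passes this check is still a valid xAL instance, since the rule only narrows what the default structure already allows.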

73 Example 1: Customising xAL for Singapore

74 Example 1: Business Rule for Singapore Address No changes to xAL Schema

75 Example 2: Customizing xAL to only use Free Text Address Lines

76 Business Rule for Example 2 No changes to xAL Schema

77 CIQ Specifications with Genericode Approach

78 Skills Required to use OASIS Code List Approach
- XML Schema Language
- Schematron Language
- XSLT (sometimes)
- XPath
- XML Processors/XML Parsers
- Batch Files / Shell Files

79 Experience using the Methodology and Genericode Approach
- Powerful
- The only standard for managing code lists now in industry
- Manual effort (requires patience)
- Painful without tool support
- But once everything has been set up, works beautifully
- Does not deal with mapping between schemas

80 References
- OASIS Codelist Representation (Genericode) Version 1.0, May 2007, http://docs.oasis-open.org/codelist/cd-genericode-1.0/doc/oasis-code-list-representation-genericode.pdf
- Schematron-based Value Validation and Genericode, Working Draft, Version 0.1, July 2007, http://www.oasis-open.org/committees/document.php?document_id=24810
- OASIS Code List Adaptation Case Study (OASIS CIQ), Version 0.3, July 2007, http://www.oasis-open.org/committees/document.php?document_id=24813
- OASIS Party Information Standards, http://www.oasis-open.org/committees/ciq

81 Special Thanks
- Ken Holman, Chair, OASIS Code List Representation TC
- Juerg Tschumperlin, Data Management Solutions, New Zealand

82 Thank You http://www.oasis-open.org/committees/ciq

