Modules, Profiles, and Application Schemas
John I. Bobbitt
POSC
License Agreement: © 2004, Petrotechnical Open Standards Consortium, Inc. All rights reserved. All access, receipt, and/or use of this document is subject to the POSC Product Licensing Agreement posted on the POSC Web site at http://www.posc.org/about/license.shtml.
Abstract: POSC is developing modules, tables, and components that can be used to build application schemas. The use of these standard schemas is encouraged in order to enhance interoperability. To use a module in an application schema generally requires some modification. This paper outlines the process of building an application schema by collecting together a set of modules. It also formalizes the concept of a profile, which makes the module fit for a particular purpose.
A module is the basic building block of an XML application schema. In simple terms, a module is a business object expressed in XML schema language [IntroModule].
For example, a module might be a casing description, a well header, a business associate, or a well test. Each of these has a business meaning, but no one of them alone constitutes an exchange set. For example, a well test module can be a full description of the test, along with its data. But it also needs information about the well in which the test was conducted. It might also need information about the business associates that conducted the tests and did the analysis.
Thus a module is not a complete schema ready for an application. A particular use would gather together a set of modules in order to constitute an application schema. An application schema can be thought of as a collection of modules (and possibly additional elements) that satisfy a business need. For example, a well test report might consist of well header, business associate, and the well test module. Collecting these together, possibly with some additional information, would constitute an application schema.
The other important property of a module is that it is interchangeable. When building an application schema, a user community may decide that the business associate module supplied by POSC is less fit for use than one that they have developed, or one that they have found elsewhere. The development of an application schema should allow the user community to use a module independent of its source. That is, they need not obtain all of their modules from the same schema developer.
Finally, the development of an application generally requires a chosen module to be profiled [ModulePolicies]. A properly developed module will usually be too general for any specific usage scenario. The user community will apply criteria to the module to make it usable for their purpose. For example, a regulatory body in the US would specify that the well name Identifier should be an API number, and the naming system will be ‘API’. A group in the UK North Sea specifying an offshore location would use a schema block developed specifically for offshore UK, rather than a schema block for the offshore US.
This document will cover the rules for development of a profile of a POSC module.
Module:
A business object, expressed in XML schema. A module may stand alone and have semantic meaning. A module has an identity [IntroModule][BuildModule].
Table:
A collection of tightly connected information, that does not have a natural identity. A table represents a property of a business object [XMLTables].
Original Schema:
The schema as originally developed by the developer, and emplaced for use. The original schema is intended to be imported and profiled for final use.
Application schema:
A collection of modules and possibly other schema elements that constitute a final document for a specific purpose. An application schema is developed and used by a particular user community. The modules are profiled for the particular application.
User Community:
A group that has a specific business purpose, and develops and uses an application schema to support that business purpose. A user community will have a particular usage case, and will develop an application schema to support that usage case.
XML Instance:
An XML document that contains data. An instance may also refer to a particular element or module.
Example: A schema may have an element, ‘Name’, of type string. An instance would be
<Name>Joe Bleaux</Name>
Restriction:
A limitation on an XML element. The data type allows a particular set of values to be used for an element. A restriction allows only a subset of those values. A value that is valid in the restriction must also be valid in the original schema.
Example: an element may allow a positive integer value. A restriction may restrict it to a value less than 90.
Extension:
An increase in the set of values for an element. An extended value is not in the original schema. However, it may be allowed for in the original schema.
Example: An extended complex datatype may be developed that allows additional elements to be added.
Subset:
A subset of a schema (in particular, of a module) is a set of restrictions on the module. Any XML instance that is valid in the subsetted schema is also valid in the original schema.
Profile:
The rules for using a schema. In this document, a profile will generally be applied to a module or a table. A profile is a set of restrictions and extensions that are applied by a user community when developing an application schema [ModulePolicies].
Home namespace:
The module is developed by a particular group, and is maintained by that group. The module should be encapsulated within a namespace as designed by the developer. The home namespace refers to the namespace given by the developer of the module.
User namespace:
The user community develops an application schema by combining modules from various developers in various namespaces. The user community defines its elements within its own namespace, which is the user namespace.
Foreign namespace:
The namespace of a module that was developed under a different namespace and is inserted into another module.
Example: A module is developed by developer A and is given a namespace of ‘A’. One of the elements of the module is an ‘Operator’, which is itself a module. A user community builds its application schema in a namespace, ‘B’. When using the module from A, the community decides to use a module from a third party to specify the ‘Operator’ that is in the original module. The user community would specify a foreign namespace, ‘C’, for this inserted module.
Import:
A set of schemas is brought into another schema using a schema ‘import’ element. A module that is imported is brought in with its own namespace.
Include:
A set of schemas is included in another schema using a schema ‘include’ element. A module that is included in a parent schema has the same namespace as the parent.
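The difference between the last two definitions can be sketched as follows. This is a minimal illustration only: the namespace URIs, the file names, and the WellInfo element are assumptions chosen to match the examples used later in this paper.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.B.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns:B="http://www.B.com"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <!-- import: the module keeps its home namespace (A) -->
  <import namespace="http://www.A.com" schemaLocation="WellHeader.xsd"/>
  <!-- include: the file must declare the same target namespace as this
       schema (B), or no target namespace at all -->
  <include schemaLocation="BAssociate.xsd"/>
  <!-- A type from namespace A used by an element in namespace B -->
  <element name="WellInfo" type="A:wellHeaderType"/>
</schema>
```

In an instance document, the imported components keep the A prefix while the included ones appear under B.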
The following example will be used to illustrate the rules for profiling and developing application schemas.
Developer A has a module (A:WellHeader.xsd) available for use.
WellHeader.xsd has a “plug” called _WellAssociate, which allows the use of a business associate module. It is intended that the concepts of ‘Operator’, ‘RoyaltyOwner’, ‘Driller’, and any other appropriate business associates can be used in this spot.
Developer A has a business associate module (A:BusAssoc.xsd) available for use.
A user community, B, wants to develop an application schema. They have their own business associate module, (B:BAssociate.xsd) available for use. When developing their application schema, they also need a business associate module to specify a ‘Contact’ for the XML instance document.
There is also a module developed by C (C:BA) which may serve as a business associate module.
Sample schemas are shown in Appendix A.
A textual statement is merely a statement in a document that defines a particular usage. This statement will be a more restrictive usage, and shall not be a redefinition of an intended usage. The textual statements form part of the user community profile.
In addition, the schema may be modified to enforce a profile. Guidelines for valid modifications are the following:
Any optional element/attribute may be made mandatory.
Any element may be reduced in multiplicity. For example, an element with maxOccurs="unbounded" may be restricted to maxOccurs="1".
Any enumerated list may be restricted to fewer choices.
A simple data type may be restricted to a smaller domain.
Any other change not listed which is a strict subsetting may also be valid.
The test for whether a schema is a subset is that any XML instance that is valid for the subsetted schema is also valid for the original schema.
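The guidelines above might be applied as in the following sketch. All type and element names here (angleType, petType, sampleType, Angle, Pet) are hypothetical, and the comments record what the assumed original declaration was before subsetting.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema">

  <!-- Simple data type restricted to a smaller domain.
       Assumed original: <restriction base="positiveInteger"/> -->
  <simpleType name="angleType">
    <restriction base="positiveInteger">
      <maxExclusive value="90"/>
    </restriction>
  </simpleType>

  <!-- Enumerated list restricted to fewer choices.
       Assumed original list: dog, cat, bird -->
  <simpleType name="petType">
    <restriction base="string">
      <enumeration value="dog"/>
      <enumeration value="cat"/>
    </restriction>
  </simpleType>

  <complexType name="sampleType">
    <sequence>
      <!-- Optional element made mandatory (assumed original: minOccurs="0") -->
      <element name="Angle" type="A:angleType"/>
      <!-- Multiplicity reduced (assumed original: maxOccurs="unbounded") -->
      <element name="Pet" type="A:petType" maxOccurs="1"/>
    </sequence>
  </complexType>
</schema>
```

Every instance that validates against these declarations would also validate against the assumed originals, which is the subset test stated above.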
The schema for use by a user community may be copied from the original developer and physically edited to reflect the subsetting. However, the original developer shall remain the owner of the module.
Whether the original module is copied to another location, or used directly, it shall be imported into the application schema for use, and the original namespace shall be used as appropriate. The application schema must use a different namespace for additional modules and elements that are incorporated from other namespaces.
The extension methods are of three types: (1) use of different modules or schemas where allowed, (2) extensions of enumerated lists or insertion of the user community list, and (3) some use of schema methods that add additional elements and/or attributes, where the addition is not a replacement for an already defined element/attribute.
Extensions are carried out at well-defined, documented places, using standard XML schema methods. In general, extensions occur where a global element is defined to be abstract, or where a type is defined to be abstract, with the expectation that a user group will define the concrete type. There are specific schema methods and patterns for carrying out such extensions.
The redefine schema method must not be used.
Enumerated lists are also extended by standard methods, which will be documented with the modules. Any extensions of enumerated lists that cannot be handled by these documented methods shall be considered to be changes to the modules rather than valid extensions.
Extensions generally require that a schema be developed. The schema shall import the outside modules with the appropriate namespaces. Extensions shall be in another namespace. This ensures that an XML reader can understand which portions of the application schema come from the imported modules, and which are the extended portions.
The examples and explanations will be broken into three subsections. The first deals with subsetting. The second section will cover extensions to the schema that introduce new and/or different elements and attributes. The third section covers extensions to enumerated lists.
Refer to the simple module in Section A.1 for the examples.
The subsetting may be strictly done with text statements. Examples would be:
a) The WellName/Name shall always be the 12 digit API number.
b) The WellName/NamingSystem shall be mandatory, and shall always be ‘API’.
c) One, and only one, Operator shall be given.
d) The RoyaltyOwner shall be included if the WellPurpose is ‘exploratory’.
The subsetting may also be captured in the schema file. Note, though, that not every text restriction can be captured in a schema file. For example, d) cannot be captured, because its restriction depends on a value of an element, rather than on the existence and structure of an element.
The schema file shall be modified by copying it into developer B’s work area. However, it should not be modified in any way that invalidates the rules of subsetting (for example, don’t change WellName/Name to WellName/APINumber).
An example of the schema in A.1 that has been modified is shown in A.2. The modifications have been imposed to enforce the rules in b) and c). Note that the schema can also be modified by invoking a pattern for WellName/Name so that a) can be enforced. However, not all restrictions need to be incorporated, as long as the text document is considered part of the profile.
The new document retains the same namespace as the original namespace, and is to be imported into an application schema.
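The pattern restriction mentioned above for the WellName/Name rule is not part of the modified schema in A.2. A minimal sketch of how it might look, assuming the wellNameType structure from A.1 and the 12 digit format stated in the text rules, is:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType name="wellNameType">
    <sequence>
      <!-- Name must be exactly 12 digits. XML Schema patterns match
           the whole value, so no anchors are needed. -->
      <element name="Name">
        <simpleType>
          <restriction base="string">
            <pattern value="[0-9]{12}"/>
          </restriction>
        </simpleType>
      </element>
      <!-- NamingSystem is mandatory and fixed to API -->
      <element name="NamingSystem" type="string" fixed="API"/>
      <element name="Version" type="string" minOccurs="0"/>
    </sequence>
  </complexType>
</schema>
```

Because Name is still a string in the original schema, any instance accepted here remains valid against the original, so the change is a subsetting.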
Enumerated Lists:
An enumerated list MAY be restricted in its set of values. It may be done textually. For example, a list of valid Geodetic Datums may include the 228 datums in the EPSG datum list. However, a UK group may only allow the use of 7 of them appropriate to the UK North Sea area. A valid restriction would be a statement in a text file that lists the 7 that are allowed.
The group may also do the restriction using XML schema. This type of restriction may be performed by deleting many of the enumerations in the file (editing the file) defining the enumerated list. This method would lead to a subset, since an XML instance valid for the restricted list is also valid for the original list.
Another method is to define a new list using the schema restriction method, and using substitution groups to allow the use of the restricted set. This method, although restricting the list, will not be a subsetting operation, since it introduces a new element. Furthermore, it may only be done if the original schema allows it. Hence, it will be covered in section 4.2.3 which deals with enumerated lists.
A structural extension SHALL NOT be used when profiling unless there is a clear statement that such an extension is allowed. In POSC modules, these are indicated by abstract elements (implying the use of substitution groups) and abstract types (implying the use of extended types, with xsi:type appearing in the XML instance).
The example will use the schema in Appendix A.1, which has an abstract element, ‘_WellAssociate’. The schema also develops a non-abstract element, ‘WellAssociates’, which may be used in instance documents. The XML instance example shows how this is used.
Assume that group B, when developing an application schema, would like to use their own version of a business associate (BAssociateType, contained in the file BAssociate.xsd). The steps are as follows (see Appendix A.3 for the full schema):
In the application schema, import the POSC Well Header module file (A namespace).
Include the BAssociate.xsd file (B namespace).
If desired, import other “business associate modules” from other namespaces.
Follow the pattern shown in A.1 to develop a WellAssociatesContainerType (choose your own name) and an element (for example, MyWellAssociates) that contains the modules desired. It must be an extension of A:abstractBasicType, or whatever the abstract element type is.
Define an element of this extended type, and make it part of the A:_WellAssociate substitution group.
The process is similar for extending an abstract type, and using the xsi:type construct.
The construction adds an additional choice to the _WellAssociate location in the schema. At this point, both the A:WellAssociates and the newly developed B:MyWellAssociates containers would be possible. If the developer group wants to specifically require the use of B:MyWellAssociates, they can legally profile the schema by removing the A:WellAssociates definition in the original (copied) file. Such a replacement is a valid extension.
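For the abstract-type variant mentioned above, a minimal sketch might look like the following. It assumes that the original schema declares an abstract type and an element of that type; the names abstractAssociateType, Associate, and myAssociateType are hypothetical and do not appear in the sample schemas of this paper.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed declarations in the original schema (namespace A):
       <complexType name="abstractAssociateType" abstract="true">
         <sequence/>
       </complexType>
       <element name="Associate" type="A:abstractAssociateType"/>
     Both names are hypothetical. -->
<schema targetNamespace="http://www.B.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns:B="http://www.B.com"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <import namespace="http://www.A.com" schemaLocation="poscWHead.xsd"/>
  <include schemaLocation="BAssociate.xsd"/>

  <!-- Concrete type derived from the abstract type. No new element
       and no substitution group are needed. -->
  <complexType name="myAssociateType">
    <complexContent>
      <extension base="A:abstractAssociateType">
        <sequence>
          <element name="Operator" type="B:BAssociateType"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>

  <!-- In the instance the element keeps its original name, and
       xsi:type names the concrete type:
         <A:Associate xsi:type="B:myAssociateType">
           <B:Operator> ... </B:Operator>
         </A:Associate>
       The instance must declare the xsi prefix
       (http://www.w3.org/2001/XMLSchema-instance). -->
</schema>
```

The visible difference from the substitution group approach is in the instance: the element name stays in namespace A, and the reader learns of the extension from the xsi:type attribute rather than from a new element name.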
Extensions of enumerated lists can come in one of three forms: choose a list, add values to a list, and refine values [ReferenceData]. Each method must be accommodated in the original schema to be valid (explained below). If the accommodation is not in the original schema, the extended set of values is not a valid extension under the profiling rules.
Appendix A.4 shows a sample schema that illustrates the accommodations that allow a list to be extendable or refined. Appendix A.5 shows a sample of how an alternate list may be accommodated, and how an alternate list may be chosen.
The first accommodation is the “extendable by Other: ” accommodation. Appendix A.4 shows the construction of such a type. This allows a user at XML write time to add a value that is not in the enumerated list. The value is flagged by the six characters plus space, ‘Other: ’, that precede the new value. The profile should give the additional values and their meanings so that readers may understand them.
The second accommodation is the refinement attribute, and is denoted by a “refinable list.” This allows a user to give more meaning to the value, with the addition of a single layer of refinement.
Both the “extendable by Other: ” and the “refinable” extensions are accommodated in a text profile. There is no change to the schema.
The third accommodation is the alternate list. Appendix A.5 shows how the original schema is built to allow an alternate list to be used. It then shows how an application schema would incorporate an alternate list. The method only applies to elements that are defined as global elements, and are thus amenable to substitution groups. In addition, the POSC implementations will make the head element of the substitution group abstract so that it will be more easily recognized.
The application schema is generally developed as a collection of profiled modules. The application should be developed in its own namespace, and each of the modules should be imported, maintaining their original namespace.
POSC will develop modules as complexTypes. The application schema is free to use the complexType with its own elements defined in the user namespace. POSC will generally include a global element of the type, and this element can be used rather than using a user defined element. The difference in the two choices is simply the resulting namespace for this element.
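The two choices can be sketched as follows. The global element A:WellHeader is an assumption made for illustration (the sample schema in A.1 does not declare one), as are the reportType and Well names.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.B.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns:B="http://www.B.com"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <import namespace="http://www.A.com" schemaLocation="WellHeader.xsd"/>
  <complexType name="reportType">
    <!-- A real application schema would normally pick one of the two;
         the choice group is used here only to show them side by side -->
    <choice>
      <!-- Choice 1: user-defined element of the module type.
           The instance tag is B:Well. -->
      <element name="Well" type="A:wellHeaderType"/>
      <!-- Choice 2: the global element supplied with the module
           (hypothetical name). The instance tag is A:WellHeader. -->
      <element ref="A:WellHeader"/>
    </choice>
  </complexType>
</schema>
```

In both cases the child elements come from the module's own declarations, so with elementFormDefault="qualified" they stay in namespace A; only the wrapping element differs.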
When appropriate, a table may be used as a property. POSC will develop tables, which will also have a namespace assigned. Application schemas that incorporate a table will be expected to import the schema and maintain the namespace.
POSC will develop assemblies and blocks of schema that are below the module level, but are useful on their own. Namespace declarations will be left off of these constructs with the expectation that they may be used in many modules, and the user community may want these blocks to take on the user namespace. Such use is permitted. However, it is expected that the schemas that include these blocks will credit POSC for their usage. These assemblies and blocks should not be altered beyond the standard profiling rules given in this paper. Any alterations should be made in consultation with POSC.
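A block without a namespace takes on the namespace of the schema that includes it. The following is a minimal sketch of that behavior; the file name addressBlock.xsd, the addressType type, and the Contact element are hypothetical and are not actual POSC constructs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed content of addressBlock.xsd, which declares no
     targetNamespace:
       <schema xmlns="http://www.w3.org/2001/XMLSchema">
         <complexType name="addressType"> ... </complexType>
       </schema>
     A credit to the originator of the block would be recorded here. -->
<schema targetNamespace="http://www.B.com"
        elementFormDefault="qualified"
        xmlns:B="http://www.B.com"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <!-- The included components take on the including schema's
       target namespace (B) -->
  <include schemaLocation="addressBlock.xsd"/>
  <!-- The type is therefore referenced with the B prefix -->
  <element name="Contact" type="B:addressType"/>
</schema>
```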
[ANSIX12] X12 Reference Model for XML Design, 2002-10, produced by the ANSI X12 committee, obtainable at http://www.x12.org/x12org/.
[BestPractices] Best Practices Homepage, developed and maintained by XML-dev and Mitre, obtainable at http://www.xfront.com/BestPracticesHomepage.html.
[ComProServ] PIDX XML Standards Master, Version 1.0, RP 3901, produced by PIDX, obtainable at http://committees.api.org/business/pidx/standards.htm.
[EBCCNAM] ebXML RT - Naming Convention for Core Component, 2001-05-10, produced by the ebXML group, obtainable at http://www.ebxml.org/specs/index.htm#technical_reports.
[ebTechArch] ebXML Technical Architecture Specification V1.0.4, 2001-02-16, produced by the ebXML group, obtainable at http://www.ebxml.org/specs/ebTA.pdf.
[FedDevGuide] Draft Federal XML Developer's Guide, 2002-04 (work in progress), produced by the Federal CIO Council, obtainable at http://xml.gov/documents/in_progress/developersguide.pdf.
[FedTagStds] Federal Tag Standards for Extensible Markup Language, 2001-06, produced by LMI, not obtainable from the internet.
[HKGuide] XML Schema Design and Management Guide, (4 parts), Draft versions dated in summer, 2003. Produced by Hong Kong Information Services Technology Division. Available at http://www.itsd.gov.hk/itsd/english/infra/eif.htm.
[IETFKeywords] Key Words for Use in RFCs to Indicate Requirement Level, 1997-03, obtainable at http://www.ietf.org/rfc/rfc2119.txt.
[ISO8601] International Standard Date and Time, 2001-11-10, produced by ISO. A web page that explains the formats is http://www.cl.cam.ac.uk/~mgk25/iso-time.html.
[ISO11179] ISO 11179 Part 5 - Naming and Identification, 1995-12, produced by ISO, obtainable at http://fdr.faa.gov/iso/ISO11179page.htm. A later version is available from the ISO website.
[UKGuide] e-Government Schema Guidelines for XML, 2002-12, produced by United Kingdom e-Envoy, obtainable at http://www.e-envoy.gov.uk/Resources/Guidelines/fs/en.
[Unicode] Unicode Charts, available at http://www.unicode.org/charts/.
[W3CSchemaDatatypes] W3C Schema Datatypes, 2001-05-02, produced by W3C, obtainable at http://www.w3.org/TR/xmlschema-2.
[W3CNamespaces] Namespaces in XML, 1999-01-14, produced by W3C, obtainable at http://www.w3.org/TR/REC-xml-names/.
[W3CSchemaPrimer] W3C Schema Primer, 2001-05-02, produced by W3C, obtainable at http://www.w3.org/TR/xmlschema-0.
[W3CSchemaStructures] W3C Schema Structures, 2001-05-02, produced by W3C, obtainable at http://www.w3.org/TR/xmlschema-1.
[Xlink] W3C XLink Specification, 2001-06, produced by W3C, obtainable at http://www.w3.org/TR/xlink/.
[Xpath] W3C XPath Specification, 1999, produced by W3C, obtainable at http://www.w3.org/TR/xpath/.
[XSL] W3C XSL and XSLT Specifications, produced by W3C, obtainable at http://www.w3.org/Style/XSL/.
POSC references are available in the following formats:
[html] html format readable by browsers
[doc] MS Word 97/2000/XP
[sxw] OpenOffice writer, v1.0
[IntroModule] Introduction to Modules, Copyright 2002-2003. Available in [html], [doc], [sxw].
[BuildModule] Build a Module - a tutorial. Copyright 2003. Available in [html], [doc], [sxw].
[ImportModule] Importing Modules within your Modules. Copyright 2003. Available in [html], [doc], [sxw].
[Guidelines] Guidelines for XML Schemas, Version 2003. Copyright 2003. Available in [html], [doc], [sxw].
[ModulePolicies] Policies on Modules. Copyright 2002-2003. Available in [html], [doc], [sxw].
[ProfilesAppSchema] Modules, Profiles, and Application Schemas. Copyright 2002-2003. Available in [html], [doc], [sxw].
[XMLTables] XML Tables. Copyright 2003. Available in [html], [doc], [sxw].
[ReferenceData] Reference Data and Enumerated Lists Implemented in XML. Copyright 2002-2003. Available in [html], [doc], [sxw].
[Dictionaries] Examples of XML Dictionary Usage. Copyright 2003. Available in [html], [doc], [sxw]. Accompanied by sample code.
[Relationships] Relationships in XML. Copyright 2003. Available in [html], [doc], [sxw].
[UOMRecs] Unit of Measure Recommendations. Copyright 2002-2003. Available in [html], [doc], [sxw].
Several schemas will be shown in different namespaces. Various parts will be used to illustrate the various needs for profiling.
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <include schemaLocation="wellPurpose.xsd"/>
  <include schemaLocation="BusAssoc.xsd"/>
  <complexType name="wellHeaderType">
    <sequence>
      <element name="WellName" type="A:wellNameType"/>
      <element ref="A:_WellAssociate" minOccurs="0"
               maxOccurs="unbounded"/>
      <element name="WellPurpose" type="A:extWellPurposeEnum"
               minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <attribute name="id" type="string"/>
  </complexType>
  <!-- definition of the WellName. -->
  <complexType name="wellNameType">
    <sequence>
      <element name="Name" type="string"/>
      <element name="NamingSystem" type="string" minOccurs="0"/>
      <element name="Version" type="string" minOccurs="0"/>
    </sequence>
  </complexType>
  <!-- definition of the _WellAssociate element. This will be an
       abstract element to be used as a substitution group.
       It will be of type abstractBasicType, which is a complex type
       with no content -->
  <complexType name="abstractBasicType">
    <sequence/>
  </complexType>
  <element name="_WellAssociate" type="A:abstractBasicType"
           abstract="true"/>
  <!-- Add a nonabstract element to the _WellAssociate place.
       Remember that the BusAssoc.xsd file was 'included'.
       It contains a busAssocType -->
  <element name="WellAssociates" type="A:wellAssociatesType"
           substitutionGroup="A:_WellAssociate"/>
  <complexType name="wellAssociatesType">
    <complexContent>
      <extension base="A:abstractBasicType">
        <sequence>
          <element name="Operator" type="A:busAssocType"/>
          <element name="RoyaltyOwner" type="A:busAssocType"
                   minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <!-- The extWellPurposeEnum will be defined in another section.
       It is in the include file -->
</schema>
An example of an XML instance for this might be
<B:SomeRootElement
    xmlns:A="http://www.A.com"
    xmlns:B="http://www.B.com">
  <B:WellInfo id="wella">
    <A:WellName>
      <A:Name>420130013601</A:Name>
      <A:NamingSystem>API</A:NamingSystem>
    </A:WellName>
    <A:WellAssociates>
      <A:Operator>
        .. .. some information here .. ..
      </A:Operator>
    </A:WellAssociates>
    <A:WellPurpose>production</A:WellPurpose>
  </B:WellInfo>
  . . . other modules
</B:SomeRootElement>
The file is copied from the POSC area, and stored locally in poscWHead.xsd. It is modified as follows:
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- Also copy wellPurpose.xsd and BusAssoc.xsd, and any files
       included by these two. -->
  <include schemaLocation="wellPurpose.xsd"/>
  <include schemaLocation="BusAssoc.xsd"/>
  <!-- Modify the ref="A:_WellAssociate" to contain a single instance -->
  <complexType name="wellHeaderType">
    <sequence>
      <element name="WellName" type="A:wellNameType"/>
      <element ref="A:_WellAssociate"/>
      <element name="WellPurpose" type="A:extWellPurposeEnum"
               minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <attribute name="id" type="string"/>
  </complexType>
  <!-- definition of the WellName. -->
  <!-- Modify the WellName to make NamingSystem mandatory and fixed -->
  <complexType name="wellNameType">
    <sequence>
      <element name="Name" type="string"/>
      <element name="NamingSystem" type="string" fixed="API"/>
      <element name="Version" type="string" minOccurs="0"/>
    </sequence>
  </complexType>
  <!-- definition of the _WellAssociate element. This will be an
       abstract element to be used as a substitution group.
       It will be of type abstractBasicType, which is a complex type
       with no content -->
  <complexType name="abstractBasicType">
    <sequence/>
  </complexType>
  <element name="_WellAssociate" type="A:abstractBasicType"
           abstract="true"/>
  <!-- Add a nonabstract element to the _WellAssociate place.
       Remember that the BusAssoc.xsd file was 'included'.
       It contains a busAssocType -->
  <element name="WellAssociates" type="A:wellAssociatesType"
           substitutionGroup="A:_WellAssociate"/>
  <complexType name="wellAssociatesType">
    <complexContent>
      <extension base="A:abstractBasicType">
        <sequence>
          <element name="Operator" type="A:busAssocType"/>
          <element name="RoyaltyOwner" type="A:busAssocType"
                   minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <!-- The extWellPurposeEnum will be defined in another section.
       It is in the include file -->
</schema>

<schema targetNamespace="http://www.B.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns:B="http://www.B.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <import namespace="http://www.A.com"
          schemaLocation="poscWHead.xsd"/>
  <include schemaLocation="BAssociate.xsd"/>
  <complexType name="myOwnWellAssociatesType">
    <complexContent>
      <extension base="A:abstractBasicType">
        <sequence>
          <element name="Operator" type="B:BAssociateType"/>
          <element name="Driller" type="B:BAssociateType"
                   minOccurs="0"/>
          <element name="Owner" type="B:BAssociateType"
                   minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <element name="MyWellAssociates" type="B:myOwnWellAssociatesType"
           substitutionGroup="A:_WellAssociate"/>
  <!-- .. followed by schema that develops the application schema .. -->
</schema>
Note that this extends, not replaces, the A:WellAssociates container. In actuality, both can be present. However, the text profile would generally specify that only one should be used.
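An instance that uses the new container might look like the following sketch. The root and WellInfo elements are carried over from the instance example in A.1; the content of B:BAssociateType is not defined in this paper, so it is left as a comment.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<B:SomeRootElement
    xmlns:A="http://www.A.com"
    xmlns:B="http://www.B.com">
  <B:WellInfo id="wella">
    <A:WellName>
      <A:Name>420130013601</A:Name>
      <A:NamingSystem>API</A:NamingSystem>
    </A:WellName>
    <!-- B:MyWellAssociates substitutes for A:_WellAssociate -->
    <B:MyWellAssociates>
      <B:Operator>
        <!-- content as defined by B:BAssociateType -->
      </B:Operator>
      <B:Driller>
        <!-- optional; content as defined by B:BAssociateType -->
      </B:Driller>
    </B:MyWellAssociates>
    <A:WellPurpose>production</A:WellPurpose>
  </B:WellInfo>
</B:SomeRootElement>
```

The B prefix on the container and its children tells the reader which portion of the well header comes from the extension rather than from the imported module.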
The following shows the schema for an enumerated list (petType). It will allow the values, {dog, cat, bird} to be given. A sample XML instance would be:
<A:Pet>dog</A:Pet>
The schema shows the construct for a list that allows “extension by Other: ”. It is formed as a union of the petType and a pattern, ‘Other: xxxxx’, where the xxxxx can be any run of 2 or more word characters (the \w class, which excludes spaces and punctuation). The sample is extPetType, and a sample XML instance would be:
<A:Pet>Other: fish</A:Pet>
The schema shows a construct for a list that is refinable. An attribute, refinement, is added which allows the users to put in a refinement for a particular value. The sample is refinedPetType, and a sample XML instance would be:
<A:Pet refinement="irish setter">dog</A:Pet>
The schema shows a construct for a list that is both extendable by Other: and refinable. The sample is refinedExtPetType, and a sample XML instance would be:
<A:Pet refinement="hamster">Other: rodent</A:Pet>
The schema is as follows:
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <simpleType name="petType">
    <restriction base="string">
      <enumeration value="dog"/>
      <enumeration value="cat"/>
      <enumeration value="bird"/>
    </restriction>
  </simpleType>
  <simpleType name="otherNameType">
    <restriction base="string">
      <pattern value="Other: \w{2,}"/>
    </restriction>
  </simpleType>
  <simpleType name="extPetType">
    <union memberTypes="A:otherNameType A:petType"/>
  </simpleType>
  <complexType name="refinedPetType">
    <simpleContent>
      <extension base="A:petType">
        <attribute name="refinement" type="string" use="optional"/>
      </extension>
    </simpleContent>
  </complexType>
  <complexType name="refinedExtPetType">
    <simpleContent>
      <extension base="A:extPetType">
        <attribute name="refinement" type="string" use="optional"/>
      </extension>
    </simpleContent>
  </complexType>
</schema>
The original schema is defined with an abstract, global element of type string. Generally, a single list is also provided that is nonabstract.
Note that the abstract, global element is the indication that a substitution list may be used.
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- For completeness, embed the element in a simple structure -->
  <element name="PetRegistration" type="A:petRegistrType"/>
  <complexType name="petRegistrType">
    <sequence>
      <element name="Name" type="string"/>
      <element name="Age" type="A:QuantityType"/>
      <element ref="A:_Pet"/>
    </sequence>
  </complexType>
  <!-- Now define the abstract element that can be used for
       multiple lists -->
  <element name="_Pet" type="string" abstract="true"/>
  <!-- Put in a single list, and make an element of this type -->
  <simpleType name="petType">
    <restriction base="string">
      <enumeration value="dog"/>
      <enumeration value="cat"/>
      <enumeration value="bird"/>
    </restriction>
  </simpleType>
  <element name="Type" type="A:petType" substitutionGroup="A:_Pet"/>
</schema>
Here is a sample XML file:
<A:PetRegistration>
  <A:Name>Juniper</A:Name>
  <A:Age uom="yr">8</A:Age>
  <A:Type>cat</A:Type>
</A:PetRegistration>
Developer group B, which uses this module, wishes to replace the {dog, cat, bird} list with its own: {canine, feline, avian, rodent}. Here is a snippet of schema that would do that:
<schema targetNamespace="http://www.B.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns:B="http://www.B.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <import namespace="http://www.A.com"
          schemaLocation="petRegister.xsd"/>
  <simpleType name="myPetType">
    <restriction base="string">
      <enumeration value="canine"/>
      <enumeration value="feline"/>
      <enumeration value="avian"/>
      <enumeration value="rodent"/>
    </restriction>
  </simpleType>
  <element name="Type" type="B:myPetType" substitutionGroup="A:_Pet"/>
</schema>
A sample XML instance would be:
<A:PetRegistration xmlns:A="http://www.A.com"
                   xmlns:B="http://www.B.com">
  <A:Name>Juniper</A:Name>
  <A:Age uom="yr">8</A:Age>
  <B:Type>feline</B:Type>
</A:PetRegistration>
Note that ‘Type’ is preceded with the B: namespace, which allows the reader to know that an alternate list is being used.
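A receiving application can use the namespace of the substituted element to decide which list applies. The following Python sketch illustrates the idea with the standard library only. The lookup table KNOWN_LISTS and the function names are hypothetical, and the namespace declarations on the root element are assumed so that the instance parses on its own.

```python
import xml.etree.ElementTree as ET

NS_A = "http://www.A.com"
NS_B = "http://www.B.com"

INSTANCE = """
<A:PetRegistration xmlns:A="http://www.A.com" xmlns:B="http://www.B.com">
  <A:Name>Juniper</A:Name>
  <A:Age uom="yr">8</A:Age>
  <B:Type>feline</B:Type>
</A:PetRegistration>
"""

# Hypothetical lookup table: the list each namespace is known to supply.
KNOWN_LISTS = {
    NS_A: {"dog", "cat", "bird"},
    NS_B: {"canine", "feline", "avian", "rodent"},
}


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def read_pet(instance_text):
    """Return (namespace, value, known) for the element named Type."""
    root = ET.fromstring(instance_text)
    for child in root:
        namespace, local = split_tag(child.tag)
        if local == "Type":
            value = (child.text or "").strip()
            known = value in KNOWN_LISTS.get(namespace, set())
            return namespace, value, known
    raise ValueError("no Type element found")


print(read_pet(INSTANCE))
```

A reader that knows only namespace A would find no entry for the B list and could flag the value as coming from an unrecognized list rather than reject the whole document.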
Many simple types use unrestricted strings as their values. A profile will often want to restrict those values. There are two basic ways to do this, and they produce different XML instances. It is important that the profiling group understand both ways and the ramifications of each for interoperability.
The first way (replacement) maintains interoperability by leaving the element name the same and introducing no new attributes. The second way (substitution) alters the element by introducing the xsi:type attribute; it may not be usable in all cases.
Note that Appendix A5 covers the special case in which a global element is defined, for which a substitution group replacement may be made. The case considered here is one in which the element is not global, which means that a substitution group cannot be used.
We can alter the basic schema from Appendix A5 to be as follows:
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- For completeness, embed the element in a simple structure -->
  <element name="PetRegistration" type="A:petRegistrType"/>
  <complexType name="petRegistrType">
    <sequence>
      <element name="Name" type="string"/>
      <element name="Age" type="A:QuantityType"/>
      <element name="Pet" type="string"/>
    </sequence>
  </complexType>
</schema>
The goal is to restrict the values in Pet to an enumerated list (or pattern).
Method 1: Replacement
The profiling group copies the schema into a directory at some convenient place and physically changes the copy. For example, it could be changed to read:
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- For completeness, embed the element in a simple structure -->
  <element name="PetRegistration" type="A:petRegistrType"/>
  <complexType name="petRegistrType">
    <sequence>
      <element name="Name" type="string"/>
      <element name="Age" type="A:QuantityType"/>
      <element name="Pet" type="A:aRestrictedPetType"/>
    </sequence>
  </complexType>
  <simpleType name="aRestrictedPetType">
    <restriction base="string">
      <enumeration value="dog"/>
      <enumeration value="cat"/>
      <enumeration value="bird"/>
    </restriction>
  </simpleType>
</schema>
The advantage of this method is that any XML instance conforming to the profiled schema also conforms to the full schema, so the profile is exactly a subset of it. The disadvantage is that any change to the main schema (for example, a later version) requires these replacements to be repeated in the profiled version. Thus, the profiling group must keep records of its changes.
A variation of this replacement method is to perform the restriction in text only. Rather than change the schema, the profiling group will state in an implementation document that the values are restricted to {dog, cat, bird}.
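With a text-only restriction, a schema validator will still accept any string in Pet, so the check has to happen in the application. The following Python sketch shows one possible check using only the standard library. The function name, the sample instances, and the value "hamster" are illustrative assumptions; the allowed list is the {dog, cat, bird} set from the text.

```python
import xml.etree.ElementTree as ET

NS_A = "http://www.A.com"

# The list that the implementation document is assumed to state.
ALLOWED_PETS = {"dog", "cat", "bird"}

GOOD = """
<A:PetRegistration xmlns:A="http://www.A.com">
  <A:Name>Juniper</A:Name>
  <A:Age uom="yr">8</A:Age>
  <A:Pet>cat</A:Pet>
</A:PetRegistration>
"""

# Valid against the unrestricted schema, but outside the documented list.
BAD = GOOD.replace("<A:Pet>cat</A:Pet>", "<A:Pet>hamster</A:Pet>")


def values_outside_profile(instance_text):
    """Return the Pet values that are not in the documented list."""
    root = ET.fromstring(instance_text)
    rejected = []
    for pet in root.iter(f"{{{NS_A}}}Pet"):
        value = (pet.text or "").strip()
        if value not in ALLOWED_PETS:
            rejected.append(value)
    return rejected


print(values_outside_profile(GOOD))
print(values_outside_profile(BAD))
```

Because the restriction lives outside the schema, every application that relies on it needs a check of this kind, which is the price of leaving the schema untouched.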
Method 2: Type Substitution
The second method is to develop a simpleType as above and use the xsi:type attribute in the instance to do the substitution. Note that this can be done without altering the basic schema, and the new type may be defined in a different namespace. (This example puts it in the basic schema namespace.)
<schema targetNamespace="http://www.A.com"
        elementFormDefault="qualified"
        xmlns:A="http://www.A.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- For completeness, embed the element in a simple structure -->
  <element name="PetRegistration" type="A:petRegistrType"/>
  <complexType name="petRegistrType">
    <sequence>
      <element name="Name" type="string"/>
      <element name="Age" type="A:QuantityType"/>
      <element name="Pet" type="string"/>
    </sequence>
  </complexType>
  <simpleType name="OurPetType">
    <restriction base="string">
      <enumeration value="dog"/>
      <enumeration value="cat"/>
      <enumeration value="bird"/>
    </restriction>
  </simpleType>
</schema>
The XML instance will look like the following:
<A:PetRegistration xmlns:A="http://www.A.com"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <A:Name>Juniper</A:Name>
  <A:Age uom="yr">8</A:Age>
  <A:Pet xsi:type="A:OurPetType">cat</A:Pet>
</A:PetRegistration>
Note that the type being used appears in the XML instance as the value of the xsi:type attribute. Thus, the structure of the output is changed (slightly), and the instance is not strictly a subset of the main schema.
The rule for this kind of substitution is that the new type must be either the same as the original type or a restriction of it. In the example above, the original type is string, and OurPetType is a restriction of string.
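A consumer that wants to honor the substituted type first has to find the xsi:type attribute and resolve its prefix to a namespace. The following Python sketch shows one way to do that with the standard library. It assumes the element is named Pet, as in the schema, and that the type is written as the qualified name A:OurPetType. The function name is hypothetical, and prefix scoping is simplified to a single flat map.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

INSTANCE = """<A:PetRegistration xmlns:A="http://www.A.com"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <A:Name>Juniper</A:Name>
  <A:Age uom="yr">8</A:Age>
  <A:Pet xsi:type="A:OurPetType">cat</A:Pet>
</A:PetRegistration>"""


def declared_types(instance_text):
    """Return (element tag, type namespace, type local name) per xsi:type."""
    prefixes = {}  # prefix -> namespace URI, as declared so far
    found = []
    source = io.BytesIO(instance_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # Namespace declarations are reported before the element
            # that carries them.
            prefix, uri = payload
            prefixes[prefix] = uri
        else:
            qname = payload.get(XSI_TYPE)
            if qname is not None:
                prefix, _, local = qname.rpartition(":")
                found.append((payload.tag, prefixes.get(prefix), local))
    return found


print(declared_types(INSTANCE))
```

The sketch only locates and resolves the type name. Checking the value against the enumeration in OurPetType would still require a schema-aware validator or a lookup like the one a profiling group keeps for its own lists.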
Discussion:
POSC profiling rules accept either replacement or type substitution as a valid profile. We prefer the replacement method, since the XML instances are a direct subset. However, this puts a burden on the profiling group to keep track of its changes in order to apply them to later versions, if desired.
1 © 2004, Petrotechnical Open Standards Consortium, Inc. All rights reserved. All access, receipt, and/or use of this document is subject to the POSC Product Licensing Agreement posted on the POSC Web site at http://www.posc.org/about/license.shtml.
2 An example of a redefinition would be the following statement for the usage of the Name, NamingSystem, and Version elements of an Identifier. “When giving the identifier of a person, the Name shall be the surname, the NamingSystem shall be the first name, and the Version shall be the middle initial.” Such a statement is clearly a misuse of the original intention.
3 If more than one module is imported from the POSC namespace, they may take on the same namespace after import. Since only one file from a particular namespace can be imported, the application schema would need to combine the several files into a single one, either through an include or by physically combining them.
4 An example of a prohibited extension of the third type would occur in the WellName/Identifier/Name element. The structure should not be extended to carry an element, APINumber, which is covered by the Name, NamingSystem pair.
5 The guidelines given in this section are intended to enhance interoperability. Interoperability is served to the extent that standard modules, blocks, and assemblies maintain their structure. The prescriptive language is intended to prevent arbitrary and capricious changes from occurring.
6 Versions 1.0 and 2.0 of the POSC modules have target namespaces assigned throughout the schema. POSC is changing its versioning policy, and will follow the prescription mentioned above, which leaves the namespaces off the components, blocks, and assemblies.
2002-05-12 Modules, Profiles, and Application Schemas