Build a Module
John I. Bobbitt
POSC
License Agreement: © 2004, Petrotechnical Open Standards Consortium, Inc. All rights reserved. All access, receipt, and/or use of this document is subject to the POSC Product Licensing Agreement posted on the POSC Web site at http://www.posc.org/about/license.shtml.
Abstract: An XML module is an encapsulated business object, implemented in XML schema. If properly built, it can offer the flexibility needed to be useful in other applications. This paper will go through the steps of building a module for general use.
A business object, such as a well, a field, or a well log, can be implemented in XML schema to serve as a module for application schemas. The basic idea is to build a general module that can serve many purposes. A community of users that selects the module for its particular application will profile the module to fit its needs.
This document will go through a set of steps to build a BINSET module. In order to keep things simple for this example, the Binset will cover only the case of a 3D survey in which the bin nodes are regularly spaced in both directions, and the total binset is a rectangle.
In November 1999, UKOOA developed its P6/98 specification for the definition of a 3D Seismic Binning Grid. This publication will be used to specify the information content of the binset module.
A binset is a regular set of bin nodes arrayed in a rectangular pattern. A Binset is a geometric description of these bins. No attempt will be made to record properties of the bins (such as stack fold of a bin).
To describe this set, there will be two coordinate reference systems (CRS).
The first CRS, the map grid, will be a map projection system, referenced to a geodetic system. An example of such a system would be UTM 31N, based on WGS 84. The description of a binset would need a clear definition of this map grid.
The second CRS, the bin grid, will describe how the rectangular set of bin nodes is arrayed in the map system. Among the information to be captured to describe the grid is the range of I-axis points, and J-axis points (these are generally integers, and are often known as inline and crossline indexes). The spacing between these points should be given. In order to tie these index locations to the map grid, the direction of the axes, with respect to the map grid north, should be known. Also, one point (the bin grid origin) must be given in the map grid coordinates. The detailed attributes to handle this description will be given later.
Both grids carry distortions caused by the curvature of the earth. One example is the map scale factor of the projection: at the center of the projection, a metre is shorter than the ground distance by a factor of 0.9996 (a parameter of the projection), and this value changes as you move away from the projection center. P6/98 discusses this distortion and gives a single parameter for it.
Since this is a module, it will have an identifier. The identifier will probably be related to the 3D survey that gathered the data, but it does not have to be. Note that a binset can result from the mixing of data from more than one survey (2D and 3D).
In order to tie it to the acquisition, the binset will be related to one or more acquisition surveys. An acquisition survey is itself a module, and we will need to specify how the binset relates to this other module.
No details of the acquisition will be given. That is, the survey company, dates, etc. will not be part of the binset, because that information is contained in the acquisition survey module and should not be repeated here.
There will be a portion that describes the map grid, and another portion that defines the bin grid.
The relationship between the two grids will be given in the module. This is really a set of information that defines the transformation between the two.
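As a sketch of what this transformation involves, assume (these are assumptions for illustration, not definitions taken from this paper) that the grid bearing $\theta$ is the bearing of the bin grid J axis measured clockwise from map grid north, that the I axis lies 90° clockwise from the J axis, and that $k$ is the scale factor given at the scale reference point. With the bin grid origin $(I_0, J_0)$ located at map grid position $(E_0, N_0)$, nominal bin widths $w_I$, $w_J$, and bin node increments $\Delta_I$, $\Delta_J$, a bin node $(I, J)$ would map to:

```latex
\begin{aligned}
E &= E_0 + k\,\frac{w_I}{\Delta_I}\,(I - I_0)\cos\theta
         + k\,\frac{w_J}{\Delta_J}\,(J - J_0)\sin\theta \\
N &= N_0 - k\,\frac{w_I}{\Delta_I}\,(I - I_0)\sin\theta
         + k\,\frac{w_J}{\Delta_J}\,(J - J_0)\cos\theta
\end{aligned}
```

If the I axis instead lies 90° counter-clockwise from the J axis, the sign of both I terms flips. The module only has to carry the parameters; P6/98 remains the authority for the exact formulas.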
There are three main areas where module development goes astray. When developing the view of the business object, be sure that you remember these cautions.
It is often the case that a module is developed to support some kind of event about the business object. For example, the scenario may be the submission of a completion report for a well. You must be careful that you keep the well information separate from the completion information. You should always ask yourself questions like, “Is the well completion date really a part of the basic well information?” Separate the event or report information from the business object.
Information creep is another problem. As more people get involved with the development, they will start putting in more and more information as part of the module. For example, you can question the inclusion of the acquisition survey in the binset. Another item that some wanted to add was the ownership of the binset data. Another group wanted to put a security class on each item of data.
Don't redo modules within modules. In P6/98, there is a full set of information about the geodetic datum, including the ellipsoid parameters and the prime meridian. This information, while important to the report, is not part of the binset module. Rather than include all these details, we have defined the geodetic datum to be another module, and have separated it out as such. Keep the modular bundles separate.
This is not a step. These are simply some issues dealing with schema.
For simplicity, the schema tags will have a null namespace, and the types, etc. that are developed will all be from the same POSC namespace and will be prefixed by p:. The main schema element will not be shown, nor will the includes and imports. Thus, only simple schema snippets will be given.
An XML module is a complex type. The element that uses the type will have a semantic meaning attached to the module. For example, we can define a business associate module by defining a businessAssociateType. Then we can have
<element name="Owner" type="p:businessAssociateType"/>
<element name="Operator" type="p:businessAssociateType"/>
<element name="ServiceCompany" type="p:businessAssociateType"/>
Note that all three of the above are the business associate module. They differ by the semantic meaning given to the element name.
When designing a module, you must decide what is optional and what is mandatory. In addition, you need to consider multiplicity. The safest design is to make every element have minOccurs="0" and maxOccurs="unbounded". This allows the profile to restrict these elements to whatever is appropriate. However, certain things (such as the Identifier) may be made mandatory, with only one occurrence, because of the semantics of an Identifier.
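A minimal sketch of these two choices, using the null-namespace schema tags and the p: prefix described above. The type name exampleModuleType and the element Remark are hypothetical and serve only as illustration:

```xml
<complexType name="exampleModuleType">
  <sequence>
    <!-- Mandatory, exactly one: minOccurs and maxOccurs both default to 1 -->
    <element name="Identifier" type="p:identifierType"/>
    <!-- Fully open: a profile may later restrict this to 0..1, 1..1, and so on -->
    <element name="Remark" type="string"
             minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
</complexType>
```

A profile can only narrow what the module allows, which is why the open setting is the safer default for anything that is not mandatory by its nature.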
Another issue is extensions and replacements. This is true of both elements (which may be substituted for), and enumerated lists. There are best practice schema constructs that allow various means of replacements and extensions, but they must be planned for in the design. For this paper, it will only be noted where these should occur.
Here are some examples of extensions and replacements.
A schema has PostalCode which is an unrestricted string. We want to allow a replacement with postal code patterns that are specific to a country. For example, USZipCode, which would be a pattern of 5 or 9 digits.
An enumerated list has a set of values. We want to give the writer of the XML file a way to add additional values.
An enumerated list has a set of values. We want to allow the schema developer to replace these with another set of values.
We have a “relationship” to another module. We want the schema developer to be able to choose which module he would like to use.
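The first two cases above could be sketched as follows. This shows one possible set of constructs (a substitution group and a union type), not necessarily the best practice the referenced papers recommend. The type names, the ZIP code pattern, and the enumeration values are assumptions made for illustration:

```xml
<!-- Replacement: PostalCode is the head of a substitution group,
     so a country-specific element can stand in for it. -->
<element name="PostalCode" type="string"/>
<element name="USZipCode" type="p:usZipCodeType"
         substitutionGroup="p:PostalCode"/>
<simpleType name="usZipCodeType">
  <restriction base="string">
    <!-- 5 digits, or 9 digits written as 5+4 with a hyphen (assumed format) -->
    <pattern value="[0-9]{5}(-[0-9]{4})?"/>
  </restriction>
</simpleType>

<!-- Extension by the XML writer: the union accepts either a listed
     value or any other string. -->
<simpleType name="knownEnvironmentType">
  <restriction base="string">
    <enumeration value="onshore"/>
    <enumeration value="offshore"/>
  </restriction>
</simpleType>
<simpleType name="environmentType">
  <union memberTypes="p:knownEnvironmentType string"/>
</simpleType>
```

The union trades validation strength for flexibility: a misspelled value is accepted as a new value rather than rejected, so this construct only suits lists that are meant to be open.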
A final issue is to decide what is strictly the name of something, and what is (possibly) a reference to another business object. For example, is the element, GeodeticDatum, intended to be a name only? Or might it be a reference to somewhere where we can get a full description of the geodetic datum?
The second step for designing the module is to give an XML instance (or several examples). Here is an example:
<Binset id="binset1">
  <Identifier>
    <Name>Jungular JeeCo2002</Name>
    <NamingSystem>Company Internal</NamingSystem>
  </Identifier>
  <!-- also have an Alias that allows alternate names -->
  <GeodeticDatum>ED87</GeodeticDatum>
  <MapProjection>UTM 31N</MapProjection>
  <BinGrid>
    <BinGridOrigin>
      <BinGridCoordinates>
        <I>100</I>
        <J>100</J>
      </BinGridCoordinates>
      <MapGridCoordinates>
        <Easting uom="m">12345</Easting>
        <Northing uom="m">987654</Northing>
      </MapGridCoordinates>
    </BinGridOrigin>
    <ScaleReference refI="100" refJ="100">1.00</ScaleReference>
    <NominalBinWidth axis="I" uom="m">12.5</NominalBinWidth>
    <NominalBinWidth axis="J" uom="m">22.0</NominalBinWidth>
    <GridBearing uom="deg">-17.3</GridBearing>
    <RotationToIAxis>clockwise</RotationToIAxis>
    <BinNodeIncrement axis="I">1</BinNodeIncrement>
    <BinNodeIncrement axis="J">1</BinNodeIncrement>
    <CheckPoint>
      <BinNode>
        <I>120</I>
        <J>100</J>
      </BinNode>
      <MapGridLocation>
        <Easting uom="m">12300</Easting>
        <Northing uom="m">987800</Northing>
      </MapGridLocation>
    </CheckPoint>
    <!-- repeat for other check points -->
  </BinGrid>
  <Comment>This is an example of a bin grid for design</Comment>
</Binset>
The main purpose of the example is to see what an instance will look like. This gives the business group a way to analyze the module, to see what is missing, to see what is awkward, etc. It also highlights choices that can be made. For example, instead of NominalBinWidth with an attribute, axis, they may prefer to see those two lines as
<IDirectionNominalWidth uom="m">12.5</IDirectionNominalWidth>
<JDirectionNominalWidth uom="m">22.0</JDirectionNominalWidth>
Having the example in front of them makes it easier to see the choices.
Here is a schema that supports the above XML instance example. This step should not be done until there is reasonable agreement about the XML instance example (section 4 above). This will be an initial schema.
<complexType name="binsetType">
  <sequence>
    <element name="Identifier" type="p:identifierType"/>
    <element name="Alias" type="p:identifierType"
             minOccurs="0" maxOccurs="unbounded"/>
    <element name="GeodeticDatum" type="string"/>
    <element name="MapProjection" type="string"/>
    <element name="BinGrid" type="p:binGridType"/>
    <element name="Comment" type="string"
             minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
  <attribute name="id" type="string" use="required"/>
</complexType>
<complexType name="binGridType">
  <sequence>
    <element name="BinGridOrigin" type="p:binGridOriginType"/>
    <element name="ScaleReference" type="p:scaleRefType" minOccurs="0"/>
    <element name="NominalBinWidth" type="p:nomBinWidthType"
             minOccurs="0" maxOccurs="unbounded"/>
    <element name="GridBearing" type="p:quantityType"/>
    <element name="RotationToIAxis" type="p:rotationType" minOccurs="0"/>
    <element name="BinNodeIncrement" type="p:incrementType" maxOccurs="unbounded"/>
    <element name="CheckPoint" type="p:checkPointType"
             minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
</complexType>
<complexType name="binGridOriginType">
  <sequence>
    <element name="BinGridCoordinates" type="p:binGridCoordType"/>
    <element name="MapGridCoordinates" type="p:mapGridCoordType"/>
  </sequence>
</complexType>
<complexType name="checkPointType">
  <sequence>
    <element name="BinNode" type="p:binGridCoordType"/>
    <element name="MapGridLocation" type="p:mapGridCoordType"/>
  </sequence>
</complexType>
<complexType name="mapGridCoordType">
  <sequence>
    <element name="Easting" type="p:quantityType"/>
    <element name="Northing" type="p:quantityType"/>
  </sequence>
</complexType>
<!-- and some additional types, namely:
       scaleRefType      decimal value with two attributes
       nomBinWidthType   decimal value with two attributes
       rotationType      select from an enumerated list
       incrementType     decimal value with one attribute
       binGridCoordType  two subelements, I and J (like mapGridCoordType)
     already built in our component library are:
       identifierType
       quantityType
-->
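The additional types listed in the comment above might be sketched as follows. The attribute names (refI, refJ, axis, uom) come from the XML instance example, and the rotation values from the list {clockwise, counter clockwise}. The helper type axisType, the use of integer for I, J, refI and refJ, and the choice of which attributes are required are assumptions made for this sketch:

```xml
<!-- hypothetical helper: the two bin grid axes -->
<simpleType name="axisType">
  <restriction base="string">
    <enumeration value="I"/>
    <enumeration value="J"/>
  </restriction>
</simpleType>

<!-- decimal value with two attributes: refI, refJ -->
<complexType name="scaleRefType">
  <simpleContent>
    <extension base="decimal">
      <attribute name="refI" type="integer"/>
      <attribute name="refJ" type="integer"/>
    </extension>
  </simpleContent>
</complexType>

<!-- decimal value with two attributes: axis, uom -->
<complexType name="nomBinWidthType">
  <simpleContent>
    <extension base="decimal">
      <attribute name="axis" type="p:axisType" use="required"/>
      <attribute name="uom" type="string"/>
    </extension>
  </simpleContent>
</complexType>

<!-- select from an enumerated list -->
<simpleType name="rotationType">
  <restriction base="string">
    <enumeration value="clockwise"/>
    <enumeration value="counter clockwise"/>
  </restriction>
</simpleType>

<!-- decimal value with one attribute: axis -->
<complexType name="incrementType">
  <simpleContent>
    <extension base="decimal">
      <attribute name="axis" type="p:axisType" use="required"/>
    </extension>
  </simpleContent>
</complexType>

<!-- two subelements, I and J -->
<complexType name="binGridCoordType">
  <sequence>
    <element name="I" type="integer"/>
    <element name="J" type="integer"/>
  </sequence>
</complexType>
```

Since bin grid indexes are described as generally, not always, integers, a general-purpose module might prefer decimal for I and J and leave the integer restriction to a profile.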
It should be noted that changes can be made back and forth between the XML example, and the schema.
At the end of step 3, you should have an XML example that puts the data where you would like to have it, and a schema that supports the XML.
The next step is to modularize the schema. If we assume that the iterations have been made between the schema and the XML, then the schema in section 5 will support the XML. There are steps to take with the schema that will not change the XML, but will allow extensions and restrictions to be implemented.
Note: All comments will relate to the XML schema and XML instance example in sections 4 and 5.
All multiplicities should be checked (the minOccurs and maxOccurs). The multiplicities should be set for general use, and not just for the specific, first-designed use.
For example, you might consider making the MapProjection optional (set minOccurs="0") so that a group could give coordinates in geographic, rather than map projection coordinates. In that case, you would also need to change the Easting, Northing to a choice between (Easting, Northing) or (Latitude, Longitude).
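One way this change could look is sketched below. The type name positionType is hypothetical, and typing Latitude and Longitude as p:quantityType is an assumption:

```xml
<!-- inside the sequence of binsetType: MapProjection becomes optional -->
<element name="MapProjection" type="string" minOccurs="0"/>

<!-- a coordinate pair given either in map grid or in geographic terms -->
<complexType name="positionType">
  <choice>
    <!-- map projection coordinates -->
    <sequence>
      <element name="Easting" type="p:quantityType"/>
      <element name="Northing" type="p:quantityType"/>
    </sequence>
    <!-- geographic coordinates -->
    <sequence>
      <element name="Latitude" type="p:quantityType"/>
      <element name="Longitude" type="p:quantityType"/>
    </sequence>
  </choice>
</complexType>
```

Existing instances that use Easting and Northing remain valid under this type, so the change widens the module without breaking current users.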
Another example would be to make the CheckPoint optional. P6/98 requires check points (it asks for three), but a general use may not require them. Hence, give the CheckPoint element minOccurs="0". User groups that want to mimic P6/98 can profile it to make CheckPoint mandatory.
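As a sketch of such a profile, a user group could derive a type from binGridType by restriction. Derivation by restriction is only one possible profiling mechanism, and the type name p698BinGridType is hypothetical. In XML Schema, a restriction must restate the whole content model, so every element reappears even though only CheckPoint changes:

```xml
<complexType name="p698BinGridType">
  <complexContent>
    <restriction base="p:binGridType">
      <sequence>
        <element name="BinGridOrigin" type="p:binGridOriginType"/>
        <element name="ScaleReference" type="p:scaleRefType" minOccurs="0"/>
        <element name="NominalBinWidth" type="p:nomBinWidthType"
                 minOccurs="0" maxOccurs="unbounded"/>
        <element name="GridBearing" type="p:quantityType"/>
        <element name="RotationToIAxis" type="p:rotationType" minOccurs="0"/>
        <element name="BinNodeIncrement" type="p:incrementType"
                 maxOccurs="unbounded"/>
        <!-- the only change: at least three check points are now required -->
        <element name="CheckPoint" type="p:checkPointType"
                 minOccurs="3" maxOccurs="unbounded"/>
      </sequence>
    </restriction>
  </complexContent>
</complexType>
```

Every instance that is valid against the profile is also valid against the general module, which is the property that makes an optional CheckPoint the right choice in the module itself.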
In general, you are looking at the schema from a wider viewpoint than the usage scenario for which it was developed.
In the example above, only the RotationToIAxis will come from an enumerated list: {clockwise, counter clockwise}. Do we want any of the other values to be restricted to a list?
There are other questions, also. Assume, for example, that we choose to restrict the values that go into the GeodeticDatum element. We could list the permitted datums. We could give a pattern, if we wanted to restrict the values to the EPSG codes.
If we do wish to restrict some of these values, do we want to allow additional values? Should we extend the lists we have already defined? Do we want to allow the schema developers and/or the XML instance writers to extend the lists?
If we do not restrict the values, do we want to allow schema developers to insert an enumerated list when they profile it? For example, for use in the UK North Sea, the application developers may want to profile the GeodeticDatum to be restricted to {ED50, ED87, WGS 84}.
There are ways to allow, or prohibit, all of these when developing the schema - but we must plan for them and build them in.
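For instance, the two kinds of restriction mentioned for GeodeticDatum could be sketched as simple types. The type names are hypothetical, and the pattern assumes that codes are written as the letters "epsg" followed by digits:

```xml
<!-- a profile-supplied list of permitted datums (UK North Sea example) -->
<simpleType name="northSeaDatumType">
  <restriction base="string">
    <enumeration value="ED50"/>
    <enumeration value="ED87"/>
    <enumeration value="WGS 84"/>
  </restriction>
</simpleType>

<!-- alternatively, a pattern that admits only code-like values (assumed form) -->
<simpleType name="epsgCodeType">
  <restriction base="string">
    <pattern value="epsg[0-9]+"/>
  </restriction>
</simpleType>
```

Both types restrict string, so either could replace the uncontrolled string type of GeodeticDatum in a profile without invalidating instances that already use the listed values.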
There are two examples in the above of elements which could refer to other objects, rather than simply giving their names. They are the GeodeticDatum and the MapProjection. As developed, they now are simply a name (uncontrolled string value).
Here is how we could change the schema so that we would have the option of giving an XML instance as follows:
XML instance (the href leads to an instance of a geodetic datum module):
<GeodeticDatum href="epsg4236">ED50</GeodeticDatum>
XML schema to support the above:
<element name="GeodeticDatum" type="p:nameAndRefType"/>
<complexType name="nameAndRefType">
  <simpleContent>
    <extension base="string">
      <attribute name="href" type="anyURI"/>
    </extension>
  </simpleContent>
</complexType>
Note that P6/98 gives details about the geodetic datum and the map projection in that they give all the parameters defining these objects. We have pushed that set of information out of the Binset, by having an element that references geodetic modules. Hence, we give a way to record all the information, without specifically putting it in the Binset schema itself.
Are there other elements we should put in? For example, we might want to have a reference to zero or more acquisition surveys so that the parentage, as well as some who, when, etc. information can be recorded. This could be done by explicitly adding an additional, optional element to the binsetType.
After the Alias element, add
<element name="AcquisitionSurvey" type="p:nameAndRefType"
         minOccurs="0" maxOccurs="unbounded"/>
There is another kind of “additional element” that should be considered. We might decide to add a “hook” so that user communities can add additional elements that they themselves design. Adding this hook removes the requirement that we, the module developers, add all the possibilities ourselves.
For example, we would add a hook that would occur just before the Comment. This would allow anyone to add anything they want at that point. Here is how the XML instance would look (the last bit of it, at least). Note that I am adding a namespace qualifier to what is being added, because it is an extension to the module built by another group.
. . .
  </BinGrid>
  <!-- here is where we add whatever in -->
  <z:BinsetProperties>
    <z:BinsetEnvironment>offshore</z:BinsetEnvironment>
    <z:Processor href="JoeBleaux"/>
    <z:IsSingleSurvey>true</z:IsSingleSurvey>
  </z:BinsetProperties>
  <Comment>This is an example of a bin grid for design</Comment>
</Binset>
There are straightforward ways to allow this. See other references for some of the methods.
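One common construct for such a hook is the XML Schema wildcard, sketched here on binsetType. This is offered as an illustration under that assumption and is not necessarily the method the referenced papers recommend:

```xml
<complexType name="binsetType">
  <sequence>
    <element name="Identifier" type="p:identifierType"/>
    <element name="Alias" type="p:identifierType"
             minOccurs="0" maxOccurs="unbounded"/>
    <element name="GeodeticDatum" type="string"/>
    <element name="MapProjection" type="string"/>
    <element name="BinGrid" type="p:binGridType"/>
    <!-- the hook: any number of elements from any other namespace -->
    <any namespace="##other" processContents="lax"
         minOccurs="0" maxOccurs="unbounded"/>
    <element name="Comment" type="string"
             minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
  <attribute name="id" type="string" use="required"/>
</complexType>
```

Limiting the wildcard to ##other keeps the added elements in the namespace of the group that designed them, which matches the z: qualifier in the instance. With processContents="lax", the added content is validated only when a schema for that namespace is available.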
Comment: Two papers referenced below deal with profiles: Policies on Modules [ModulePolicies] and Modules, Profiles, and Application Schemas [ProfilesAppSchema].
This is not a paper on developing profiles. Since a module is generally developed with a particular application in mind, it is generally easy to define a profile for this use. As soon as another community of users tries to use it, however, a new set of needs crops up that was not considered by the initial group.
The profiling step offers feedback to the module owner that is useful for updating the module. For this reason, I include it as part of the module development process.
Some of the updates may be changes that do not affect the existing XML (for example, change the GeodeticDatum element to allow a profile that limits the values to an enumerated list). Other changes will be extensions (for example, you may want to add in the P6/98 parts that define the binset extent). Other changes may be so extensive that they change the structure of the module (for example, change the Identifier so that it is a single element with attributes).
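To make the last kind of change concrete, here is the Identifier from the instance example next to a restructured form. The attribute name namingSystem is hypothetical:

```xml
<!-- current structure -->
<Identifier>
  <Name>Jungular JeeCo2002</Name>
  <NamingSystem>Company Internal</NamingSystem>
</Identifier>

<!-- restructured form: instances written in the current structure
     would no longer validate against the changed module -->
<Identifier namingSystem="Company Internal">Jungular JeeCo2002</Identifier>
```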
All of these changes need to be considered in terms of backward compatibility, versioning, and the possibility of developing a new, alternate, module.
[ANSIX12] X12 Reference Model for XML Design, 2002-10, produced by the ANSI X12 committee, obtainable at http://www.x12.org/x12org/.
[BestPractices] Best Practices Homepage, developed and maintained by XML-dev and Mitre, obtainable at http://www.xfront.com/BestPracticesHomepage.html.
[ComProServ] PIDX XML Standards Master, Version 1.0, RP 3901, produced by PIDX, obtainable at http://committees.api.org/business/pidx/standards.htm.
[EBCCNAM] ebXML RT - Naming Convention for Core Component, 2001-05-10, produced by the ebXML group, obtainable at http://www.ebxml.org/specs/index.htm#technical_reports.
[ebTechArch] ebXML Technical Architecture Specification V1.0.4, 2001-02-16, produced by the ebXML group, obtainable at http://www.ebxml.org/specs/ebTA.pdf.
[FedDevGuide] Draft Federal XML Developer's Guide, 2002-04 (work in progress), produced by the Federal CIO Council, obtainable at http://xml.gov/documents/in_progress/developersguide.pdf.
[FedTagStds] Federal Tag Standards for Extensible Markup Language, 2001-06, produced by LMI, not obtainable from the internet.
[HKGuide] XML Schema Design and Management Guide (4 parts), draft versions dated summer 2003, produced by Hong Kong Information Services Technology Division, obtainable at http://www.itsd.gov.hk/itsd/english/infra/eif.htm.
[IETFKeywords] Key Words for Use in RFCs to Indicate Requirement Level, 1997-03, obtainable at http://www.ietf.org/rfc/rfc2119.txt.
[ISO8601] International Standard Date and Time, 2001-11-10, produced by ISO. A web page that explains the formats is http://www.cl.cam.ac.uk/~mgk25/iso-time.html.
[ISO11179] ISO 11179 Part 5 - Naming and Identification, 1995-12, produced by ISO, obtainable at http://fdr.faa.gov/iso/ISO11179page.htm. There is a later version that is available from the ISO website.
[UKGuide] e-Government Schema Guidelines for XML, 2002-12, produced by United Kingdom e-Envoy, obtainable at http://www.e-envoy.gov.uk/Resources/Guidelines/fs/en.
[Unicode] Unicode Charts, available at http://www.unicode.org/charts/.
[W3CSchemaDatatypes] W3C Schema Datatypes, 2001-05-02, produced by W3C, obtainable at http://www.w3.org/TR/xmlschema-2.
[W3CNamespaces] Namespaces in XML, 1999-01-14, produced by W3C, obtainable at http://www.w3.org/TR/REC-xml-names/.
[W3CSchemaPrimer] W3C Schema Primer, 2001-05-02, produced by W3C, obtainable at http://www.w3.org/TR/xmlschema-0.
[W3CSchemaStructures] W3C Schema Structures, 2001-05-02, produced by W3C, obtainable at http://www.w3.org/TR/xmlschema-1.
[Xlink] W3C XLink Specification, 2001-06, produced by W3C, obtainable at http://www.w3.org/TR/xlink/.
[Xpath] W3C XPath Specification, 1999, produced by W3C, obtainable at http://www.w3.org/TR/xpath/.
[XSL] W3C XSL and XSLT Specifications, produced by W3C, obtainable at http://www.w3.org/Style/XSL/.
POSC references are available in the following formats:
[html] html format readable by browsers
[doc] MS Word 97/2000/XP
[sxw] OpenOffice writer, v1.0
[IntroModule] Introduction to Modules, Copyright 2002-2003. Available in [html], [doc], [sxw].
[BuildModule] Build a Module - a tutorial. Copyright 2003. Available in [html], [doc], [sxw].
[ImportModule] Importing Modules within your Modules. Copyright 2003. Available in [html], [doc], [sxw].
[Guidelines] Guidelines for XML Schemas, Version 2003. Copyright 2003. Available in [html], [doc], [sxw].
[ModulePolicies] Policies on Modules. Copyright 2002-2003. Available in [html], [doc], [sxw].
[ProfilesAppSchema] Modules, Profiles, and Application Schemas. Copyright 2002-2003. Available in [html], [doc], [sxw].
[XMLTables] XML Tables. Copyright 2003. Available in [html], [doc], [sxw].
[ReferenceData] Reference Data and Enumerated Lists Implemented in XML. Copyright 2002-2003. Available in [html], [doc], [sxw].
[Dictionaries] Examples of XML Dictionary Usage. Copyright 2003. Available in [html], [doc], [sxw]. Accompanied by sample code.
[Relationships] Relationships in XML. Copyright 2003. Available in [html], [doc], [sxw].
[UOMRecs] Unit of Measure Recommendations. Copyright 2002-2003. Available in [html], [doc], [sxw].
2003-07-22 Build A Module