For more up-to-date information on XML projects, go to the XML Modules area.
One of the major contributions that POSC can make to the oil and gas industry is to define standard objects for the transfer of data and information. When XML became a standard means of transfer, POSC began studying how to implement such a process. The limitations of the DTD restricted how standard objects could be specified. XML Schema allows these objects to be specified in such a way that they may be reused in standard exchange documents.
Go to the W3C page for information on XML Schema.
In developing these standard objects, it became apparent that there should also be consistent ways to develop the objects. These ways are grouped under the general heading of patterns, although it is difficult to separate the developed objects from the patterns. By combining these patterns with the objects that are being developed, it is becoming possible to more easily generate standardized exchange sets - even though these standardized sets are built for a particular use. In effect, we are finding it common to standardize on the pieces rather than on the whole.
The process of using pre-built modules involves collecting them from one or more web sites for use within a new application schema. When doing this, the schema developer jumps headlong into the issue of schema namespaces. The Namespace tutorial document will give some of the requirements necessary to use the POSC schemas.
Although somewhat of a side issue, it may be interesting to review the XML Schema: Best Practices. Much of the schema development discussed in that web site has been practiced here. Please be warned that this site is probably far beyond anyone's need to know.
There are three goals that are driving this project:
Note: Owing to substantial work since January 2001, the section on units of measure has been significantly revised.
Further Note: The Units of Measure dictionaries have been instantiated. The latest dictionary, effective on 2005-01-01, is found at http://www.posc.org/refs/poscUnits20.xml.
Units of Measure are a common problem in all exchange data sets. Many approaches are in use for specifying the units for a particular quantity. POSC, in conjunction with several other organizations, has developed a particular method for handling units that works in a very general sense. It is recommended that all exchange sets conform to this pattern. If exchange set developers conform to this pattern, it will be easier for applications to handle the units in the data they receive.
Several documents and web sites have been developed in the past 5 months that describe the problem, propose solutions, and apply the solutions to particular problems. Although the text below can give an indication of the problems and solutions, you should go to the referenced documents to get a detailed picture of the units of measure problems and solutions.
The initial effort was to put forth the problem. The "Problems" document outlined several methods in which units of measure were being handled in XML sets. These were all taken from actual examples.
Several organizations became involved in an email discussion on best practices. From these discussions came the "Recommendation" document, which details several recommendations and patterns which will allow interoperability when dealing with units of measure.
The document was worked on actively, and received several revisions. In order that active workers could keep track of the changes being made, an earlier version of the recommendations document is available.
Supporting the recommendation is the XML Schema. Also, a part of the recommendation is that a units dictionary be developed. To support the recommendation, a sample XML file with an XSL stylesheet was developed. The XSL demonstrates how to access the information in the dictionary files.
Finally, it is useful to see a schema that incorporates the units of measure pattern. See the CSIRO web site.
A list of ways that units are handled is:
The problems document gives the advantages and disadvantages of each method. It is this last method that is used in POSC.
The basic pattern is as follows:
Here is a sample of how this is done. The sample will define the metre very simply, and then will define the US Survey foot by giving the conversion values to a metre.
<UnitsDefinition>
  <UnitOfMeasure uid="m"/>
  <UnitOfMeasure uid="ft" acronym="US ft">US Survey foot
    <ConversionToBaseUnit baseUnit="#m">
      <numerator>12.</numerator>
      <denominator>39.37</denominator>
    </ConversionToBaseUnit>
  </UnitOfMeasure>
</UnitsDefinition>
Since the units are defined, they can be referenced. In this case, only the "m" and the "ft" can be used to reference units. The sample below shows the referencing:
<Ellipsoid flatteningDefinitive="no">
  <identifier>Clarke 1858</identifier>
  <semiMajorAxis uom="#ft">20926348</semiMajorAxis>
  <semiMinorAxis uom="#ft">20855233</semiMinorAxis>
</Ellipsoid>
<Ellipsoid>
  <identifier>Bessel Namibia</identifier>
  <semiMajorAxis uom="#m">6377483.865</semiMajorAxis>
  <inverseFlattening>299.1528128</inverseFlattening>
</Ellipsoid>
Note that the data itself is easy to read. Furthermore, the meanings of the units m and ft are clearly defined in the file. POSC recommends this pattern for handling units.
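As an illustration of how an application might consume this pattern, here is a minimal Python sketch. It makes several assumptions that are not part of the recommendation: the two samples are wrapped in a hypothetical <Sample> root element, a unit with no ConversionToBaseUnit is treated as its own base unit, and a value is converted by multiplying it by numerator/denominator. The function names load_units and to_base are invented for this example.

```python
import xml.etree.ElementTree as ET

# The two samples from the text, wrapped in a hypothetical <Sample> root
# so that they form a single well-formed document.
SAMPLE = """<Sample>
  <UnitsDefinition>
    <UnitOfMeasure uid="m"/>
    <UnitOfMeasure uid="ft" acronym="US ft">US Survey foot
      <ConversionToBaseUnit baseUnit="#m">
        <numerator>12.</numerator>
        <denominator>39.37</denominator>
      </ConversionToBaseUnit>
    </UnitOfMeasure>
  </UnitsDefinition>
  <Ellipsoid flatteningDefinitive="no">
    <identifier>Clarke 1858</identifier>
    <semiMajorAxis uom="#ft">20926348</semiMajorAxis>
    <semiMinorAxis uom="#ft">20855233</semiMinorAxis>
  </Ellipsoid>
</Sample>"""


def load_units(root):
    """Map each unit uid to (base_uid, factor); value_in_base = value * factor."""
    units = {}
    for uom in root.iter("UnitOfMeasure"):
        uid = uom.get("uid")
        conv = uom.find("ConversionToBaseUnit")
        if conv is None:
            # Assumption: a unit without a conversion is its own base unit.
            units[uid] = (uid, 1.0)
        else:
            base = conv.get("baseUnit").lstrip("#")
            numerator = float(conv.findtext("numerator"))
            denominator = float(conv.findtext("denominator"))
            units[uid] = (base, numerator / denominator)
    return units


def to_base(element, units):
    """Convert an element carrying a uom="#uid" reference to its base unit."""
    uid = element.get("uom").lstrip("#")
    base, factor = units[uid]
    return float(element.text) * factor, base


root = ET.fromstring(SAMPLE)
units = load_units(root)
value, unit = to_base(root.find("Ellipsoid/semiMajorAxis"), units)
print(f"{value:.3f} {unit}")
```

Because the definitions travel in the same file as the data, the reader needs no built-in knowledge of what "ft" means. It only has to follow the uom reference to the matching uid.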
The XML Schema, technical comments on it, examples, etc. are given in the Unit of Measure Recommendations paper referenced earlier. Examples of its use are also given in the other documents referenced above.
When developing schemas, some datatypes show up often enough that they can be abstracted out and used in other schemas. Following is a list of such datatypes, with links to more detailed descriptions:
In addition, there are types that have been defined to support the Epicentre mapping. These will probably only be useful in the context of transferring Epicentre datatypes. They are listed here:
The goal of the PEF XML project was to develop a method for mapping Epicentre into an XML exchange document. That goal was accomplished in part by mapping each of the Epicentre data types into XML. The document, Exchange Format, details this mapping.
Because of the methodology of this mapping, it is also possible to apply it to other data models. Any data model that uses the Epicentre data types (or a subset of the Epicentre data types) can use the information in this document to form an XML file.
Both of these uses are mentioned in the Exchange Format document. This note goes a step beyond that to discuss the use of the data types themselves.
It is possible to use the Epicentre data types independently of any data model. For example, the timestamp, the date, the quantity, the complex number, etc. structures may be useful outside of any data model that specifically uses these data types.
Consider, for example, the tag, "spudDate." How should a spud date be represented in XML? One possibility is to make it a parsed date, as was done in the data types schema:
<element name="spudDate" type="parsedDate"/>
which would lead to XML such as
<spudDate>
  <year>1999</year>
  <month>6</month>
  <day>24</day>
</spudDate>
Use of this predefined structure not only makes the schema document easier to develop and understand, but it also increases the interoperability. If all applications that needed a parsed date structure were to use this data type, the structure would be well-defined and well-understood. Applications that understand this structure could then be reused with other XML documents.
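To show the kind of reuse this makes possible, here is a minimal Python sketch of a reader for the parsed date structure. It assumes the child elements are named year, month, and day as in the sample above, and that month and day may be omitted (the parsedDate type is described as also allowing year and month, or year only). The function name parsed_date_to_iso is hypothetical.

```python
import xml.etree.ElementTree as ET


def parsed_date_to_iso(element):
    """Return YYYY, YYYY-MM, or YYYY-MM-DD for a parsedDate-style element."""
    year = element.findtext("year")
    month = element.findtext("month")
    day = element.findtext("day")
    if year is None:
        raise ValueError("a parsed date needs at least a year")
    if day is not None and month is None:
        raise ValueError("a day requires a month")
    parts = [f"{int(year):04d}"]
    if month is not None:
        if not 1 <= int(month) <= 12:
            raise ValueError("month must be 1-12")
        parts.append(f"{int(month):02d}")
    if day is not None:
        if not 1 <= int(day) <= 31:
            raise ValueError("day must be 1-31")
        parts.append(f"{int(day):02d}")
    return "-".join(parts)


spud = ET.fromstring(
    "<spudDate> <year>1999</year> <month>6</month> <day>24</day> </spudDate>"
)
iso = parsed_date_to_iso(spud)
print(iso)
```

The same function works for any element that uses the structure, whatever its tag name, which is the interoperability benefit described above.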
Another example of its use would be if the schema developer wishes to give a choice of date formats. For example, she may wish to give the option of a parsed date, an ISO formatted date, or a US formatted date. She could then define the spudDate element as
<element name="spudDate">
  <complexType>
    <choice>
      <element ref="pef:date"/>
      <element name="isoDate" type="string"/>
      <element name="USDate" type="string"/>
    </choice>
  </complexType>
</element>
where the "date" element is already defined in the data types document, and the other two date types can be defined in the present document.
Details on use of the Data Types and the data type elements can be found by referring to the document, Data Type Usage.
Dates can be in one of five forms:
In the parsed form, the date is broken into its parts. There are separate tags for year, month, and day. As presently constituted, the year is four digits, the month is one or two digits (1-12), and the day is one or two digits (1-31). Direct use of the parsedDate data type will give this structure. Note that the parsedDate data type allows three combinations as input: year, month, and day; year and month; or year only.
The ISO date is defined by ISO 8601. See the Summary Web Page for a description of the ISO date format. In essence, it is of the form YYYY-MM-DD. XML Schema implements the full format in its date data type.
The W3C XML Schema defines several date data types that correspond to various portions of the date: full date, year only, year and month only, month and day only, and day only. In the published Recommendation these are named date, gYear, gYearMonth, gMonthDay, and gDay; the century type that appeared in earlier drafts was dropped. These may be used (and combined) as needed. Note that the full ISO date format allows all of these choices also.
There are formats defined locally. The two major ones are the US format (MM-DD-YYYY) and the European format (DD-MM-YYYY). Note also that locally defined formats may replace the "-" with "/". Clearly dates in these formats are ambiguous, which is why ISO defined a standard format. However, these formats may be used, provided the document details the format and its meaning.
In some cases, a date is given with no knowledge of its meaning. Clearly, 06/02/98 is ambiguous. However, the undefined text format allows such dates to be exchanged, with the understanding that their meaning is unknown. This often occurs when the input file uses such an undefined format and the user wants to keep the information, but has no way of interpreting it.
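The ambiguity of locally defined and undefined formats can be made concrete with a short Python sketch. It reads the same text under the US and the European conventions described above; the function name interpret is hypothetical. The two-digit year is a further ambiguity: Python's strptime resolves 98 to 1998 by convention, which a receiving application could not safely assume.

```python
from datetime import datetime


def interpret(text):
    """Return the date read as US (month first) and as European (day first)."""
    normalized = text.replace("/", "-")
    us = datetime.strptime(normalized, "%m-%d-%y").date()
    european = datetime.strptime(normalized, "%d-%m-%y").date()
    return us, european


us, european = interpret("06/02/98")
print(us.isoformat(), european.isoformat())
```

The two readings differ by almost four months, and nothing in the text itself indicates which one was intended.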
Whenever a date is to be included in an exchange set, the specification document must choose one or more of the alternatives above. While the parsedDate and the W3C Formats are predefined for users, the others must be specified either using XML Schema patterns or text descriptions in a written document. It is clearly up to the users to decide which formats to allow and how to specify them.
However, there are guidelines that should be understood and followed.
Because of language differences, the use of the string for a month (JUN instead of 06) is discouraged.
It should be noted that style sheets and other applications can convert the standard date formats into readable strings, such as 24 Dezember 2001, or December 24, 2001. Readability can be added at many stages. However, the meaning must be clear, and interoperable, in the exchange set.
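As a rough illustration of adding readability at the presentation stage, the following Python sketch renders an ISO formatted date in two languages. The month tables, the two output layouts, and the function name render are assumptions made for this example. A style sheet would do the equivalent work in practice.

```python
from datetime import date

# Month names are kept out of the exchange set and supplied at display time.
MONTHS = {
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
    "de": ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
           "August", "September", "Oktober", "November", "Dezember"],
}


def render(iso_text, language):
    """Turn an unambiguous YYYY-MM-DD date into a readable string."""
    d = date.fromisoformat(iso_text)
    name = MONTHS[language][d.month - 1]
    if language == "en":
        return f"{name} {d.day}, {d.year}"
    return f"{d.day} {name} {d.year}"


english = render("2001-12-24", "en")
german = render("2001-12-24", "de")
print(english)
print(german)
```

The exchanged value stays language neutral, and each receiver chooses its own display form.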
Wells, leases, fields, and buildings all have locations. The means of giving a location to these features varies, depending on the object and the requirements of the receiver. The AbstractLocation data type is a combination of four methods of giving a location:
The AbstractLocation data type is an attempt to gain interoperability by using a single structure whenever any of the above locations is needed. Here is an example of its use:
XML Schema:
<element name="WellLocation" type="posc:AbstractLocation"/>
<element name="BottomholeLocation" type="posc:AbstractLocation"/>
Sample XML:
<WellLocation status="actual">
  <GeopoliticalLocation>
    <country code="US">United States</country>
    <state>Texas</state>
    <county>Val Verde</county>
  </GeopoliticalLocation>
  <SurveyLocation>
    <srsName>NAD 83</srsName>
    <gml:location>
      <gml:Point srsName="epsg:4267">
        <gml:coordinates>27.2529953,-101.966394</gml:coordinates>
      </gml:Point>
    </gml:location>
  </SurveyLocation>
</WellLocation>
...other information...
<BottomholeLocation status="proposed">
  <SurveyLocation>
    <srsName>NAD 83</srsName>
    <gml:location>
      <gml:Point srsName="epsg:4267">
        <gml:coordinates>27.2529681,-101.984422</gml:coordinates>
      </gml:Point>
    </gml:location>
  </SurveyLocation>
</BottomholeLocation>
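Because every location uses the same structure, a single routine can read any of them. Here is a minimal Python sketch, under stated assumptions: the gml prefix is bound to the namespace URI http://www.opengis.net/gml (the sample above shows no namespace declarations), the POSC elements are left unqualified, and the function name read_point is hypothetical. The sketch returns the two coordinate values in document order and does not interpret which one is latitude.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"  # assumed binding for the gml prefix

SAMPLE = f"""<WellLocation status="actual" xmlns:gml="{GML}">
  <SurveyLocation>
    <srsName>NAD 83</srsName>
    <gml:location>
      <gml:Point srsName="epsg:4267">
        <gml:coordinates>27.2529953,-101.966394</gml:coordinates>
      </gml:Point>
    </gml:location>
  </SurveyLocation>
</WellLocation>"""


def read_point(location):
    """Extract status, reference system, and coordinates from a location element."""
    ns = {"gml": GML}
    point = location.find("SurveyLocation/gml:location/gml:Point", ns)
    if point is None:
        return None  # the location was given by some other method
    text = point.findtext("gml:coordinates", namespaces=ns)
    first, second = (float(v) for v in text.split(","))
    return {
        "status": location.get("status"),
        "srsName": point.get("srsName"),
        "coordinates": (first, second),
    }


result = read_point(ET.fromstring(SAMPLE))
print(result)
```

The same function could be applied unchanged to the BottomholeLocation element, since it depends only on the shared structure and not on the element name.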
In addition to the full AbstractLocation object, there are intermediate objects that can be used (or restricted). For example, there is an offshoreLocation object. If the full AbstractLocation object does not meet the needs of the application, a lower level object can be chosen. The lower level objects, and how to incorporate them into the schema, are described in the document on Usage of Location Object.