November 18, 2002, Landmark, BMC Campus, Houston, Texas USA
These are notes from the second working meeting of the DSS SIG.
Meeting Slide File (PowerPoint)
Introduction
Status Update
Summary of Previous Work
Participant Presentations
September 30 Cataloguing Workshop
Consensus Building
Outline Recommendations Document
Agree on Action / Work Items
Preliminary Charter for 2003 January through June
Oil Companies
Anadarko (Bob Kline)
ExxonMobil (Boyer Purnell)
ONGC (A. K. Tyagi)
Pioneer Resources (Carol Anne Doherty)
Shell (Erik van Kuijk, regrets)
Government Agencies
UK DTI (Stewart Robinson)
US MMS (Michael Celata)
Service, Software, and Consulting Companies
ETL Solutions (John Bigerstaff)
Halliburton/Landmark (David Johnson, Gene Rhodes)
IMS Corporation (Mikhail Damaskine)
Oilware (Harry Schultz)
Paras (Flemming Rolle)
Petris Technology (Jeff Pferd)
Schlumberger (Jay Hollingsworth)
POSC (Alan Doniger, Paul Maton and David Archer)
Gene Rhodes from Landmark welcomed the attendees and described the site logistics.
The attendees introduced themselves and stated their expectations.
Alan explained the purpose, form, and processes associated with POSC Special Interest Groups in general and the Data Store Solutions SIG in particular. (See slides 5-7.) He listed the organizations that have joined the SIG or that are in the process of joining, noting the organizations that participated in the October 17 working meeting in London [marked "L"] and those that are participating in the November 18 working meeting in Houston [marked "H"]. (See slide 8.) Alan then explained the DSS SIG in more detail. (See slides 9-16.)
Participants were asked to participate in the idea generation and consensus building process, the recommendation writing and editing, and the promotion of the recommendations. (The amount of effort may vary from participant to participant with a guideline of several days per month.)
There was a discussion (see slide 12) about the motivation for this (and any other) SIG, i.e. to seek opportunities for consistency that can improve the quality and productivity of E&P business and technical activities, particularly those related to data, information, knowledge, and their use in work practices, where there are no concerns for competitiveness.
The theoretical line between subjects about which organizations are comfortable to collaborate and subjects about which organizations intend to compete was defined as the "line of coopetition". There was a common understanding that if the SIG finds itself moving over this line into areas of competitive sensitivity, then the SIG should reorient its work to stay in areas of agreed collaboration. It was further understood that identifying the line of coopetition is not necessarily easy, but that by engaging in open preliminary dialogue, the SIG participants will be able to scope it out.
The general characteristics of POSC Special Interest Groups were presented and discussed to put the objectives for the DSS SIG into a context. The DSS SIG milestones to-date and as planned through year-end were presented. The interplay among the DSS SIG, other POSC entities, and industry organizations was described.
The stated objectives for the DSS SIG for the initial work period (through year-end) are as follows:
Focus on Corporate Data Stores and associated business processes and information items that support field development planning.
Steps:
Develop an agreed, common data management framework definition, addressing data store architecture, data management business processes, deployment use cases, and cataloguing.
Align participant recommendations with framework concepts and terms.
Consolidate participant recommendations using consensus building processes.
Express highly rated consensus recommendations as to oil company work practice, supplier product and service, and standards recommendations.
Publish and promote the recommendations.
Prepare for the next six-month DSS SIG work cycle.
All attendees received paper copies of the slides used at the October 17 working meeting in London and of the notes from that meeting. The London meeting notes were reviewed and, in large measure, the Houston attendees agreed with those findings and results. The review was done in three parts: Framework (see slides 18-26), Cataloguing (see slides 27-33), and Recommendations (see slides 34-41).
We reviewed the concepts, terms, and usages associated with data management and the use of data store products and services based on the results from the London meeting.
(Slide 18)
Knowledge, information, and data form a continuum. No effort should be made to permanently label things as being one of these three. Every thing, in essence, is a combination of the three. Perceptions about the three depend on the viewpoint being applied as much as on inherent character.
The names used to refer to the various kinds of data stores are usually well understood, but one cannot depend on such names to convey a complete and unambiguous understanding.
Even the term data store, which is close to the more implementation oriented term data base, would be more properly called a KID store or an information store so that the content is understood to include knowledge, information, and data. By extension, the name of the SIG might be changed to KID Store Solutions or Information Store Solutions. Familiar usage, however, suggests that we continue to use data store.
(Slide 19)
The most fundamental distinction is between data stores directly involved in business operations and those primarily intended for long-term storage.
Active Use Data Stores.
Operational Data Store: supports a direct and ongoing business activity, such as field operations during the production phase of an E&P asset life cycle.
Project Data Store: supports an investigative or analytical, relatively short-term business activity usually concluding with a recommendation leading to a formal business decision.
Long-term Data Stores.
Master Data Store: supports the management of acquired and/or measured sets of data, often for multiple organizations using entitlement-based access rules.
Corporate Data Store: supports the management of the results from operational and/or project business activity, often limited to best or high-quality results.
Warehouse Data Store: supports the management of selected information items routinely loaded from other sources, often including financial and commercial results.
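The classification above can be written down as a small lookup structure. The following is only an illustrative sketch: the class names, field names, and the shortened descriptions are assumptions made for this example and are not part of the SIG material.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Usage(Enum):
    """The two top-level groups distinguished in the notes."""
    ACTIVE = "active use"
    LONG_TERM = "long-term"


@dataclass(frozen=True)
class DataStoreType:
    name: str       # short name of the data store type
    usage: Usage    # which top-level group it belongs to
    supports: str   # abbreviated paraphrase of what the type supports


# The five types listed in the notes; descriptions are shortened paraphrases.
DATA_STORE_TYPES = [
    DataStoreType("Operational", Usage.ACTIVE, "a direct and ongoing business activity"),
    DataStoreType("Project", Usage.ACTIVE, "an investigative or analytical, short-term activity"),
    DataStoreType("Master", Usage.LONG_TERM, "management of acquired and/or measured data"),
    DataStoreType("Corporate", Usage.LONG_TERM, "management of operational and/or project results"),
    DataStoreType("Warehouse", Usage.LONG_TERM, "management of items routinely loaded from other sources"),
]


def names_by_usage(usage: Usage) -> List[str]:
    """Return the names of the data store types in the given group, in listed order."""
    return [t.name for t in DATA_STORE_TYPES if t.usage is usage]
```

For example, `names_by_usage(Usage.ACTIVE)` returns the Operational and Project types. A hybrid data store, as discussed in these notes, would simply be associated with more than one of these behaviors.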
It was noted that classifying types of data stores may be of interest to a limited number of people, including those involved in this SIG. For most people, these distinctions are of little interest. What is of interest, however, is to support work processes in as simple and straightforward a manner as possible.
(Slide 20)
Just as high-level distinctions are not made among knowledge, information, and data, so too high-level distinctions are not made between documents (information items perceived as documents -- electronic and/or paper based) and data model based digital information. Therefore, Electronic Document Management Systems (EDMS's) are not a distinct high-level type of data store.
There are basic characteristics and expected behaviors for each type of data store. In practice, however, it is not unusual to find hybrid data stores exhibiting the behaviors of multiple types of data stores. SIG recommendations should be based on the characteristic behaviors.
There was further discussion about conceptually unifying the concept of information items (sometimes referred to as 'documents') to include both perceived documents and dynamic query responses from digital sources.
[The term meta-data is relative to whatever sorts of data-information-knowledge are considered the base of discourse. As such, care must be taken when using the term 'meta-data' without adequate contextual qualification. In the area of document cataloguing, cataloguing attributes are meta-data to the catalogued documents. In the case of, say, well log trace data, meta-data may include well and wellbore contextual information.] It was noted that the descriptors common to both perceived documents and query responses are meta-data to each in turn.
Our motivation in trying to discern which are fundamental differences between types of data stores and which are more incidental differences is to avoid splitting concepts in the wrong places.
It was also noted that a single commercial product is often used to fulfill more than one basic data store behavior and purpose. This can tend to cause confusion. In our discussions, we are referring to data store behaviors rather than to behaviors of specific data store products.
(Slide 21.)
Warehouse Data Stores are designed largely to respond to anticipated queries, use well-defined periodic or event-driven loading procedures for defined information item types coming from reliable sources, and do not allow updates other than through the defined loading procedures.
Warehouse Data Stores often blend together with Corporate Data Stores. There are no essential differences between them other than an emphasis on formal storage management versus emphasis on high value and quality.
(Slide 22)
Corporate Data Stores are designed largely to portray best and high quality results and use well-defined periodic or event-driven loading (publication) procedures from operational, project, master, and other sources.
Corporate Data Stores can exhibit two kinds of update behaviors. Portions do not allow updates other than through the defined loading procedures. Other portions are updated by designated authoritative applications and procedures. These portions serve as reference source material for use in operational and/or project data stores.
Attendees were asked to consider the bi-directional nature of the relationship between Corporate Data Stores and Project/Operational Data Stores to determine whether there is a clear characterization of the separation between the two directions. Business reference information is managed directly in Corporate Data Stores and provided to Project/Operational Data Stores.
(Slide 23)
Despite the use of the term corporate, Corporate Data Stores may have any scope of coverage, e.g. global, regional, asset, etc. Federated Corporate Data Stores can be virtual compositions of multiple Corporate Data Stores each of smaller scope.
The contents of a Warehouse or Corporate Data Store are catalogued to the level of the defined information item types. Application updated information is catalogued as virtual information items, i.e. information items of known identity but variable content.
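One way to picture a catalogue that holds both loaded items and virtual information items is sketched here. This is only an illustration under stated assumptions: the `CatalogueEntry` fields, the `describe` helper, and the sample identifier are hypothetical and do not come from any POSC cataloguing specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    item_id: str    # stable identity recorded in the catalogue
    item_type: str  # one of the defined information item types
    virtual: bool   # True when the content is updated by applications


def describe(entry: CatalogueEntry) -> str:
    """Summarize an entry: identity is always known, content may vary."""
    content = "variable content" if entry.virtual else "fixed content as loaded"
    return f"{entry.item_id} ({entry.item_type}): known identity, {content}"
```

For example, `describe(CatalogueEntry("WELL-HDR-001", "well header", True))` reports a known identity with variable content, while the same entry with `virtual=False` would describe content that changes only through the defined loading procedures.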
(Slide 24)
Master Data Stores often contain a small number of information item types of acquired or measured information (as opposed to processed, analyzed, interpreted results). In practice, information other than acquired or measured information may be included.
The distinction between acquired and measured information in Master Data Stores and processed, analyzed, and interpreted information in Corporate Data Stores is not fundamental or pure. For example, a depth-shifted well log may be thought of as processed, but may still be stored in a Master Data Store, where typically raw field measured data resides.
Master Data Stores often are managed external to active business units, are shared by multiple business units or companies, and are accessed using subscriber entitlement rules.
(Slide 25)
Loading the defined information item types following prescribed quality control procedures should be the only manner by which Master Data Stores are updated.
Historically, Master Data Stores have been used primarily for geoscience information, but other types of acquired or measured data may be included, e.g. real-time well-site acquisition data.
The contents of a Master Data Store are catalogued to the level of the defined information item types. This forms the fundamental association with Corporate and Warehouse Data Stores.
(Slide 26)
There are no absolute minimum or maximum boundaries for the scope of Project Data Stores in terms of duration, geographic extent, etc.
Given the active and dynamic updating of information in Project Data Stores, the primary means of ensuring against accidental data losses is frequent, periodic backups.
Basic integrity and quality are issues for Project Data Stores to support business decisions and decision-support software applications.
At defined milestones during a project, the content of a Project Data Store is archived and/or published to a Corporate Data Store according to mutually agreed filtering, mapping, and quality control processes.
A Project Data Store's scope of content may include one or more implemented data stores plus related digital and non-digital information sources.
A Project Data Store's contents are catalogued as a series of virtual information items, i.e. information items of known identity but variable content.
The importance of retaining corrected technical data was asserted. (This turned out to presage a longer discussion on the re-usability of technical results data later in the meeting.)
As the second of the three parts of the review of the London meeting results, Alan described the history of the cataloguing standards initiative from the first contact from Shell Expro (UK) with POSC in February 2002 to the two industry workshops (Aberdeen in March and Houston in September). See Slides 27 - 33 and September 30 Workshop Web page. Some of today's attendees will hear Alan Doniger's presentation on this subject at the POSC Autumn Member Conference in Houston on November 21. Alan identified the participating companies from the Sep. 30 workshop and noted that many are also SIG participants. He then gave a brief overview of the main concepts in reference to the slides on this subject, concluding with a review of the plans for progressing the catalogue attributes and vocabularies through the interaction of users, the SIG and the POSC cataloguing specification team.
Alan described a recommendation presented first to the September 30 workshop in Houston that the DSS SIG be the host for the cataloguing initiative in terms of determining consensus requirements and then publishing and promoting these as recommendations. It was agreed that it is best not to separate the cataloguing initiative from the broader consideration of data store solutions recommendations. Alan explained that this position could be revisited in the future, if warranted.
Discussion points:
Some attendees urged POSC to encourage the industry to work quickly to agree on cataloguing attribute vocabularies to avoid unnecessary divergence in usage. It was noted that some degree of divergence is inevitable and that POSC is prepared to publish mappings to vocabularies that are or have been in use.
It was noted that Information Type (K-I-D Type) may be different between producer and consumer. This point should be considered further.
There was some consideration of the several fairly independent types of non-standards-body standards that an oil company must consider, including those imposed by regulatory reporting agencies, those de-facto standards from the few commercial data sourcing services, and those de-facto standards that are manifest in the details of the internal operation of various commercial application products. It was noted that any progress in identifying and documenting such standards by POSC would be useful.
In the third part of the review of the London meeting results, the attendees discussed each of five recommendation subjects. The results of these discussions are summarized below. Note that the slides referenced in this section were largely slides as drafted before the London meeting. The subjects are presented in order of perceived importance, beginning with the most important recommendation subject.
Recommendations in line with the cataloguing subject were seen as being the most important of all.
It was noted, however, that this initiative will not work out well unless cataloguing becomes embedded progressively through our work flows. As an add-on or afterthought, it will not work well.
(Slides 35-36)
Industry standards for sets of reference values should be defined for use across all types of data stores. Industry standards may have to be augmented (and/or restricted) at the company, business unit, or project level. Depending on the type of data store, local or temporary extensions may be acceptable until a defined milestone is reached at which time extension reference values should no longer be used. Extensions may evolve into business unit, company, or even industry standards through appropriate processes.
An Action Item was agreed in the London meeting for attendees to propose a list of most important sets of reference values to be addressed by the DSS SIG. This action was endorsed and extended to include the Houston attendees.
Recommendations concerning Reference Value standards were seen as being the second most important of these subjects.
There was some discussion about the use of mappings between pairs of reference values and the problems of reference values that change over time. It was generally agreed that external mappings were preferable to the practice of changing reference values inside information items.
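The preference for external mappings can be illustrated with a short sketch. Everything in it is an assumption made for the example: the item layout, the `well_status` field, the vocabulary values, and the function names are hypothetical and are not drawn from any POSC reference value set.

```python
from typing import Dict

# Hypothetical stored information item; its reference value is never rewritten.
stored_item = {"id": "LOG-0042", "well_status": "P&A"}

# Hypothetical external mapping from a local vocabulary to a target vocabulary.
LOCAL_TO_STANDARD: Dict[str, str] = {
    "P&A": "plugged and abandoned",
    "PROD": "producing",
}


def translate(value: str, mapping: Dict[str, str]) -> str:
    """Map a stored reference value; unmapped values pass through unchanged."""
    return mapping.get(value, value)


def view_item(item: Dict[str, str], field: str, mapping: Dict[str, str]) -> Dict[str, str]:
    """Return a translated copy of the item, leaving the original untouched."""
    translated = dict(item)
    translated[field] = translate(item[field], mapping)
    return translated
```

Because the mapping lives outside the item, a reference value that changes over time can be handled by publishing a new mapping table, while the stored item keeps the value it was created with.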
(Slide 37)
There was some discussion of the slide content, which, in the end, appears to have been too prescriptive. The three kinds of publishing (plus cataloguing all of these) were generally accepted as possible desirable aspects of project results publishing, conditioned by the debate over the re-use of project results described below.
There was support for seeking clarity and consistency in the description of the content of reports of final results from project work and reports at key decision points. Reference was made to final reports, end-license reports, drilling recommendations, not-to-drill recommendations, lease sale recommendations, etc.
There was a lively debate over the value of preserving the technical results of project work for re-use in the future. Some held strongly to the position that geoscientists have not often used, and will not often use, previous results, preferring instead to reprocess and re-interpret from original data. Others held that there is significant value in old results as long as there is sufficient contextual information to ensure a complete and correct understanding of them. This includes incorporating the work flow followed with the results themselves. Such value begins with lessons learned and may extend to direct re-use of technical results. All agreed that the value of old results diminishes over time, but the rate of reduction was a matter of disagreement.
Confirming the thoughts of the London meeting, it was agreed that a recommendation be made on the general theme of publishing project results at a high level. Membership and industry feedback on such high level recommendations can help the SIG determine whether more detailed recommendations can be made at a later time.
It was generally agreed that publishing from project or operational sources at the time of an interim or completion milestone requires the application of quality control rules and/or filtering to separate out what is worthy of publishing.
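Milestone publishing with quality control and filtering, as discussed above, might be sketched as follows. This is a minimal illustration only: the item fields and the two rules are hypothetical, and actual rules would be whatever the publishing and receiving parties agree.

```python
from typing import Callable, Dict, List

Item = Dict[str, object]
Rule = Callable[[Item], bool]


# Hypothetical quality control rules, invented for this example.
def has_identifier(item: Item) -> bool:
    """An item must carry a non-empty identifier to be published."""
    return bool(item.get("id"))


def is_approved(item: Item) -> bool:
    """Only items marked as approved are worthy of publishing."""
    return item.get("status") == "approved"


def select_for_publishing(items: List[Item], rules: List[Rule]) -> List[Item]:
    """Keep only the items that pass every quality control rule."""
    return [item for item in items if all(rule(item) for rule in rules)]
```

Given a project data store holding an approved item, a draft item, and an approved item with no identifier, `select_for_publishing(items, [has_identifier, is_approved])` would keep only the first. Keeping the rules as separate functions lets each milestone apply its own agreed set.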
(Slide 38)
If a recommendation on the elimination of paper records is taken forward, the recommendation should be stated more positively than in the slide text, i.e., enable practices which reduce the need for paper records.
Such a recommendation should take into account the issue of trust, i.e. the need for digital signatures and other security measures to replace the traditional trust associated with conducting business through paper records.
The consensus of the Houston attendees was to roll a simple recommendation in this area into the Publishing subject area. This would be something to the effect that in order to help reduce the need for the retention of paper records, electronic / digital forms should be produced, retained, and catalogued for all documents that are produced for use in paper form.
(Slide 39)
There was agreement with the London suggestions that an inter-company data transfer recommendation should express minimum requirements for exchanges in limited scopes and that this may be a good context in which to support recommendations to pursue and promote standards for sets of reference values.
In addition to considering high-value categories of inter-company data transfer to address, it was suggested that such transfers be related to the project publishing subject above. The open question is: how often are inter-company transfers associated with the attainment of project/operational milestones and how often are inter-company transfers part of routine project/operations activity.
This was considered an area for recommendations of lower importance than those listed above. The question was raised about the perceived success or failure of various previous efforts to define narrow or broad transfer standards. Opinions differed as to the usefulness of different prior efforts. The goal of reducing time spent seeking, qualifying, reformatting, and moving information is still worth pursuing, as estimates remain in the range of 50% of geoscientists' time.
(Slide 40)
Alan sketched the concept initially suggested by Knut Tungland of Statoil at the London meeting for integrated data viewers based on (XML) standards for data content of various types.
The text on the slide was thought to be too specific and the recommendation in this area should focus more directly on the original concept.
There was some discussion of the boundary between generic, light-weight viewers and proprietary, heavy-weight viewers. There is no intent to limit or constrain proprietary viewers (or other types of application functionality for that matter) to use only data defined in generic viewer standards.
The linkage with reference data standards was noted.
The attendees were invited to offer brief comments and advice related to the work of the DSS SIG. Before beginning these talks, the presentations made in the London meeting were identified. Attendees were invited to see the slides, where available, for more information.
John reviewed a set of PowerPoint slides (not yet available) on work underway to generate useful data exchange materials from UML data models.
David Johnson presented a set of PowerPoint slides (not yet available) on the data management approach being taken by Landmark.
Stewart described the plans for developing a UK data catalogue hosted by DEAL and based on XML standards for data interchange. Catalogue specifications to be known as PON9 will link through to information of various kinds provided by UKCS operators.
A. K. provided some observations and advice for the work of the SIG.
The first rough draft of the recommendations document will be prepared to be progressed at the December 5 DSS SIG working meeting.
The attendees concurred with the London recommendation that it is too early to make a public presentation about the work of the DSS SIG at the POSC Annual Member Conference in Houston on November 20 and 21. A suitable future opportunity for a public presentation will be determined.
The next cataloguing workshop will be held on December 4 in Stavanger, Norway at NPD.
The next DSS SIG working meeting will be held on December 5 in Stavanger, Norway at NPD. The primary objective for this meeting will be the review of the first rough draft recommendations paper.
The meeting notes should be prepared and posted by the end of November.
Each London and Houston attendee will prepare and send to Alan Doniger (doniger@posc.org) for distribution to the SIG the following: a list of the top ten, i.e. most important, sets of reference values. Please do so in time for review of the lists at the Stavanger meeting on December 5.
The goals for the first half of '03 will be to receive and analyze feedback from the current round of recommendations, to develop these recommendations further, and to monitor progress based on these recommendations.