from EDI to XML/edi

by Robert Aydelotte


Discussion

What

EDI is currently both a technique and a technology. As a technique, electronic data interchange is the business of reliably exchanging data between independent computing systems, such as between a vendor and customers. As a technology, EDI is the means of formatting and transmitting that data.  Together, these are traditional EDI.

Within a few years, XML, combined with eBusiness frameworks built on XML, will probably displace EDI as the dominant technology for electronic data interchange (the technique). EDI technology will not disappear, but it will become a more specialized, limited-use technology.

This discussion explicitly assumes that electronic data interchange as a fundamental business technique will continue and thrive.

Why

This change is due to XML-based EDI (XML/edi) being cheaper, faster and better than conventional EDI for uses that involve human interaction. This paper presents these arguments and provides some justification for these conclusions from current, though limited, experience.

How

The transition from EDI to XML/edi will occur at different rates depending upon current EDI commitments. Most businesses not currently using EDI will begin to use XML/edi once (a) the standards are well established, (b) the commercial tools are widely available, and (c) it is supported by either a significant business partner or as a vendor service.

Those currently using EDI will have to choose between several options. These choices will be influenced by both internal (e.g., cost savings, higher degree of human interaction needed, connecting with new partners) and external factors (e.g., changes by business partners, new legislation). The spectrum of possible change includes:

  1.  Make no changes because (a) VAN (Value Added Network) security and reliability cannot be compromised, (b) EDI formats are very stable, and (c) all business partners are established and remaining on VAN-based EDI;

  2. Switch from VAN-based to Internet-based EDI, continuing to use the traditional EDI format;

  3. Encapsulate the traditional EDI format within a wrapper of XML tags, with the degree of replacement gradually increasing with time; or

  4. Install a parallel XML/edi system that will eventually replace the traditional EDI system.
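
Option 3 can be pictured with a minimal Python sketch. The element names EdiEnvelope and EdiPayload, their attributes, and the sample message are hypothetical and do not come from any published XML/edi specification. The point is only that the unchanged EDI text travels as the content of an XML element.

```python
import xml.etree.ElementTree as ET


def wrap_edi(edi_text, sender, receiver):
    """Wrap an unmodified EDI message in a small XML envelope (hypothetical tags)."""
    envelope = ET.Element("EdiEnvelope", {"sender": sender, "receiver": receiver})
    payload = ET.SubElement(envelope, "EdiPayload", {"format": "traditional"})
    payload.text = edi_text  # special characters are escaped on serialization
    return ET.tostring(envelope, encoding="unicode")


def unwrap_edi(xml_text):
    """Recover the original EDI text from the envelope."""
    return ET.fromstring(xml_text).findtext("EdiPayload")


# Made-up sample message and partner identifiers, for illustration only.
wrapped = wrap_edi("BEG*00*NE*PO12345~CTT*1~", "BUYER01", "SELLER02")
print(wrapped)
```

Over time, more of the payload could be expressed as XML elements and less as embedded EDI text, which is the gradual replacement that option 3 describes.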

When

The milestones that will impact the timing of XML for electronic data interchange are:

  1. the standards are well established,

  2. the commercial tools are widely available, and

  3. it is supported by either a significant business partner or as a vendor service.

The second and third conditions are being addressed in 'internet time', suggesting that a significant number of commercial tools will be available before the end of 2000, and that the use of XML/edi will be attractive within two years. Further, impressive but isolated uses will be available sooner.

The process of creating and adopting standards has not progressed to 'internet time', but normally proceeds in 'bureaucratic time' or 'geologic time'. Assuming that XML/edi will be based upon the popular adoption of proprietary standards (de facto rather than de jure), three to five years may be required for XML/edi to challenge the dominance of traditional EDI. This delay helps to explain the emergence of many new 'standards building' initiatives today.

A Brief Description of XML and EDI

EDI

Electronic Data Interchange (EDI) was developed in the 1970s to enable the computer systems of buyers and sellers to exchange pre-defined data over a secure, reliable network. This provided fast and accurate data that was otherwise subject to the processing and handling delays inherent in surface mail, faxes and data entry.

EDI works by encoding data in a very specific, standard format. EDI communication consists of one or more messages, each conforming to a subset of the standard format. Within each message, delimiters define a series of segments, the first three characters of which contain the code for the type of the segment. Within each segment, a series of codes and numbers (separated by another delimiter) define the data being transmitted. Due to the extensive use of very compact codes and concise data structures, EDI messages are almost impossible for humans to create or interpret without extensive documentation.
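
The segment structure described above can be sketched in a few lines of Python. This is an illustration only: the delimiters ('~' between segments, '*' between data elements), the sample message and the function name parse_segments are assumptions made for the example, not an excerpt from any real transaction set.

```python
# Hypothetical delimiters and sample data, chosen for illustration only.
SEGMENT_DELIMITER = "~"
ELEMENT_DELIMITER = "*"

SAMPLE_MESSAGE = "BEG*00*NE*PO12345~PO1*1*10*EA*9.95~CTT*1~"


def parse_segments(message):
    """Split an EDI-style message into (segment_code, elements) pairs."""
    segments = []
    for raw in message.split(SEGMENT_DELIMITER):
        raw = raw.strip()
        if not raw:
            continue  # skip the empty piece after the final delimiter
        # The leading code names the segment type; the rest are data elements.
        code, *elements = raw.split(ELEMENT_DELIMITER)
        segments.append((code, elements))
    return segments


if __name__ == "__main__":
    for code, elements in parse_segments(SAMPLE_MESSAGE):
        print(code, elements)
```

Nothing in the parsed output says what "NE" or "EA" means. That knowledge lives in the external documentation agreed between the partners.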

The EDI file is then sent from the sender to the receiver through a direct modem connection, a private network connection or a Value-Added Network (VAN). Costs for using a VAN are based on frequency and volume, and the VAN supplies additional security and tracking functions. Recently, transmission via the Internet has become possible.

The use of EDI is possible only because of the agreement between sender and receiver on the use of codes embedded in each message. These agreements are captured in a transaction set. Initially these formats and codes were developed ad hoc, with standards being introduced by national (e.g., ASC X12 in the USA) and international bodies (e.g., UN/EDIFACT) beginning in the 1980s. These bodies have continued to maintain existing transaction sets and to develop new ones as needed.

XML

The Extensible Markup Language (XML) is a World Wide Web Consortium (W3C) Recommendation that describes the meaning of data rather than its presentation. XML is a simplified subset of the Standard Generalized Markup Language (SGML), which was developed to interchange technical documentation and other forms of publishable data.

XML works by allowing users to define a hierarchical set of tags that are embedded into a file that contains the information being communicated. The tags (with starting and ending forms) explain exactly what the data in the tagged section of the document is intended to mean.
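
As a small illustration, the following Python sketch parses a made-up purchase order and prints its tag hierarchy. The tag names, attributes and values are hypothetical; a real tag set would come from a specification agreed between the partners.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names and values, for illustration only.
SAMPLE_XML = """
<PurchaseOrder>
  <OrderNumber>PO12345</OrderNumber>
  <LineItem>
    <Quantity unit="each">10</Quantity>
    <UnitPrice currency="USD">9.95</UnitPrice>
  </LineItem>
</PurchaseOrder>
"""


def describe(element, depth=0):
    """Return one indented line per element, showing the tag hierarchy."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE_XML.strip())
print("\n".join(describe(root)))
```

Unlike a coded EDI segment, each value here carries its own label, so a reader can follow the content without a separate code table.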

Each XML document may explicitly declare the source of the tag set that it employs, thus providing extremely valuable information to the reader of the file. Each set of tags is defined in a separate (usually web-accessible) file, presently in the form of a document type definition (DTD). Because DTDs are very limited in their definitions of data types, an additional W3C specification for XML Schema will replace DTDs as the means to define XML tags before the end of 2000.

XML documents today generally contain data for a particular domain, e.g., a purchase order, a well log, a production report, etc. These are application-level specifications for various kinds of data. In the near future, however, many XML documents will begin incorporating additional tags that establish a processing framework for the domain information within the document. These tag sets can be thought of as interoperability-level specifications. There are several active initiatives that are specifying frameworks for document processing, e.g., BizTalk, RosettaNet and ebXML. The W3C has produced a white paper comparing a number of XML protocols that fall into this category. To date, however, none of these initiatives dominates the commercial marketplace.

In addition to these definitional components, methods for applications to interact with XML are defined by the Document Object Model (DOM) and the Simple API for XML (SAX). Another significant tool is the W3C's XSL Transformations (XSLT), which allows XML messages to be subsetted, reordered and converted into other forms using reusable, recursive templates. This allows XML to be converted to HTML for conventional web browsing, or into application-specific structures for data input.
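
The difference between the two programming interfaces can be sketched with Python's standard library, which includes implementations of both. The sample document and the ItemCollector class are made up for the example. XSLT is not shown because the Python standard library has no XSLT processor.

```python
import xml.dom.minidom
import xml.sax

# Made-up sample document, for illustration only.
DOC = "<Order><Item>bolt</Item><Item>nut</Item></Order>"

# DOM: the whole document is loaded into a tree that can then be navigated.
dom = xml.dom.minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("Item")]


# SAX: the parser streams events to a handler and keeps no tree in memory.
class ItemCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "Item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        if self._in_item:
            self._buffer.append(content)  # text may arrive in several chunks

    def endElement(self, name):
        if name == "Item":
            self.items.append("".join(self._buffer))
            self._in_item = False


collector = ItemCollector()
xml.sax.parseString(DOC.encode("utf-8"), collector)
sax_items = collector.items

print(dom_items, sax_items)
```

As a rough guide, the tree-based style is convenient for small messages that need random access, while the event-based style suits large files that should not be held in memory all at once.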

Some Contrasts between EDI and XML

  • EDI was developed for computer-to-computer interchange, while XML was developed for human-to-computer interchange without sacrificing computer-to-computer interchange
  • EDI formats are externally defined, while XML is self-describing
  • Standards for using EDI exist and are widely adopted, whereas standards for using XML (such as data validation rules) are not yet widespread
  • Free and inexpensive commercial tools now support XML, whereas EDI tools have typically been very expensive, one-off applications
  • Native XML files are significantly larger than corresponding EDI files, and while compression mitigates this issue for transmission, storage and network infrastructure can be a concern for large files[i]
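
The file size contrast can be explored with a short Python sketch. The two line-item templates are hypothetical and were invented to carry the same four values in each style. Actual ratios depend entirely on the tag names and data involved, so the script prints what it measures rather than asserting any particular figure.

```python
import zlib

# Hypothetical templates carrying the same values in EDI style and XML style.
EDI_LINE = "PO1*{n}*10*EA*9.95~"
XML_LINE = ("<LineItem><Number>{n}</Number><Quantity>10</Quantity>"
            "<Unit>EA</Unit><UnitPrice>9.95</UnitPrice></LineItem>")


def build(template, count=200):
    """Repeat a line-item template `count` times and return the bytes."""
    return "".join(template.format(n=i) for i in range(1, count + 1)).encode("utf-8")


edi_bytes = build(EDI_LINE)
xml_bytes = build(XML_LINE)
edi_packed = zlib.compress(edi_bytes)
xml_packed = zlib.compress(xml_bytes)

print("raw bytes:        EDI", len(edi_bytes), " XML", len(xml_bytes))
print("compressed bytes: EDI", len(edi_packed), " XML", len(xml_packed))
```

Repeated tag names compress well, which is why compression helps in transit. The uncompressed size still applies wherever the file is stored or parsed in its native form.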

The XML/edi Drivers

These are the drivers for the movement from traditional EDI to XML based EDI:

Cost

Most businesses have not adopted traditional EDI:

  • About 2% of the world's businesses now use EDI. XML/edi has the potential to take this to about 70-89%.[ii]

  • About 99% of small to medium enterprises (SMEs) in Europe are reluctant to deploy expensive EDI systems.[iii]

  • Large retailers with EDI have only 20% of their suppliers using EDI.[iv]

Private networks are more expensive to use than the Internet:

  • XML-based messages over the Internet are half the cost of the same message over a private network (VAN).[v]

  • New devices are making XML-based transactions much faster, more reliable and more secure than previously possible.[vi]

XML/edi is cheaper to build and maintain than traditional EDI:

  • In an investigation of using an XML-like subset of SGML in telecommunications, introducing a "pivot" between applications and an EDIFACT-based system yielded a five-fold reduction in maintenance costs.[vii]

  • Using XML could reduce the costs of processing purchases for just one large customer by about 50%.[viii]

  • Traditional EDI systems are seven to ten times more expensive than Internet-based options.[ix]

Functionality

There are several technical drivers that contribute to the adoption of XML/edi:

Extensibility

The rigidity of traditional EDI is both a strength and a weakness: because it is expensive to change, it is also resistant to change. Resisting change has proven to be very difficult, however, with many standards being updated on an annual basis.

XML has been designed to be easy to change, with a bag of tools (e.g., XSLT) to make change easier and less expensive.

This is the most common justification for XML at this time.

Content Management

Since XML is based upon SGML, it has a great deal of capability when dealing with human readable text. XML allows the organization of information to more closely reflect the forms that human readers understand while retaining a capability for machine processing. This provides an efficient structure for evaluating content for information that has heretofore been unmanageable prose.

This is an area where new XML capabilities are being developed at a very rapid rate.

Data Integration

As data is moved between formats (i.e., observation → paper → file → database), it is often altered to fit the requirements of the new format. These alterations often degrade the original meaning (colloquially, 'data rot'), suggesting that the frequency and degree of change should be minimized.

XML provides a methodology to describe the structure in which data resides (context), and these XML structures can be widely implemented, e.g., in sensors (observation), documents (paper), messages (file),  applications (memory) and databases (disk storage). While not claiming universal applicability, XML-based data structures are significantly more resistant to the 'data rot' (decrease in value due to loss of context) that naturally occurs in most business processes.

The impact of XML on data is illustrated in the following diagram:  

  • Data is commonly distributed between applications, databases and documents, each with unique shortcomings

  • Persistence for data in applications is improved by allowing data to be moved out of applications and into both documents and databases more easily and with less semantic loss

  • Portability for data in documents is improved by making data in documents more compatible with the more structured form needed for input by applications and databases

  • Publication for data in databases is improved by making database contents easier to locate, read and write through reducing the cost (and losses) of converting data into and out of a highly structured form

This aspect of XML is not widely promoted at this time, but it may become the primary justification for XML within a few years.

Experience with XML in EDI

There are indications of many investigations examining how to employ XML technology in electronic data interchange, but few have published results. As additional information is discovered, additional sections will be added to this document.

ISIS European XML/EDI Pilot Project

This project was sponsored by CEN/ISSS to study the feasibility of XML for electronic data interchange, and was completed in January 2000. This is one of the most comprehensive investigations to date, and extensive results have been published.

In scope, it examined conversion from UML- and EDIFACT-based systems for healthcare and transportation, respectively. It also studied the utility of auxiliary XML processes and specifications (such as XSLT) to determine what components may be missing from XML tools today. Also, the project chose to employ early XML Schema specifications instead of DTDs.

The deliverables of the study included a description of best practices, software demonstrations and recommendations for standardization. The focus was on providing guidelines rather than solutions.

Lessons learned from both the UML- and EDIFACT-based investigations included:  

  • XML is capable of electronic data interchange using currently available tools

  •  Original standards need to be simplified when converted to XML, such as normalizing data, removing codes, defining defaults and subsetting

  •  General structures need to be converted to hierarchical structures, often with rules to facilitate automatic implementation

  • Mnemonics and programming-style names need to be edited to produce meaningful, human readable tag names

  • Chains of XSL Transformations allow application tailoring and simplify applications by supporting localized XML DTDs, converting between forms (EDIFACT, WML, local format, etc.) and presenting as HTML

  • While the current set of specifications (XSLT, DOM, XML Path, and XML Schema) is adequate, several necessary improvements were proposed
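
Several of these lessons (subsetting, building a hierarchy, and replacing mnemonics with readable names) can be pictured together in a short Python sketch. The mapping table, segment codes and tag names below are hypothetical and are not taken from the pilot project's deliverables.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from terse segment codes to readable tag names.
READABLE_NAMES = {
    "BEG": ("OrderHeader", ["Purpose", "OrderType", "OrderNumber"]),
    "PO1": ("LineItem", ["LineNumber", "Quantity", "Unit", "UnitPrice"]),
}


def segments_to_xml(segments, root_name="PurchaseOrder"):
    """Build a hierarchical XML tree from (code, elements) pairs."""
    root = ET.Element(root_name)
    for code, values in segments:
        if code not in READABLE_NAMES:
            continue  # subsetting: drop segments this application does not need
        tag, field_names = READABLE_NAMES[code]
        parent = ET.SubElement(root, tag)
        for name, value in zip(field_names, values):
            ET.SubElement(parent, name).text = value
    return root


# Made-up segments in the (code, elements) form, for illustration only.
segments = [
    ("BEG", ["00", "NE", "PO12345"]),
    ("PO1", ["1", "10", "EA", "9.95"]),
    ("CTT", ["1"]),
]
order = segments_to_xml(segments)
print(ET.tostring(order, encoding="unicode"))
```

The sketch leaves coded values such as "NE" and "EA" untouched. Removing codes, as the project recommends, would need an additional lookup table per field.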

The recommendations produced by this project for other projects included:

  •  Utilize guidelines

  • Semantic repositories with maps to XML are needed

  • Use schema archetypes and XML Paths

  • Re-use generic XML data structures

Their recommendations on the use of XML included:

  • Keep it simple, standardized, speedy

  • Start with clear information requirements

  • Use XML and its family of tools

  • Design applications as a sequence of transformations

  • Semantic harmonization is the key to the future of XML/edi

Next Steps

Learning from EDI

Users of traditional EDI have learned many lessons dealing with data interchange. When looking to extend the scope of XML-based interchange to include both new data items and new users, it will be important to build upon these lessons rather than repeat the mistakes that taught them.

Some areas from which EDI can provide valuable lessons include:

  • content requirements - the data items, definitions and processing rules for data interchange are already established and implemented by traditional EDI users for existing business processes

  • process requirements - the manner in which data is handled satisfies both current business processes (internal and partner requirements) and legal conventions

  • specification management - the processes of creating, using and maintaining the shared specifications upon which electronic data interchange is based

Managing Complexity

XML specifications are much easier to create than EDI formats have been, and there are many independent initiatives creating 'fit for purpose' XML-based systems. These specifications are competing for market share, and, with the rapid development of enabling technologies, this competition will naturally increase. While greatly improving our ability to express information in a meaningful and powerful way, this trend does exacerbate problems related to exchanging data between disparate systems.

These problems should be addressed at two levels:

  • common data structures (hierarchies, rules, etc.) should be shared and re-used as widely as possible to minimize the need for data conversions (most conversion processes are one-way - you can get the data there but you can't get the same data back)

  • technologies to handle data conversion (e.g., meta-data repositories, XSLT) should be incorporated into system architectures, such as defining a local XML specification for application use while supporting standardized XML interchange after XSL Transformations.
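
The second point can be sketched in Python. A real system would more likely use an XSLT stylesheet for this step; because the Python standard library has no XSLT processor, a plain tag-renaming function stands in for the transformation here. The local and standard tag names are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical mapping between a local dialect and an interchange dialect.
LOCAL_TO_STANDARD = {"po": "PurchaseOrder", "num": "OrderNumber", "qty": "Quantity"}
STANDARD_TO_LOCAL = {v: k for k, v in LOCAL_TO_STANDARD.items()}


def rename_tags(root, mapping):
    """Return a copy of the tree with tags renamed according to `mapping`."""
    result = copy.deepcopy(root)  # leave the caller's tree untouched
    for element in result.iter():
        element.tag = mapping.get(element.tag, element.tag)
    return result


local_doc = ET.fromstring("<po><num>PO12345</num><qty>10</qty></po>")
standard_doc = rename_tags(local_doc, LOCAL_TO_STANDARD)
round_trip = rename_tags(standard_doc, STANDARD_TO_LOCAL)

print(ET.tostring(standard_doc, encoding="unicode"))
print(ET.tostring(round_trip, encoding="unicode"))
```

This conversion is reversible only because the mapping is one-to-one and nothing is dropped. Once a transformation merges or discards fields, it becomes the one-way kind of conversion that the first point warns about.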

Steps towards Standardization

Eventually, the degree of standardization employed by a community of data users will determine the efficiency of electronic data exchange for either traditional EDI or XML-based systems. Data interchange, electronic or otherwise, is at heart a communication process: if what the sender intends is not what the receiver understands, the process has failed and costs increase for everyone involved.

Useful XML standardization may not require a universal set of hierarchical tags, but rather a consistent means to transform XML content between dialects and schemas. Minimizing the number of dialects and schemas is helpful, however. Steps towards these goals should include:

  • when building new XML schemas, recognize that each is one part of a family of data content specifications rather than an isolated work
  • place new XML specifications in publicly accessible forums
  • encourage business partners (and accept encouragement) to participate in relevant standardization efforts

References:

[i] St. Laurent, Simon: File size concerns grow, <?xmlhack?> 11 May 2000 (URL: http://xmlhack.com/read.php?item=506)

[ii] Gardner, Elizabeth: XML Seen as Key to Boosting Electronic Data Interchange, Information Week June 1, 1998 (URL: http://www.internetworld.com/print/1998/06/01/ecomm/19980601-xml.html)

[iii] EXPERTS (EDI/XML Procurement Enabling Real Trade Standards) (previously available at www.ilc.at)

[iv] XML and EDI: Peaceful Coexistence, whitepaper by XMLSolutions Corporation (URL: http://www.xmls.com/resources/whitepapers/co-existence.pdf)

[v] Kerstetter, Jim: XML holds promise as EDI replacement, PC Week Online 05.04.98 (URL: http://www.zdnet.com/pcweek/news/0504/04xml.html)

[vi] Intel: Intel To Introduce Groundbreaking New Products For Transacting Business Over The Web, Press Release 8 May 2000 (URL: http://intel.com/pressroom/archive/releases/fe050800.htm)

[vii] Vijghen, Philippe: Cost-Effective EDI Using XML? A Pivot-Oriented Approach (URL: http://www.acse.be/papers/edixml-body.htm)

[viii] Booker, Ellis: XML Applications Stand Up To EDI, InternetWeek Apr 16, 1999 (URL: http://www.internetwk.com/story/TWB19990416S0002)

[ix] Gardner, Elizabeth: XML Seen as Key to Boosting Electronic Data Interchange, Information Week June 1, 1998 (URL: http://www.internetworld.com/print/1998/06/01/ecomm/19980601-xml.html)


copyright© 2000 by POSC. All rights reserved.
originally published 1 June 2000
updated 23 Mar 2001