White Paper

Providing a Complete XML-Based Data Interchange Solution

XML is a new, exciting, and powerful technology, but it is no different from any other technology in that it must be used wisely and in a disciplined manner to be a truly effective part of a data interchange solution. The need for a disciplined approach to XML-based solutions has led Innovision to develop a new concept, XML Protocols, and a new supporting technology, the Nepal XP Framework.

The XML Protocol concept is founded on the following two ideas:

  1. Start with XML as the data format for messages exchanged between network processes.
  2. Augment the power of XML with the features that are required for full-fledged network protocols.

Thus XML Protocols leverage all the benefits of XML, resulting in a virtually unlimited range of applications. An XML Protocol can be created and used in any situation that requires open, standardized exchange of data over a network.

The Nepal XP Framework is a developer suite that allows developers to define an XML Protocol and quickly create and deploy high-performance applications and systems which utilize that protocol. The two most important benefits of the Nepal XP Framework are:

  1. It permits the developer to formally specify each XML Protocol, including both the format for the messages to be exchanged and the rules governing how the messages are exchanged.
  2. It handles all the details of parsing the XML-encoded messages, which relieves the developer of the need to know all the details of XML and allows her to focus on the actual problems that need to be solved.

The purpose of this document is to further define and clarify both the XML Protocol concept and the Nepal XP Framework. The structure of this document is as follows:

  • Introduction. This section.
  • XML Protocols: Taking XML to the Next Level. Further defines the XML Protocol concept by enumerating and explaining each of its features.
  • The Nepal XP Framework: Making XML Protocols a Reality. Further defines the Nepal XP Framework by enumerating and explaining each of its software components.
  • Final Remarks. Some comments about the future direction of XML Protocols and some concluding remarks.
  • Appendix: Brief Introduction to XML. For those unfamiliar with XML, this appendix provides a brief introduction to its concepts, terms and syntax.

XML Protocols: Taking XML to the Next Level

XML's use of markup to organize and tag information yields structured information that is ripe for interchange between two or more parties, but XML itself stops short of specifying how this interchange is to take place. This is where XML Protocols step in to take XML to the next level: by providing for the robust, reliable interchange of XML-encoded information via the Internet. The XML Protocol concept augments the power of XML with the following features:

  • Data Abstraction. XML Protocols introduce the notion of a Protocol-Specific Object Model (PSOM) for conceptualizing the information that is to be interchanged. This relieves developers of the need to know the details of XML and provides a more concrete data model than existing XML tools do. This allows developers to quickly create reliable, high-performance XML applications.
  • Data Validation. XML validating parsers validate the structure of an XML document, but do not in any way validate the content, or data, of a document. XML Protocols add the capability to validate the various data values that comprise the content of an XML document.
  • Out-of-Band Data. In some applications there is a need to exchange data that is not properly a part of the problem domain. Such information is known as "out-of-band" information, and XML Protocols provide a framework that supports such information.
  • Interchange Models. Data can be interchanged between two parties in several ways. One common way is the request/response or client/server model, wherein a "client" requests information from a "server". XML Protocols provide for this model and for others. This allows the developer to use the model that best fits the application.
  • Protocol Management. XML Protocols allow for a unique version number to be assigned to each protocol to help in the management of multiple protocol versions.

These topics are discussed in more detail in the following subsections.

Data Abstraction

Even though XML-encoded information can be very straightforward and simple, processing it robustly requires that the developer fully understand the details and rules of XML. For example, this sample invoice is marked up in XML (the reader who is unfamiliar with XML is referred to the Appendix):

 
<Invoice>
   <To>John Doe Widget Consumer</To>
   <From>Acme Widgets</From>
   <LineItems>
      <LineItem>
         <Qty>2</Qty>
         <Desc>Small Widget</Desc>
         <Each>20</Each>
         <LineTotal>40</LineTotal>
      </LineItem>
      <LineItem>
         <Qty>1</Qty>
         <Desc>Big Widget</Desc>
         <Each>50</Each>
         <LineTotal>50</LineTotal>
      </LineItem>
   </LineItems>
   <Subtotal>90</Subtotal>
   <Tax>5</Tax>
   <InvoiceTotal>95</InvoiceTotal>
</Invoice>

The markup here is simple, clear, and concise. Each significant piece of information is tagged so that it can be easily identified both by human readers and by software. A developer could easily write code that could, for instance, get the invoice total. This would be a matter of simply scanning for the string "<InvoiceTotal>" and then harvesting the value "95" that follows it. This, however, assumes that the invoice being received has all the correct and required tags and values. What if the vendor's software that generated this invoice used the tag "<Total>" instead of "<InvoiceTotal>"? What if the value is missing or is not a valid number? What seemed simple is not quite so simple anymore. To further complicate matters, most XML applications make heavy use of attributes within tags. For example, the currency used for the invoice total might appear as an attribute like this:

  <InvoiceTotal Currency="USD">95</InvoiceTotal>
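
The fragility of plain text scanning can be seen in a minimal sketch. The class and method names here are illustrative only and are not part of any product discussed in this paper; the point is simply that a literal search for "<InvoiceTotal>" stops working as soon as the tag is renamed or carries an attribute.

```java
final class NaiveInvoiceScan {
    /** Returns the text after the literal "<InvoiceTotal>" tag, or null if that exact tag is absent. */
    static String scanForTotal(String xml) {
        String open = "<InvoiceTotal>";
        int start = xml.indexOf(open);
        if (start < 0) {
            return null; // tag was renamed, or was written with attributes
        }
        int end = xml.indexOf("</InvoiceTotal>", start);
        if (end < 0) {
            return null; // no closing tag found
        }
        return xml.substring(start + open.length(), end);
    }

    public static void main(String[] args) {
        // Plain tag: the scan works.
        System.out.println(scanForTotal("<Invoice><InvoiceTotal>95</InvoiceTotal></Invoice>"));
        // Same element with an attribute: the literal search no longer matches.
        System.out.println(scanForTotal(
                "<Invoice><InvoiceTotal Currency=\"USD\">95</InvoiceTotal></Invoice>"));
        // Vendor used a different tag name: nothing is found.
        System.out.println(scanForTotal("<Invoice><Total>95</Total></Invoice>"));
    }
}
```

Only the first call yields "95"; the other two return null even though a human reader can see the total in both documents.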

The bottom line is that to really do a robust job of parsing an XML document, a developer must also take into account the Document Type Definition, or DTD, that defines the elements and attributes that the document can contain (See the Appendix for more information). Suddenly, what started out as a simple problem of scanning text has turned into a complicated parsing problem that requires more and more of the developer's time. This is why many developers turn to off-the-shelf validating parsers, such as Sun's ProjectX parser. A parser scans through the document and gathers up all of its elements, ensuring that all the tag names are correct and that all of the required tags are present. Most parsers can return the content in generic DOM (Document Object Model) tree form. DOM is a recommendation issued by the W3C. It provides an object-oriented programmatic interface to XML and HTML documents. For example, the invoice shown above is represented as the following DOM tree:

[Figure: DOM tree representation of the sample invoice]

An in-memory tree such as this one relieves the developer of the responsibility of scanning through the XML text and allows her to simply traverse the tree to find the values instead. Moreover, since the parser has validated the document against the DTD, the developer can assume that all of the tag names in the tree nodes are correct and properly nested. However, the developer must still traverse the tree and do string comparisons to find the desired node. While this is fairly straightforward, it can become cumbersome for complex documents.

This is where the data abstraction provided by a Protocol-Specific Object Model (PSOM) gives a developer the leverage to implement applications quickly and reliably. Instead of being forced to view an invoice as a tree of generic nodes, the developer may view it as an invoice object as shown in the following diagram:

[Figure: the sample invoice viewed as a PSOM Invoice object]

It is straightforward for a developer to create this PSOM object in Java. All that is required is a data stream with the XML text on it (call it xmlStream) and the PSOM classes created by the PSOM Generator. To create the invoice object, the developer first constructs an instance of the PSOM Invoice class and then invokes the parseXML method:

  Invoice invoice=new Invoice();
  invoice.parseXML(xmlStream);

Each such PSOM object conforms to a standard interface, allowing the user to access the information stored in that object via access methods. For example, to get the value of the InvoiceTotal element, the developer simply codes the line:

  float invoiceTotal=invoice.getInvoiceTotal();

This will place the invoice total from the invoice object into a variable called invoiceTotal. This variable is of type float, allowing the developer to operate on it immediately.

By contrast, if that same developer were using DOM, he would have to traverse the DOM tree to find the node with the tag "InvoiceTotal". Once he found the node, he would have to extract the content of that node, which would be the invoice value in the form of a string. This string would then have to be converted to floating point format before the developer could operate on it in any way.
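
The DOM steps just described can be sketched as follows. This is a minimal illustration that assumes the JAXP and W3C DOM classes bundled with current JDKs rather than the ProjectX parser named earlier; the class and method names (DomInvoiceTotal, findElement, readInvoiceTotal) are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DomInvoiceTotal {
    /** Depth-first search for the first element whose tag name equals the given name. */
    static Node findElement(Node node, String name) {
        if (node.getNodeType() == Node.ELEMENT_NODE && name.equals(node.getNodeName())) {
            return node;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            Node hit = findElement(child, name);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Parses the XML, locates InvoiceTotal by string comparison, and converts its text to float. */
    static float readInvoiceTotal(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node total = findElement(doc.getDocumentElement(), "InvoiceTotal");
            if (total == null) {
                throw new IllegalArgumentException("no InvoiceTotal element");
            }
            // The node content is a string; the conversion is the developer's job.
            return Float.parseFloat(total.getTextContent().trim());
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse invoice", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Invoice><Subtotal>90</Subtotal><Tax>5</Tax>"
                + "<InvoiceTotal>95</InvoiceTotal></Invoice>";
        System.out.println(readInvoiceTotal(xml)); // prints 95.0
    }
}
```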

The developer may also set the invoice total with a single line:

  invoice.setInvoiceTotal(invoiceTotal);

In the DOM scenario, this same operation would require the developer to create and populate the appropriate DOM nodes, linking them together in a way that was consistent with the DTD. Note that the developer gets no help from DOM in creating trees that are consistent with the DTD. This requires the developer to have an intimate knowledge of the DTD in order to perform what should be the simple operation of setting the invoice total. In addition, even if the developer has intimate knowledge of the DTD, this process is error-prone, and even the best developers may create DOM trees that do not match the DTD. Further, DOM implementations require the developer to set the element values in a particular order, but PSOM allows the developer to set the element values in any order. This relieves the developer of having to know about element ordering that is defined in the DTD.

Clearly, the PSOM invoice object allows the developer to operate in the problem domain, thus reducing the amount of extra work that he needs to do to access the XML-encoded information. It also relieves the developer of the chore of having to understand the structure of the underlying XML document. This not only speeds development time, but also reduces the number of XML-related coding errors to zero. Furthermore, PSOM ensures adherence to the formal specification of the XML Protocol, and to the XML specification.
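
To make the contrast concrete, the following hand-written class imitates the shape of a typed invoice object. It is not output of the PSOM Generator, and its internals are an assumption: it simply wraps a JDK DOM parse behind typed accessors. Only the method names parseXML, getInvoiceTotal, and setInvoiceTotal are taken from the examples above; everything else (InvoiceSketch, getTo, the text helper) is hypothetical.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical hand-written stand-in for a generated PSOM class. */
final class InvoiceSketch {
    private String to;
    private float invoiceTotal;

    /** Reads the XML once and stores the values in typed fields. */
    void parseXML(Reader xmlStream) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(xmlStream)).getDocumentElement();
            to = text(root, "To");
            invoiceTotal = Float.parseFloat(text(root, "InvoiceTotal"));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a usable invoice: " + e.getMessage(), e);
        }
    }

    /** Returns the trimmed text of the first element with the given tag name. */
    private static String text(Element root, String tag) {
        NodeList hits = root.getElementsByTagName(tag);
        if (hits.getLength() == 0) {
            throw new IllegalArgumentException("missing <" + tag + ">");
        }
        return hits.item(0).getTextContent().trim();
    }

    String getTo() { return to; }
    float getInvoiceTotal() { return invoiceTotal; }
    void setInvoiceTotal(float value) { invoiceTotal = value; }

    public static void main(String[] args) {
        InvoiceSketch invoice = new InvoiceSketch();
        invoice.parseXML(new StringReader(
                "<Invoice><To>John Doe</To><InvoiceTotal>95</InvoiceTotal></Invoice>"));
        System.out.println(invoice.getTo() + " owes " + invoice.getInvoiceTotal());
    }
}
```

Callers of such a class never see a tree node or a tag name; the XML details are confined to one place.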

In addition to the "get" and "set" accessor methods shown above, each PSOM object also has the following types of methods:

  • Validation Methods. Help the developer know if the object he has created is valid with respect to the DTD. At the PSOM object level, this involves ensuring that all required values are set. For example, all of the elements except for Tax are required within an Invoice object. So, if any field besides Tax has not been set, the object will not be valid. In addition, XML Protocols add the notion of data validation to XML's notion of document type validation such that the data values are also validated, e.g. that the invoice total is numeric.
  • Metadata Information Methods. Allow the developer to find out what elements or attributes are permitted, required, etc. inside the given object.
  • Traversal Methods. While PSOM relieves the developer from having to know about the structure of the underlying XML-encoded information, it does not prevent the developer from having access to it if he needs it. The traversal methods allow pre-order and post-order traversals of the underlying information via the "visitor" pattern. One application of this is to extend the functionality of all PSOM objects without modifying the PSOM source code.
  • Searching Methods. These methods allow for pattern-matching, recursive searches on the values of elements and/or attributes.
  • SAX Event Generation and Parsing. The SAX (Simple API for XML) event generation methods allow an object to serve as a source for SAX events and therefore be used by applications that support the SAX interface. The SAX parsing methods allow an in-memory object model representing a document or document fragments to be created using any XML parser that supports SAX.
  • Serialization. Provides the developer with space-efficient support for Java serialization.
  • Equality. Allows the developer to determine whether one PSOM object is equivalent to another.
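
The traversal methods in the list above rely on the "visitor" pattern. A minimal sketch of pre-order and post-order visiting is shown here; SketchNode and its Visitor interface are hypothetical stand-ins, since the actual PSOM traversal interface is not reproduced in this paper.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node type standing in for a tree of PSOM objects. */
final class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, SketchNode... kids) {
        this.name = name;
        for (SketchNode kid : kids) {
            children.add(kid);
        }
    }

    /** Visitor callback; new behavior is added by writing a visitor, not by editing SketchNode. */
    interface Visitor {
        void visit(SketchNode node);
    }

    /** Visits this node first, then its children from left to right. */
    void preOrder(Visitor v) {
        v.visit(this);
        for (SketchNode c : children) {
            c.preOrder(v);
        }
    }

    /** Visits the children from left to right, then this node. */
    void postOrder(Visitor v) {
        for (SketchNode c : children) {
            c.postOrder(v);
        }
        v.visit(this);
    }

    /** Example visitor use: collect node names in the chosen traversal order. */
    static List<String> names(SketchNode root, boolean pre) {
        List<String> out = new ArrayList<>();
        Visitor collect = n -> out.add(n.name);
        if (pre) {
            root.preOrder(collect);
        } else {
            root.postOrder(collect);
        }
        return out;
    }
}
```

For a tree Invoice(To, LineItems(LineItem)), the pre-order result is Invoice, To, LineItems, LineItem, and the post-order result is To, LineItem, LineItems, Invoice.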

Of these methods, the SAX-related ones deserve further discussion. SAX is the standard interface designed for programmatically processing XML documents in an event-based manner. It allows the developer to view the XML document as a stream of events, rather than as a stream of text or as a hierarchical collection of objects. Although it has most frequently been used as a standard interface to allow for interchangeable parsers, it may also be used in many other ways. For example, the Nepal XP Framework provides for SAX-to-XML document handlers that convert a stream of SAX events to a textual XML document. This means that a SAX-to-XML document handler may be written to generate the XML in any form desired. It also means that the PSOM objects themselves do not have methods for actually formatting the XML text. Here is an example of how SAX-to-XML document handlers may be used to output textual XML:

  PrintWriter pw=new PrintWriter(System.out);
  FormattedSAXWriter saxWriter=new FormattedSAXWriter(pw);
  invoice.setDocumentHandler(saxWriter);
  invoice.generateSAXEvents();
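
The internals of FormattedSAXWriter are not shown in this paper. As a rough illustration of the idea, the sketch below implements a handler that turns SAX events back into compact XML text. It is an assumption-laden stand-in: it uses the SAX2 DefaultHandler callbacks that ship with the JDK, which may differ from the handler interface the Framework uses, and it takes its events from a JAXP parser because no PSOM object is available here. The names CompactSaxWriter and roundTrip are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that turns SAX events back into compact XML text. */
final class CompactSaxWriter extends DefaultHandler {
    private final StringWriter out = new StringWriter();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        out.write("<" + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            out.write(" " + atts.getQName(i) + "=\"" + escape(atts.getValue(i)) + "\"");
        }
        out.write(">");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.write("</" + qName + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.write(escape(new String(ch, start, length)));
    }

    /** Escapes the characters that would otherwise break the generated markup. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    String result() {
        return out.toString();
    }

    /** Feeds the handler from a JAXP SAX parser; a PSOM object would be the event source instead. */
    static String roundTrip(String xml) {
        try {
            CompactSaxWriter writer = new CompactSaxWriter();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), writer);
            return writer.result();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not process XML", e);
        }
    }
}
```

Because the formatting lives entirely in the handler, a different handler (indented, canonical, or otherwise) can be substituted without touching the event source.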

Data Validation

A validating parser paired with an XML DTD enforces the structure of XML-encoded information. What a validating parser cannot do is validate the data itself. Consider, for example, the InvoiceTotal element in the Invoice DTD (see the Appendix for more information). This element is defined as follows:

 
<!ELEMENT InvoiceTotal (#PCDATA)>

This element is defined to contain character data, but there are no constraints on what this character data may be. As far as the DTD and the validating parser are concerned, the InvoiceTotal element could contain "23.50", "George", or "abc". This situation is by no means unworkable, but it does mean that a developer would have to write code to validate all the content of every XML document processed. This clearly would require a substantial amount of work and could prove to be error-prone.
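
This behavior is easy to observe. The sketch below assumes the JAXP validating parser bundled with current JDKs, and the class name DtdOnlyValidation is invented for the example. It validates three documents against the one-line DTD shown above, supplied as an internal subset.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class DtdOnlyValidation {
    // The InvoiceTotal declaration from the text, wrapped in an internal DTD subset.
    private static final String DTD =
            "<!DOCTYPE InvoiceTotal [ <!ELEMENT InvoiceTotal (#PCDATA)> ]>";

    /** Returns true when the document body is valid against the one-element DTD above. */
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (Exception e) {
            return false; // a validity or well-formedness error was reported
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<InvoiceTotal>23.50</InvoiceTotal>"));        // true
        System.out.println(isValid("<InvoiceTotal>George</InvoiceTotal>"));       // true
        System.out.println(isValid("<InvoiceTotal><Oops/></InvoiceTotal>"));      // false
    }
}
```

The structural mistake is caught, but "George" passes as an invoice total because the DTD says nothing about what the character data may be.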

Since the W3C's release of the XML 1.0 Recommendation, there have been several efforts to address this issue in a well-defined, standardized way. Most of these efforts have involved superseding the XML DTD with some type of schema language that would not only express the structural information that DTDs already express, but would also allow the user to express the constraints on the character data that the XML document contains.

Innovision is watching these developments closely and will take appropriate action on them as necessary. This is, however, a very important issue to address, and since it is a part of the notion of an XML Protocol, Innovision has developed an interim solution called DFX. DFX allows the developer to define the data constraints via a DFX Schema Specification. A DFX Schema Specification works hand-in-hand with an XML DTD to provide data validation on the element structure that the DTD defines. It does this by allowing each PCDATA element in the DTD to be associated with a metatype, which specifies the constraints on the data that the PCDATA element contains. Each metatype is also defined in the DFX Schema Specification, where it is given a name and a description. Each metatype serves not only to define the type of the data, but also to constrain the data values in some way. The simplest ways to constrain data values are as follows:

  • Data Length. The minimum and/or maximum lengths of the character data may be specified. For example, a part number description may be required to consist of at least ten characters, but no more than thirty.
  • Data Value. For numeric values, the minimum and/or maximum values of the number may be specified. For example, the quantity field of a line item may be required to be greater than zero.
  • Enumeration. XML DTDs allow for the enumeration of attribute values, but not for the enumeration of PCDATA values. DFX Schema specifications allow for both the enumeration of attribute values and the enumeration of PCDATA values. For example, the color of an inventory item may be restricted to "black", "white", or "red".

These simplest ways are specifiable within DFX itself, but clearly do not cover the entire universe of data value constraint that is possible. For example, it is not possible to constrain a data value to be a valid date with these simple techniques. For these more complex cases, DFX allows a Java class to be associated with each metatype. This Java class implements an interface, and an object of this class will be invoked to actually perform the data validation.
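
Neither the DFX syntax nor the Java interface that a metatype class implements is reproduced in this paper, so the following is only a sketch of the kinds of checks described above. The interface name MetatypeSketch and all of its factory methods are hypothetical. The first three factories mirror the length, value, and enumeration constraints; the last stands in for a custom class that handles a case the simple forms cannot express (here, an ISO-8601 calendar date).

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical validator interface; the actual DFX interface is not shown in this paper. */
interface MetatypeSketch {
    boolean accepts(String value);

    /** Data length: between min and max characters, inclusive. */
    static MetatypeSketch length(int min, int max) {
        return v -> v.length() >= min && v.length() <= max;
    }

    /** Data value: numeric and strictly greater than the bound. */
    static MetatypeSketch greaterThan(double bound) {
        return v -> {
            try {
                return Double.parseDouble(v) > bound;
            } catch (NumberFormatException e) {
                return false; // not numeric at all
            }
        };
    }

    /** Enumeration: one of a fixed set of values. */
    static MetatypeSketch oneOf(String... allowed) {
        List<String> values = Arrays.asList(allowed);
        return values::contains;
    }

    /** A constraint too complex for the simple forms: a real calendar date such as 1999-02-28. */
    static MetatypeSketch isoDate() {
        return v -> {
            try {
                LocalDate.parse(v);
                return true;
            } catch (DateTimeParseException e) {
                return false;
            }
        };
    }
}
```

With these definitions, MetatypeSketch.greaterThan(0).accepts("2") holds for a line item quantity, MetatypeSketch.oneOf("black", "white", "red").accepts("blue") does not, and MetatypeSketch.isoDate() rejects "1999-02-30".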

A DFX Schema Specification is fed into the PSOM Generator along with its associated DTD. The resulting PSOM classes have the built-in ability to validate both the required elements of the objects created from those classes and the data constraints of those objects. For example, to validate an invoice object, the developer simply codes the line:

  invoice.validate();

In the case of an invalid object, this method will throw an exception that describes the error. Note that validation may be done at any level of granularity, and that it need only be used once, and not each time the object is modified. This is another example of how XML Protocols manage the details of XML for the developer. With DFX Schema Specifications, the developer is relieved of the responsibility to validate the data values, allowing the developer to focus on the problem he is actually trying to solve.

Support for Out-of-Band Data

The previous two sections were concerned with the data that is to be exchanged with a particular XML Protocol. XML Protocols clearly support the robust, secure exchange of this data, and while it makes up the bulk of the data that XML Protocols support, it is not the only data supported. Some applications also need to exchange out-of-band data, that is, data that is not properly a part of the problem domain data set, but nonetheless needs to be communicated. For example, it may in some cases be desirable for a client to pass along some sort of "application type" to a server. This would allow the server to take different courses of action for different applications that communicate with the same protocol. Although it might be tempting to include this application type in the XML message elements, it is not a part of the problem domain data set. Rather, it is an artifact of choices made during system design and/or implementation. For this reason, it is more properly considered to be "out-of-band" data, and should be communicated outside of the XML-encoded data. XML Protocols make a provision for this scenario, much as XML itself does with its processing instructions.

Interchange Models

The term "Protocol" in XML Protocol implies that there is more involved than just data. XML Protocols add value not only at the XML data level, but also add value with regard to how the XML data is exchanged. A part of this is accomplished via interchange models that define the roles and responsibilities of the network processes involved in the data exchange and define the way that these network processes work together to exchange data smoothly. A particular interchange model governs and constrains the specific set of procedure rules that regulate the exchange of messages between processes. The Nepal XP Language permits the specification of these rules.

There are currently three defined interchange models, Request/Response, Client/Server, and Harvester, which are presented in the balance of this section. Note that nothing in the XML Protocol architecture prevents other interchange models from being added in the future.

The Request/Response Interchange Model

The purpose of Request/Response is to allow for a particular process to be successfully queried by one or more other processes in an orderly fashion. There are two classes of processes in this model, the requester and the responder, which are defined as follows:

  • The requester is the process that makes the request of the responder and expects a response from the responder. There may be one or more requesters per responder, and a particular requester may also make requests of multiple responders.
  • The responder is the process that receives requests from requesters, services them, and constructs a response that it sends back to the requester that sent the original request.

The Request/Response model is useful when one or more distributed users wish to retrieve information from a centrally-located information source. Serving up this information as an XML Protocol would mean that any application that understood that protocol could easily retrieve information from the centralized source. Such applications would send a single, XML-encoded request to the centrally-located responder, and the responder would send back a single response to each request.

The Client/Server Interchange Model

The Client/Server interchange model is similar to the Request/Response model, except that the Client/Server model allows for extended exchanges between processes. Thus, rather than having a single request/response pair, there may be multiple requests and responses in a particular client/server session. Moreover, these requests and responses may have ordering constraints. For example, an extended exchange may start with a request of type reqx, which requires a response of rspx from the server. The next request must be of type reqy or reqz, but not any other type, and the server must respond with rspy or rspz, but nothing else. Finally, this exchange must end with reqw, which requires a response of rspw. This is where procedure rules are the most useful. They allow for the formal specification of message ordering to ensure that the messages are sent in the correct order during the Client/Server session.
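
The ordering constraints in this example can be sketched as a small step-by-step checker. This is not Nepal XP Language syntax, which is not shown in this paper; SessionRules is a hypothetical class that hard-codes the reqx/rspx, reqy or reqz, reqw/rspw sequence. It also assumes one reading of the example, namely that either rspy or rspz is acceptable after either reqy or reqz.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical checker for the message ordering described above. */
final class SessionRules {
    // Allowed message types at each step of the session, in order.
    private static final List<Set<String>> STEPS = Arrays.asList(
            set("reqx"), set("rspx"),
            set("reqy", "reqz"), set("rspy", "rspz"),
            set("reqw"), set("rspw"));

    private int step = 0;

    private static Set<String> set(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    /** Accepts the next message if the rules allow it at this point; otherwise throws. */
    void accept(String messageType) {
        if (step >= STEPS.size()) {
            throw new IllegalStateException("session already complete");
        }
        if (!STEPS.get(step).contains(messageType)) {
            throw new IllegalStateException("unexpected " + messageType + " at step " + step);
        }
        step++;
    }

    boolean isComplete() {
        return step == STEPS.size();
    }

    /** Convenience: true if the whole sequence is a legal, complete session. */
    static boolean isLegalSession(String... messages) {
        SessionRules rules = new SessionRules();
        try {
            for (String m : messages) {
                rules.accept(m);
            }
        } catch (IllegalStateException e) {
            return false;
        }
        return rules.isComplete();
    }
}
```

A session of reqx, rspx, reqz, rspz, reqw, rspw is accepted, while one that skips the middle exchange or stops early is rejected.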

The Harvester Interchange Model

The purpose of Harvester is to allow for information to be collected from some common pool in a robust, reliable manner. There is one type of process in this model, the harvester. The pool is not a network process, but represents a collection of resources that the harvester may collect.

Harvester is applicable in cases where there is some common pool of information, part of which needs to be collected for one reason or another. One example is a network in which nodes on that network occasionally broadcast information that may or may not be of interest to other nodes on the network. In this case, the nodes that are interested in the information are not requesting it from any particular source, because they do not know which node might "own" the information in which they are interested. In this case, it is better for any interested nodes to simply "listen" to the network for any information they might find interesting.

There are other situations in which this is useful, but the main thing that determines the use of harvester is that the information to be collected is unsolicited.

Protocol Management

Regardless of how well designed and specified any protocol might be, there will always be the need for additions and changes, and XML Protocols are no different. Introducing additions and changes to a protocol can be problematic if not all processes using the protocol can be upgraded at the same time. This is why XML Protocols have the notion of protocol management. Protocol management ensures that protocols can be upgraded smoothly. Protocol management lends itself to a highly constrained environment, where upgrades to protocols must be done in a very controlled fashion. At the same time, protocol management also works well in a more dynamic, fast-paced environment, where changes and upgrades may happen frequently and without much control. One feature of protocol management is the notion of protocol versioning. This involves assigning a unique number to each version of a particular protocol. This allows each process to specify the version of a particular XML Protocol that it is using, while at the same time being able to determine what version of the protocol is being used for messages that it is receiving from another process. This allows each process to accept only messages for versions of the protocol that each supports, and also to issue clear diagnostics when there is a protocol version mismatch. This flexibility and adaptability is yet another benefit of XML Protocols.
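
A version check of the kind described above might look like the following sketch. How an XML Protocol message actually carries its version number is not specified in this paper, so the example assumes the version has already been extracted as an integer; VersionGate and its check method are hypothetical names.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical version gate for incoming messages. */
final class VersionGate {
    private final Set<Integer> supported;

    VersionGate(Integer... versions) {
        supported = new TreeSet<>(Arrays.asList(versions));
    }

    /** Returns null if the version is accepted, otherwise a diagnostic for the sender or the log. */
    String check(int messageVersion) {
        if (supported.contains(messageVersion)) {
            return null;
        }
        return "protocol version mismatch: message uses version " + messageVersion
                + ", this process supports " + supported;
    }

    public static void main(String[] args) {
        VersionGate gate = new VersionGate(1, 2);
        System.out.println(gate.check(2)); // accepted, prints null
        System.out.println(gate.check(3)); // prints the mismatch diagnostic
    }
}
```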

The Nepal XP Framework: Making XML Protocols a Reality

The XML Protocol concept is a powerful one, but a concept alone does not solve real-world problems. That is why Innovision has developed a new technology called the Nepal XP Framework (for XML Protocol Framework). The Nepal XP Framework makes the XML Protocol concept a reality. It allows developers to formally specify an XML Protocol and subsequently generate the tools, documentation, and models that can be used to implement applications and systems utilizing that protocol. In other words, the Nepal XP Framework provides an environment that allows developers to quickly create and deploy high-performance XML Protocols. There are four main components to the Nepal XP Framework:

  • Nepal XP Language. Allows developers to formally specify an XML Protocol.
  • Nepal XP Server. Provides the building blocks for creating XML Protocol servers that utilize the Request/Response and/or Client/Server interchange model.
  • Nepal XP Client. Provides the building blocks for creating XML Protocol clients that utilize the Request/Response and/or Client/Server interchange model and/or XML Protocol harvesters that utilize the Harvester interchange model.
  • Nepal XP Tools. Allow developers to generate the supporting tools and documentation for a particular XML Protocol.

These four components are further described in the following subsections.

Nepal XP Language

The Nepal XP Language component allows developers to formally specify an XML Protocol. What does it mean to formally specify an XML Protocol? As established by Holzmann [1991], a protocol specification has the following five distinct parts:

  1. The service to be provided by the protocol
  2. The assumptions about the environment in which the protocol is executed
  3. The vocabulary of messages used to implement the protocol
  4. The encoding (format) of each message in the vocabulary
  5. The procedure rules guarding the consistency of message exchanges

The Nepal XP Language addresses each of these elements. First, the developer is allowed to document the service to be provided by the protocol and the assumptions about the environment in which the protocol is executed. Second, the developer is allowed to formally define the vocabulary of messages used to implement the protocol and the encoding of each message. Since all messages are encoded as XML elements, this means that the developer is able to define these elements using an XML DTD. Furthermore, the developer is also able to define constraints on the data contained within each message element using DFX. Note, however, that the Nepal XP Framework will not limit developers to the use of XML DTDs and DFX Schema Specifications in the presence of other industry-accepted alternatives. For example, the W3C's XML Schema Working Group is currently addressing means for defining the structure, content, and semantics of XML documents. This may result in a recommendation that complements or supersedes the combination of DTDs and DFX Schema Specifications. The Nepal XP Language and the Nepal XP Framework as a whole are designed to gracefully allow for such advances in technology. Finally, the Nepal XP Language allows the developer to formally define the procedure rules that govern how the messages are exchanged, based on the chosen interchange model.

Nepal XP Server

The Nepal XP Server component provides the building blocks for creating XML Protocol servers. An XML Protocol server can be viewed as a network resource that "serves up" XML-encoded information via some defined XML Protocol. Note that Nepal XP Server is not a server itself, but is a means by which servers may be developed. When used in conjunction with Nepal XP Language and Nepal XP Tools, it allows developers to quickly create and deploy a high-performance XML Protocol server in a manner that is most suitable for the given application. Such servers may be anything from a simple servlet running within the context of an HTTP server to multiple stand-alone processes distributed across multiple hosts. Such scalability gives the developer the control to build the server that is required by the application, not the server dictated by the Nepal XP Framework.

XML Protocol servers written with Nepal XP Server can utilize PSOM objects to access protocol messages. This relieves the developer from having to deal with the details of XML, and allows the developer to focus solely on building the required application. Also, since clients written with Nepal XP Client also use PSOM objects, there is good, clean uniformity for those developers who are creating both a server and one or more clients.

Once an XML Protocol server has been deployed, it becomes a communication port through which any other application that uses the same XML Protocol may exchange information. The only requirement is that such applications adhere to the same XML Protocol that the server supports. These applications may be built using Nepal XP Client, may be other XML Protocol servers that are acting as XML Protocol clients, or may even be mass-market desktop applications that adhere to the given XML Protocol but were not built using the Nepal XP Framework. This flexibility allows for a nearly endless array of uses for an XML Protocol server built using the Nepal XP Server building blocks.

Nepal XP Client

The Nepal XP Client component provides the building blocks for creating XML Protocol clients. An XML Protocol client is an application that requests and/or gathers XML-encoded information via some defined XML Protocol. These clients may be stand-alone applications, they may be browser apps, or they may reside in embedded systems. Nepal XP Client browser apps may either be Java applets or may exist as embedded Java objects with JavaScript. Note that this means that Nepal XP Client makes PSOM objects available in scripts. Nepal XP Client also provides support for applications that outstrip the limitations of browsers and must run as stand-alone processes. Furthermore, Nepal XP Client's small footprint makes it ideal not only for browser applications and stand-alone applications, but also for embedded systems. This scalability permits a wide variety of clients for any particular XML Protocol.

As was alluded to in the previous section on Nepal XP Server, XML Protocol clients written with Nepal XP Client can utilize PSOM objects to access protocol messages. This again relieves the developer from having to deal with the details of XML and provides uniformity between client and server for developers who are creating both.

Once an XML Protocol client has been developed, it can communicate with any XML Protocol server, regardless of whether that server was developed using the Nepal XP Framework technology or not. Thus, the Nepal XP Framework may be leveraged to quickly develop clients without losing compatibility with servers that were developed without the benefits of the Nepal XP Framework technology.

Nepal XP Tools

The Nepal XP Tools component provides a comprehensive set of utilities that allow developers to generate the supporting tools and documentation for a particular XML Protocol. These tools fall into one of two categories:

  • PSOM class and documentation generation
  • Standardized PSOM application modules

These two categories are explained further in the following subsections.

PSOM Class and Documentation Generation

The PSOM generator is the tool that lets developers manipulate the concrete objects that XML-encoded information represents, rather than work at the XML level. To do this, the PSOM generator takes a DTD and a DFX Schema Specification as input and produces a set of Java classes and their associated Javadoc documentation as output. This is illustrated as follows:

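[Figure: the PSOM generator takes a DTD and a DFX Schema Specification as input and produces Java classes and Javadoc documentation as output.]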

For example, if the PSOM generator were run on the Invoice DTD and its associated DFX Schema Specification, it would produce a set of Java classes and corresponding Javadocs. The generated classes and Javadocs can be used on both the server side and the client side. For developers who are working on both sides, this provides consistency across the board.
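To make the idea of working with objects rather than raw XML more concrete, the following is a purely hypothetical sketch of what classes for the Invoice DTD could look like. The actual output of the PSOM generator is not shown in this paper, so every class, field, and method name here (InvoiceSketch, Invoice, LineItem, subtotal, and so on) is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: this is NOT actual PSOM generator output.
// All names below are assumptions chosen to mirror the Invoice DTD.
public class InvoiceSketch {

    /** Stands in for a class representing one LineItem element. */
    static final class LineItem {
        final int qty;      // Qty element
        final String desc;  // Desc element
        final int each;     // Each element

        LineItem(int qty, String desc, int each) {
            this.qty = qty;
            this.desc = desc;
            this.each = each;
        }

        /** Value that would appear in the LineTotal element. */
        int lineTotal() {
            return qty * each;
        }
    }

    /** Stands in for a class representing the Invoice element. */
    static final class Invoice {
        final String to;    // To element
        final String from;  // From element
        final List<LineItem> lineItems = new ArrayList<>();

        Invoice(String to, String from) {
            this.to = to;
            this.from = from;
        }

        /** Value that would appear in the Subtotal element. */
        int subtotal() {
            int sum = 0;
            for (LineItem item : lineItems) {
                sum += item.lineTotal();
            }
            return sum;
        }
    }

    /** Builds the sample invoice from the appendix as objects. */
    static Invoice sampleInvoice() {
        Invoice invoice = new Invoice("John Doe Widget Consumer", "Acme Widgets");
        invoice.lineItems.add(new LineItem(2, "Small Widget", 20));
        invoice.lineItems.add(new LineItem(1, "Big Widget", 50));
        return invoice;
    }

    public static void main(String[] args) {
        Invoice invoice = sampleInvoice();
        System.out.println("Subtotal: " + invoice.subtotal());
    }
}
```

The point of the sketch is that client and server code would deal with typed objects such as an invoice and its line items, while the parsing of tags stays hidden behind the object model.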

These generated classes detect and select XML parsers that are available in the run-time environment. The only requirement is that the parser implements the SAX interface.
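How the generated classes locate a parser is not described here. As a rough sketch of run-time parser discovery in general, the standard JAXP API (javax.xml.parsers, an assumption on our part and not necessarily what the generated classes use) can look up whichever SAX implementation is available in the environment. The class name SaxParserLookup and the method findSaxParser are hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Sketch only: uses JAXP to discover a SAX parser at run time.
// This is an assumed stand-in, not the Nepal XP lookup mechanism.
public class SaxParserLookup {

    /** Returns a SAX XMLReader from whatever implementation the runtime provides. */
    static XMLReader findSaxParser() {
        try {
            // The factory resolves the implementation from the run-time environment.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            return factory.newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("No usable SAX parser was found", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = findSaxParser();
        // Print which implementation class was selected.
        System.out.println("Selected SAX parser: " + reader.getClass().getName());
    }
}
```

Because the calling code depends only on the SAX interfaces, the concrete parser can be swapped without changes to the application.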

The PSOM generator has options that produce different configurations of the same protocol. These configurations can be optimized for performance, code size, or features, which lets the developer create a protocol configuration suited to his particular needs. For example, server-side development may require greater performance from the classes, even if this means the class size is larger, while client-side development may require smaller class sizes, even if this means a slight performance hit.

In addition to producing Javadoc documentation, the generator also produces output that describes the XML Protocol object model. This output can be used by diagram generators to visually depict the object model of the XML Protocol. This allows developers to visualize the protocol and aids them in working at the more abstract level of the protocol, rather than the raw XML level.

Standardized PSOM Application Modules

XML stands for eXtensible Markup Language: it allows the creation of an unlimited number of document types, each of which addresses a particular problem domain. This makes XML an enabler for the continual creation of new tools and solutions. Since XML Protocols are based on XML, they share this characteristic, i.e. an unlimited number of protocols can be defined. The Nepal XP Framework technology makes these protocols a reality: for each protocol, the PSOM generator produces a PSOM that is a working solution and is itself a tool. Each generated PSOM is therefore a part of the Nepal XP Tools component. This means that the Nepal XP Tools component is not a static set of tools, but an ever-growing array of solutions, bounded only by the imagination of the developers who take advantage of the Nepal XP Framework technology.

Final Remarks

Together, the XML Protocol concept and the Nepal XP Framework deliver a disciplined environment with a firm conceptual foundation that allows developers to quickly create XML-based systems. The range of applications that this environment addresses is virtually unbounded because XML Protocols leverage XML's extensibility to the fullest extent. Furthermore, XML Protocols provide standardized, open solutions that can be used not only within a specific organization, but also across organizations. In fact, XML Protocols are well positioned to provide industry-wide solutions. Each new XML Protocol can become a shared resource used across organizations, and possibly even across industries. This may lead to the establishment of XML Protocol repositories from which any organization can obtain any XML Protocol that it needs in order to share XML-encoded information with other organizations. These openly available XML Protocols could then become the standard way that organizations share XML-encoded information. These organizations will find that the only limit to the applications of XML Protocols is the imagination of the people who design and use them.

Contact Information

For more information, contact Innovision Corporation at http://www.innovision.com/ or by e-mail at info@innovision.com

Appendix: Brief Introduction to XML

XML provides a method of organizing data that is not only standard but has also been proven through years of use in the SGML community. XML's data format is based on the notion of tags, which identify the various pieces of information that are being organized. The following XML fragment shows how a simple invoice may be "marked up" with tags.

 
<Invoice>
   <To>John Doe Widget Consumer</To>
   <From>Acme Widgets</From>
   <LineItems>
      <LineItem>
         <Qty>2</Qty>
         <Desc>Small Widget</Desc>
         <Each>20</Each>
         <LineTotal>40</LineTotal>
      </LineItem>
      <LineItem>
         <Qty>1</Qty>
         <Desc>Big Widget</Desc>
         <Each>50</Each>
         <LineTotal>50</LineTotal>
      </LineItem>
   </LineItems>
   <Subtotal>90</Subtotal>
   <Tax>5</Tax>
   <InvoiceTotal>95</InvoiceTotal>
</Invoice>

In XML, anything appearing between angle brackets (<>) is a tag. For instance, <Invoice> is a tag. There are two types of tags, start-tags and end-tags. End-tags are differentiated from start-tags via the slash character just before the name in the tag. Therefore, <To> is a start-tag, and </To> is an end-tag. Everything between a matching pair of start- and end-tags (including the tags) is called an element. For example, "<From>Acme Widgets</From>" is an element, but so is:

 
<LineItem>
   <Qty>2</Qty>
   <Desc>Small Widget</Desc>
   <Each>20</Each>
   <LineTotal>40</LineTotal>
</LineItem>

In fact, the whole invoice, from the start-tag <Invoice> to the end-tag </Invoice>, is an element. XML documents, therefore, are hierarchical collections of tagged elements.
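The hierarchical nature of elements can be seen by walking a document with a SAX parser and indenting each element name by its nesting depth. The following is a minimal sketch that uses the standard JAXP and SAX classes shipped with Java; the class name ElementOutline and the method outline are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch: lists element names indented by nesting depth.
public class ElementOutline {

    /** Returns one line per start-tag, indented two spaces per nesting level. */
    static List<String> outline(String xml) {
        final List<String> lines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                StringBuilder line = new StringBuilder();
                for (int i = 0; i < depth; i++) {
                    line.append("  ");
                }
                lines.add(line.append(qName).toString());
                depth++; // children of this element are one level deeper
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--; // the matching end-tag closes the element
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String xml = "<Invoice><To>John Doe</To><LineItems><LineItem>"
                + "<Qty>2</Qty></LineItem></LineItems></Invoice>";
        for (String line : outline(xml)) {
            System.out.println(line);
        }
    }
}
```

Each start-tag opens a new level and each end-tag closes it, which is why a simple depth counter is enough to recover the tree shape.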

Much like HTML, the tags are not part of the document's content. They simply serve to delimit the data and identify its meaning. Unlike HTML, XML does not have a fixed set of elements. A user of XML may define any set of elements he likes, and may assign meaning to them in a way that makes sense for how he will be using the elements. For example, any application domain (e.g. finance, mathematics, e-commerce) can define its own set of standard elements that best serve the purposes of that domain. Once the set of elements has been standardized, all players in a particular domain can build application systems based on this set of elements. If all players are using the same set of elements, and each element has a well-defined meaning, data can be exchanged smoothly.

A set of standard element definitions is fine, but as with any standard, it is only useful to the degree to which it is enforced. What prevents one organization from introducing additional proprietary elements which, though useful for that organization, may "break" the application systems of other organizations? The answer to this question lies in the DTD, which stands for Document Type Definition. Each DTD strictly defines how a document of a particular "type" may be constructed, i.e. what elements a document of that type may contain, and how these elements may be organized. To enforce the DTD, the system processing the XML documents uses a validating parser. The validating parser picks apart an XML document (such as the invoice above) and checks it against the element definitions in the DTD. If the document does not match the defined document type structure, then it is rejected and will not enter the system to be processed.

To illustrate what a DTD actually looks like, consider an invoice to be a "type" of document. In this case, the DTD for the sample invoice shown above would look something like:

 
<!DOCTYPE Invoice [
   <!ELEMENT Invoice (To, From, LineItems, Subtotal, Tax?, InvoiceTotal)>
   <!ELEMENT To (#PCDATA)>
   <!ELEMENT From (#PCDATA)>
   <!ELEMENT LineItems (LineItem+)>
   <!ELEMENT LineItem (Qty, Desc, Each, LineTotal)>
   <!ELEMENT Qty (#PCDATA)>
   <!ELEMENT Desc (#PCDATA)>
   <!ELEMENT Each (#PCDATA)>
   <!ELEMENT LineTotal (#PCDATA)>
   <!ELEMENT Subtotal (#PCDATA)>
   <!ELEMENT Tax (#PCDATA)>
   <!ELEMENT InvoiceTotal (#PCDATA)>
]>

Each <!ELEMENT ...> line above is an element definition, and each element definition defines what is allowed to be "inside" of that element. For example, each LineItem element must contain a single Qty element, followed by a single Desc element, followed by a single Each element, and, finally, a single LineTotal element. Many of the elements above are defined to contain #PCDATA, which simply means that they contain character data, rather than other elements. Note that some element names are "adorned" with a special character, e.g. the Tax element in the Invoice element definition is followed by a question mark. A question mark indicates that the element is optional, or not required. The Tax element is optional because some things, such as services, are not taxed. Another example is the plus sign following the LineItem element in the LineItems element definition. This indicates that there may be one or more LineItem elements. Elements that are not adorned are required and may occur only once.
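The effect of these rules can be checked with any validating parser. The sketch below uses the validating SAX parser available through JAXP in standard Java (an assumption for illustration, not part of the Nepal XP Framework) and embeds the Invoice DTD as an internal subset. The class InvoiceDtdCheck and its methods isValid and invoice are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Sketch: validates invoice documents against the Invoice DTD shown above.
public class InvoiceDtdCheck {

    static final String DTD =
          "<!DOCTYPE Invoice [\n"
        + "<!ELEMENT Invoice (To, From, LineItems, Subtotal, Tax?, InvoiceTotal)>\n"
        + "<!ELEMENT To (#PCDATA)>\n<!ELEMENT From (#PCDATA)>\n"
        + "<!ELEMENT LineItems (LineItem+)>\n"
        + "<!ELEMENT LineItem (Qty, Desc, Each, LineTotal)>\n"
        + "<!ELEMENT Qty (#PCDATA)>\n<!ELEMENT Desc (#PCDATA)>\n"
        + "<!ELEMENT Each (#PCDATA)>\n<!ELEMENT LineTotal (#PCDATA)>\n"
        + "<!ELEMENT Subtotal (#PCDATA)>\n<!ELEMENT Tax (#PCDATA)>\n"
        + "<!ELEMENT InvoiceTotal (#PCDATA)>\n]>\n";

    /** Returns true if the invoice matches the DTD, false if it is rejected. */
    static boolean isValid(String invoiceXml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat every validity error as a rejection
                }
            };
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(DTD + invoiceXml)), handler);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("Parser could not be used", e);
        }
    }

    /** Builds a small test invoice, optionally omitting Tax or the LineItem. */
    static String invoice(boolean withTax, boolean withLineItem) {
        String item = "<LineItem><Qty>1</Qty><Desc>Big Widget</Desc>"
                + "<Each>50</Each><LineTotal>50</LineTotal></LineItem>";
        return "<Invoice><To>John Doe</To><From>Acme Widgets</From>"
                + "<LineItems>" + (withLineItem ? item : "") + "</LineItems>"
                + "<Subtotal>50</Subtotal>"
                + (withTax ? "<Tax>5</Tax>" : "")
                + "<InvoiceTotal>55</InvoiceTotal></Invoice>";
    }

    public static void main(String[] args) {
        System.out.println("With Tax:     " + isValid(invoice(true, true)));
        System.out.println("Without Tax:  " + isValid(invoice(false, true)));
        System.out.println("No LineItem:  " + isValid(invoice(true, false)));
    }
}
```

Omitting Tax is accepted because of the question mark, while an empty LineItems element is rejected because the plus sign requires at least one LineItem.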

It is important to note that outside the world of electronic documents, the notion of a document "type" gets stretched quite a bit. One DTD might actually define several types of "documents", such as invoices, but it may also define the structure of other things that might not be considered to be documents at all. So, outside the world of electronic documents, a single DTD (or perhaps a handful of DTDs) will generally contain all of the element definitions for a particular domain. For example, there is a single DTD for OFX 1.5.

Bibliography

Holzmann, Gerard J. [1991], Design and Validation of Computer Protocols, Prentice Hall, Englewood Cliffs, New Jersey, 512 pgs. ISBN 0-13-539925-4 hardcover (USA), ISBN 0-13-539834-7 paperback (international edition). Also available in electronic form at http://cm.bell-labs.com/cm/cs/what/spin/Doc/Book91.html.

Gouda, Mohamed G. [1998], Elements of Network Protocol Design, John Wiley and Sons, Inc. New York, NY, 506 pgs. ISBN 0-471-19744-0.