XML and UML are both important technologies and their optimal integration requires some analysis. We discuss our point of view on integration of XML schemas and UML class diagrams. Our view is that either
This note is organized as follows. First we compare XML schemas and UML class diagrams and then we sketch translation algorithms in both directions.
| | defines object set | defines document set | plays grammar role |
| XML schema | partially*** | yes | yes |
| UML class diagram | yes | no | no |
Because both XML schemas and UML class diagrams define sets of objects, a natural question arises: how should an organization manage its XML schema repository and its UML class diagram repository? The comparison above shows that there is a danger of significant overlap between the XML schemas and the UML class diagrams needed for the same application.
Because XML schemas are more expressive than UML class diagrams, it would be natural to start the modeling with XML schemas. There are, however, currently some deficiencies in the XML schema notation that keep it from being an ideal object modeling notation. Workarounds are available, and we expect those deficiencies to disappear over time (written in May 2000). Note that the XML schema notation should also be used to describe "functional" objects, such as visitor objects, that are not directly related to business concepts. The XML schema should be written with the intent that it will be used to implement the functionality of the application and not just to describe the structure of the business data.
XML schemas have the essential capabilities to model class structures. The essential capabilities are:
<!ELEMENT Compound (Op , Exp , Exp )>
<!ELEMENT Simple (Number )>
<!ELEMENT Exp (Simple | Compound )>
<!ELEMENT Op (Add | Mul )>
<!ELEMENT Number (#PCDATA )>
<!ELEMENT Add (#PCDATA )>
<!ELEMENT Mul (#PCDATA )>
The following is an example of a document that describes the prefix expression * 3 5 (3*5 in ordinary notation).
<Compound><!-- (Op , Exp , Exp )-->
  <Op><!-- (Add | Mul )-->
    <Mul>*</Mul>
  </Op>
  <Exp><!-- (Simple | Compound )-->
    <Simple><!-- (Number )-->
      <Number>3</Number>
    </Simple>
  </Exp>
  <Exp><!-- (Simple | Compound )-->
    <Simple><!-- (Number )-->
      <Number>5</Number>
    </Simple>
  </Exp>
</Compound>
The diagram shown above is very close to a UML class diagram for representing prefix expressions. The nodes are classes and the edges show relationships between classes. The rectilinear connections show directed associations from left to right. The connections from Op to Add and Mul and from Exp to Simple and Compound are inheritance edges. There is one detail missing (besides the missing edge from Compound to Exp): the association ends are missing in the schema. We would like to say that a Compound expression has two subexpressions called argument1 and argument2, both of type Exp. Unfortunately, we cannot express this in the schema, while it can easily be expressed in a UML class diagram. The workaround is to introduce two extra elements, called Argument1 and Argument2, and to define each of them to contain an Exp. This introduces two extra nodes and two extra edges in the schema, which clutters it. We call this problem the PartNaming problem of XML schemas.
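The PartNaming workaround can be sketched in the same DTD notation as the schema above. This is only an illustration of the idea described in the text: the element names Argument1 and Argument2 come from the text, and the remaining declarations are assumed to stay as they were.

```dtd
<!-- Compound now refers to its two subexpressions through named wrapper elements -->
<!ELEMENT Compound (Op , Argument1 , Argument2 )>
<!-- Each wrapper only carries the part name and contains exactly one Exp -->
<!ELEMENT Argument1 (Exp )>
<!ELEMENT Argument2 (Exp )>
<!-- Unchanged declarations from the original schema -->
<!ELEMENT Simple (Number )>
<!ELEMENT Exp (Simple | Compound )>
<!ELEMENT Op (Add | Mul )>
<!ELEMENT Number (#PCDATA )>
<!ELEMENT Add (#PCDATA )>
<!ELEMENT Mul (#PCDATA )>
```

Under this modified schema, the prefix expression * 3 5 would be written with one extra level of nesting per argument:

```xml
<Compound>
  <Op>
    <Mul>*</Mul>
  </Op>
  <Argument1>
    <Exp><Simple><Number>3</Number></Simple></Exp>
  </Argument1>
  <Argument2>
    <Exp><Simple><Number>5</Number></Simple></Exp>
  </Argument2>
</Compound>
```

The extra wrapper elements are the price paid for the part names. In a UML class diagram the same information would simply be the two association end names on the edges from Compound to Exp.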
Besides the PartNaming problem that creates systematic differences between XML schemas and UML class diagrams there is the ObjectLinking problem that also creates differences. Consider the following XML schema that describes a network of partners using a graph structure with labels on edges (LinkInfo) and nodes (PartnerInfo).
The above schema defines documents that define Partner structures referring to partners using the PartnerId in the PartnerLinkInfo objects. In a UML class diagram, we would like to represent the Partner structures as a linked structure which means that PartnerId in PartnerLinkInfo should be replaced by Partner.
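To make the ObjectLinking problem concrete, a hypothetical DTD in the style used earlier might look as follows. Only the names Partner, PartnerInfo, LinkInfo, PartnerLinkInfo and PartnerId come from the text. The root element PartnerNetwork and all content models are assumptions made for illustration and should not be read as the schema the discussion is based on.

```dtd
<!-- Hypothetical root element (assumption): a network is a list of partners -->
<!ELEMENT PartnerNetwork (Partner* )>
<!-- A node of the graph: its own id, its label, and its outgoing edges (assumed layout) -->
<!ELEMENT Partner (PartnerId , PartnerInfo , PartnerLinkInfo* )>
<!-- An edge of the graph: refers to the target partner by id only, plus the edge label -->
<!ELEMENT PartnerLinkInfo (PartnerId , LinkInfo )>
<!ELEMENT PartnerId (#PCDATA )>
<!ELEMENT PartnerInfo (#PCDATA )>
<!ELEMENT LinkInfo (#PCDATA )>
```

In such a schema the edge holds only the textual PartnerId of its target, so a document is a tree in which links are encoded as values. The corresponding UML class diagram would instead give PartnerLinkInfo a direct association to Partner, which is the replacement of PartnerId by Partner described above.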
XML schemas have the essential capabilities to model class structures. They can be translated to UML class diagrams by using the following systematic process:
It is useful to translate UML class diagrams to XML schemas, provided the UML class diagrams have been written with the intent that they will play a grammar role. Diagrams written with such an intent can be translated easily. What restrictions must a UML class diagram satisfy so that it is easily translated into an XML schema?
Given the current state of the art of XML and UML technology, it seems useful to develop an integration of XML schemas and UML class diagrams. The combined notation should start either with an XML Schema Notation or the XMI notation (or similar notation) for class diagrams and extend it with the missing information. If we start with XML schemas, we need to add part names. If we start with UML class diagrams we need to add ordering information and we need to follow a certain style.
Because it is easier to add part names to an XML schema notation than to add more information to a UML class diagram, we prefer to take XML schemas as the starting point of a design. Starting with UML class diagrams also makes sense, however.