Subject: Re: Questions on XMI
Terry,

Thanks for the writeup. I have some responses which I think should clarify your questions:

1) The <XMI> tag serves two purposes: it marks the start of XMI within an XML or HTML stream (for example, XMI messages embedded in SOAP protocol packages - see SOAP4J), and it provides a standard place for declaring Namespace, URI, and Version information. You need to be able to check the URIs and Versions before reading a message in order to understand what the message is. This mechanism has to be generic for a wide range of messages, and the simplest way of satisfying these requirements is to have a standard outer tag that contains the message.

2) Since DTDs have the well-known restriction that order and multiplicity must both be specified, experience has shown that it's counter-productive to impose ordering requirements on models that do not specify an order. (An example is three classes with associations between them.) Imposing an order means that valid models would need to be modified to permanently incorporate an order before DTDs could be generated - otherwise there would be no guarantee that DTDs generated from the model would always be the same. The choice is either to modify valid models by adding a new order and finding a way to reincorporate that order back into the model, or to reduce the ordering requirements. Since this is an issue only within DTDs, it makes little sense to modify the models.

3) The Doctype is one mechanism for identifying versions and models, but the declaration of the Namespaces and URIs in the XMI element provides complete information and is more aligned with the overall XML direction.

4) The XMI extension mechanism has proven to be extremely valuable when sharing information in integration scenarios.
Fundamental operations such as round-trip exchange and difference/merge between existing software applications are very difficult without extensions to provide a mechanism for bridging the application's internal view back to the external DTD representation. Since a document reader may always ignore extensions, they pose minimal difficulty for readers that do not need them. The entity extension mechanism below is similar to the one used in the CWM (data warehouse) XMI DTDs. In XMI DTDs, you can always declare entities as long as the fully expanded form is correct.

I think these 4 questions are fairly minor, especially when you consider what you get in return: an open and supported standard for general object interchange with wide applicability to some of the most important software domains: designs (UML, ER), languages (Java, IDL), and databases (CWM). The list is growing quickly because the cost of adding a new DTD is so low. Draw a UML model, write an interface, or import a database schema, and you can generate an XMI DTD with freeware tools.

XMI is directly applicable to the ebXML work being produced in several domains. As ebXML groups create models, such as business process, for example, they can generate the XMI DTD directly from the specification. The hard part is getting the model right - after that the XMI DTD falls right into place.

Thanks,
-Steve

Stephen A. Brodsky, Ph.D.
Software Architect
Notes Address: Stephen Brodsky/Santa Teresa/IBM@IBMUS
Internet Address: sbrodsky@us.ibm.com
Phone: 408.463.5659

Terry Allen <tallen@sonic.net> on 02/09/2000 01:41:56 PM
To: Stephen Brodsky/Santa Teresa/IBM@IBMUS
cc: "Iyengar, Sridhar" <Sridhar.Iyengar2@UNISYS.com>, scott.nieman@norstanconsulting.com, Terry Allen <tallen@bolt.sonic.net>
Subject: Re: FW: Questions on MOF examples for XMI

Here's the write-up I promised; ignore the stuff about Docbook if you're pressed for time.

Difficulties in using XMI 1.1 for OASIS and EBXML Registry Documents

Disclaimer: This document is concerned with the generation of DTDs to define document types for which document instances may be composed independently of any UML or MOF implementation. That is not the goal of XMI 1.1, which is intended as a transfer syntax. So the problems noted here arise from the use (or prospective use) of XMI outside its intended domain of applicability. And I really do appreciate the work that's gone into XMI!

Problem Statement

OASIS's Registry and Repository TC has developed a specification for implementing an ISO/IEC 11179 registry for a repository of SGML- and XML-related entities. This specification includes DTDs for specific documents required in the registration process. These DTDs have been composed by hand, and reflect the present 11179, without the proposed X3.285 revision, which is specified using UML models.

EBXML's Registry and Repository WG is developing a UML model of use cases for a registry and repository for commerce-related UML models and related entities (including XML-related entities).

It is desired to extend the OASIS specification to meet EBXML's needs. Hence it would be desirable to replace the hand-composed OASIS DTDs with DTDs generated from EBXML's UML model of use cases. (Just how to generate the specific documents required is an issue I won't consider here.)

To add to this mix, it appears that X3.285 may be reaching the point at which it could be put into practice; if that were possible, it might be that much of the work of the OASIS and EBXML groups has been done already, and we could just generate the DTDs we need, using XMI's DTD Production Rules.

Unfortunately, DTDs generated using XMI's DTD Production Rules are not sufficiently constrained to allow validation of documents composed outside of an environment that enforces the UML model they (partially) express - and for OASIS, at least, we cannot assume such an environment.
Put another way, any number of documents can be composed that will validate against a DTD generated using XMI's DTD Production Rules, but that do not contain the information required by the model (contra XMI 1.1, 6.2, second para).

Specifics

1. The XMI DTD (I used ad/99-10-05, XMI 1.1 RTF UML DTD) is too loose to begin with. Even the degenerate case

<XMI> </XMI>

is valid, as none of the XMI element's contents is required. (Even the CWM DTD, which requires XMI.header, does not require anything else, including all the real content, although it is an invalid DTD due to ambiguity.) In addition, the liberal use of the ANY keyword, while well intentioned, permits nonsensical combinations of declared elements. (See below for remarks on a better extension mechanism for DTD syntax.)

2. The generated DTDs are much looser than the models because of the design decision to eschew the + and ? repetition operators (e.g. those in Appendix C of the MOF 1.3 document).

3. For OASIS and probably for EBXML purposes, it is not useful for the doctype of instances always to be XMI. What is wanted is a DTD at the level of content shown in the Letter DTD of MOF 1.3 C-153, for use independent of the XMI.content element. (Invariably, programmers want to identify the document type from the root element; the references to the model, metamodel, and anything else could be conveyed by FIXED attributes within the generated content DTD if needed.)

4. The extension mechanism, using the XMI.extension element, is much better thought out than such mechanisms generally are, but presents certain difficulties in itself:

- It probably shouldn't be present in generated DTDs at all.
- When added to a PCDATA content model it produces mixed content, which SGMLlers can live with but programmers new to XML find all but impossible.
- As the content model of XMI.extension is ANY, an extension declared for use in a specific context can appear anywhere, within an XMI.extension element.
- Most XMLlers would rather not have to deal with the XMI.extension container element around the actual extension.
- BTW, what is the difference between XMI.extension and XMI.extensions? The XMI 1.1 spec isn't terribly clear.

So, What To Do?

We could generate the desired DTDs, extract the parts that aren't boilerplate XMI, and rework the content models (perhaps by algorithm and Perl) to tighten them up to the point they reflect the model.

We could define a different set of DTD Production Rules and build software to implement them.

We could go back to XMI 1.0 and see if that works better.

We could require that OASIS and EBXML registry documents be produced by software that enforces the rules of the model rather than of the generated DTD, and that they be consumed by software that acts similarly.

We can state the problem and invite other solutions (which is what I'm doing).

Another Extension Mechanism, Really Just FYI

Now, in XML Schema different extension mechanisms will be available, and once we have XML Schema we may not care about these DTD syntax matters. However, if an extension mechanism meeting the above objections is desired, the Docbook DTD mechanism, using parameter entities, might be considered (credit Eve Maler of Sun for much of this parameterization).
You may not want to go here, but just for example (in SGML syntax, easily converted to XML without effect wrt the issues discussed here):

<!ENTITY % term.module "INCLUDE">
<![ %term.module; [
<!ENTITY % local.term.attrib "">
<!ENTITY % term.role.attrib "%role.attrib;">

<!ENTITY % term.element "INCLUDE">
<![ %term.element; [
<!ELEMENT Term - O ((%para.char.mix;)+)>
<!--end of term.element-->]]>

<!ENTITY % term.attlist "INCLUDE">
<![ %term.attlist; [
<!ATTLIST Term
                %common.attrib;
                %term.role.attrib;
                %local.term.attrib;
>
<!--end of term.attlist-->]]>
<!--end of term.module-->]]>

The outer INCLUDE parameter entities can be overridden by an exterior DTD subset so that substitutes can be provided; %term.role.attrib; is a user-customizable attribute; the important part here for attributes is %local.term.attrib;, which is defined as empty but can be redefined in an exterior subset.

For the element's content model, %para.char.mix; is declared as

<!ENTITY % local.para.char.mix "">
<!ENTITY % para.char.mix
                "#PCDATA
                |%xref.char.class;      |%gen.char.class;
                |%link.char.class;      |%tech.char.class;
                |%base.char.class;      |%docinfo.char.class;
                |%other.char.class;     |%inlineobj.char.class;
                |%synop.class;
                |%ndxterm.class;
                %local.para.char.mix;">

Note the %local.para.char.mix; parameter entity; if declared in an external subset it would have to begin with a | to make the syntax work correctly.

A document employing a customization layer, as we call it for Docbook, can include that layer in an internal subset (sometimes useful but may cause difficulties for an uninformed recipient) or can reference that layer as an external subset in the DOCTYPE declaration; the external subset must then reference the Docbook DTD (which makes clear that the document conforms to an altered version of the DTD).
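
As a rough illustration of such a customization layer, an external subset might look like the sketch below. It is kept in the same SGML syntax as the declarations above. The system identifier docbook.dtd, the TermType attribute, and the MyInline element are hypothetical names invented for this example; they are not part of the Docbook DTD.

```dtd
<!-- Hypothetical customization layer, referenced from a document's
     DOCTYPE declaration as its external subset. -->

<!-- Add an attribute to Term by redefining the empty local.* entity. -->
<!ENTITY % local.term.attrib "TermType CDATA #IMPLIED">

<!-- Extend the mixed content model; the value must begin with "|"
     because it is appended to the end of %para.char.mix;. -->
<!ENTITY % local.para.char.mix "|MyInline">

<!-- Pull in the unmodified base DTD (assumed system identifier). -->
<!ENTITY % docbook SYSTEM "docbook.dtd">
%docbook;

<!-- Declare the element that was added to the content model. -->
<!ELEMENT MyInline - - (#PCDATA)>
```

The overrides have to appear before the reference to the base DTD, because the first declaration of a parameter entity is the one that takes effect; the empty declarations in the base DTD are then ignored.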

Unfortunately, you can't play this trick with sequences for syntactic reasons:

<!DOCTYPE foo [
<!ENTITY % foo.extension "proper.content">
<!ELEMENT proper.content (#PCDATA)>
<!ELEMENT foo (proper.content, (%foo.extension;)?)>
]>
<foo>
<proper.content>bar</proper.content>
</foo>

is valid in SGML, but not in XML, and either way, if foo.extension is declared as

<!ENTITY % foo.extension "">

the result is invalid. In Docbook we found that anything that needed extension could either be overridden or had a *'d content model that can be extended.

regards,
Terry