Subject: RE: Tag Languages, UID's etc.
Arnold,

I guess I wasn't clear. I don't think (and Hartmut can correct me if I'm wrong) that we will be using the ID as a tag. The tag, in my opinion, must be human readable. I also think it should be clear and short.

The ID would be in the dictionary, or lexicon, or catalog, or whatever we are calling it these days. For example, if the ID were 1234567, the English dictionary name might be FinancialAccountIdentifier. It would be something different in French, Japanese, German, etc., but the ID would still be 1234567. We X12/EWG types could also figure out which of our data elements was 1234567. And folks from OAGI or RosettaNet or any other industry consortium could also match to the ID, based on the definition. I think that's what syntax neutrality is all about.

I have no idea what the 'tag' would be, but something meaningful like 'BankAcct'. The business people from the financial arena are probably best at choosing that.

If what we want is to bring the small, non-EDI enterprises into the fold (so to speak), we have to realize that many of them will start out just viewing the XML on their computers. A meaningless tag, or a very long one, will be useless to them.

I bet this will start another round, huh?

MK

-----Original Message-----
From: Arnold, Curt [mailto:Curt.Arnold@hyprotech.com]
Sent: Thursday, January 25, 2001 2:43 PM
To: 'ebxml-core@lists.ebxml.org'
Subject: RE: Tag Languages, UID's etc.

I would actually hope the working groups are only paying lip service (I had already typed that term before Mary Blantz's comment) to the UID concept, since it seems antithetical to several key XML design principles. If ebXML is really going to be UID driven, then the name should be changed to eb[something other than XML].

1. The tag name becomes just a comment. All the other XML infrastructure uses the namespace-qualified tag name as the primary means of declaring meaning and allowable structure.
There is no mechanism, for example, in XML Schema to match a content model to a specific value of an arbitrary attribute such as UID.

2. Interpretation requires either:

a) Fetching an external resource

Fetching an arbitrary DTD to provide the UIDs to enable a message to be interpreted is unacceptable. All sorts of denial-of-service attacks could be launched by throwing messages with spurious DTDs at a server (per David Megginson's "When XML turns Ugly" talk at XTech 2000). If you don't dynamically fetch a DTD, then you are creating an interoperability problem, since servers would each have their own catalog of known DTDs used to provide tag name <-> UID matching, and messages in less prominent languages would not be universally accepted.

b) Using the internal subset

Which few processors will build. For example, XSLT won't build an internal subset.

c) Putting the UID explicitly on each element

In this case, if you have:

<!-- Invoice is just a comment -->
<Invoice UID="{208AA0C4-8612-4327-823C-784278F0D0BE}"/>

Why not just format the UID so that it is a valid name and do:

<!-- Invoice -->
<_208AA0C4-8612-4327-823C-784278F0D0BE ...>

Then at least you can do schema-based validation.

3. Locale favoritism is attacked at the cost of making the messages hard to comprehend in all locales. One of the design goals for XML was that it should support human-legible documents. For a human to interpret a UID-based document, you have to 1) read the tag name, 2) look up the tag name in the DTD to find the UID, and 3) look up the UID to find the meaning. The combination of a URI and an XML name ("http://www.ebxml.org/Namespace/Purchasing" + "invoice") is sufficient to provide the link to a "wealth of semantic information" using RDF and other existing technologies.
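
As a rough illustration of the URI-plus-name point, here is a small sketch in Python (Python and its standard xml.etree.ElementTree module are my choice for the example, not something the thread prescribes). The sample message and the "p" prefix are made up; the namespace URI and the UID value are the ones quoted above. It only shows that a namespace-aware parser already hands back the (URI, name) pair as the element's identity, while the UID is just one more attribute value that the application would have to interpret itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: the "p" prefix and the document itself are invented
# for illustration; the URI and UID are the ones mentioned in the text.
DOC = (
    '<p:invoice xmlns:p="http://www.ebxml.org/Namespace/Purchasing" '
    'UID="{208AA0C4-8612-4327-823C-784278F0D0BE}"/>'
)

root = ET.fromstring(DOC)

# ElementTree reports a qualified name as "{namespace-uri}local-name",
# so the URI + name pair falls straight out of the tag.
uri, local = root.tag[1:].split("}", 1)

# The UID is an ordinary attribute; the parser attaches no meaning to it.
uid = root.get("UID")

print(uri, local, uid)
```

The (uri, local) pair is what namespace-aware tools key on; anything driven by uid would need a separate lookup step written by hand.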
The value of localizing machine-to-machine communications is lost on me, since the processing infrastructure (programming languages, operating systems, etc.) already has a strong English bias and programmers will already have some familiarity with English. However, as a native English speaker, I would much rather have the tag name for an invoice be <ebxml:Rechnung>, <ebxml:Facture>, or <ebxml:factura> than <ebxml:_208AA0C4-8612-4327-823C-784278F0D0BE>.

Just pick some human language as the canonical representation. Then XSLT transforms can be used, when needed, to convert the document to a locale-specific form for human analysis. But the processing systems shouldn't have to be burdened with having 100+ synonyms for every concept.
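
To make the "one canonical name, localized only for display" idea concrete, here is a minimal sketch. It uses a plain Python lookup table as a stand-in for the XSLT transform mentioned above, so it is not XSLT and not anyone's proposed implementation. The DISPLAY_NAMES table, the localize function, and the sample document are all hypothetical; the German, French, and Spanish names are the ones given above.

```python
import xml.etree.ElementTree as ET

# Hypothetical display table keyed by canonical (English) tag name.
DISPLAY_NAMES = {
    "Invoice": {"de": "Rechnung", "fr": "Facture", "es": "factura"},
}

def localize(elem, locale):
    """Return a copy of elem with tag names swapped for locale-specific ones.

    Tags with no entry for the requested locale keep their canonical name.
    """
    new_tag = DISPLAY_NAMES.get(elem.tag, {}).get(locale, elem.tag)
    copy = ET.Element(new_tag, dict(elem.attrib))
    copy.text, copy.tail = elem.text, elem.tail
    for child in elem:
        copy.append(localize(child, locale))
    return copy

# Processing systems only ever see the canonical form ...
canonical = ET.fromstring("<Invoice><Total>100</Total></Invoice>")

# ... and a locale-specific view is produced only when a human asks for one.
german = ET.tostring(localize(canonical, "de"), encoding="unicode")
print(german)
```

The processing side needs exactly one name per concept; the synonym table lives only in the display step.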