Subject: Re: Syntax Free Models
Arafon

> The point of ebXML core components is to define (as closely as possible) the
> semantic definitions of reusable data structures, while also defining a way
> of describing "context" (within industry, region, business process, etc. -
> you understand this bit!) that can allow for local extensions.

Your "while also" is why I have been busy trying to separate the messaging sequence from the information set. These are two separate things that need to be handled separately, rather than being mixed together as they are today in EDI message syntax and XML.

> Semantics do not, in my opinion, require a concept of sequence. (Not that
> you can't use one, but sequence is used for many other purposes, as I will
> describe).

Agreed.

> An example would be an address:
>
> An address could consist of Line1, Line2, and Line3. In this case, you have
> (a) failed to capture the most useful semantic relationship; and (b)
> hard-coded sequence into your model. You are making an assumption about the
> context (i.e., that the address exists to be printed on an envelope, etc.)
>
> Instead, we could say that there is an object, Address, that has basic
> properties of: IndividualID, StreetAddress, City, State/Province, Country
> (to use an incomplete example).

I use exactly the same example to explain why the Information Sets in Information Units must be unordered.

> If you understand that these are child properties of Address, and you
> understand what each of them is, then you do not need a sequence in your
> data model - you can sequence them as desired for any purpose (determined by
> the application of the model).

You do not need a sequence at Information Set level, but it helps if you have a sequence of Information Sets.

> In EDI syntax and XML syntax both, data is represented in a linear form that
> implies hierarchical relationships through relative positioning. This is an
> unavoidable aspect of syntax - you have to agree what structure is implied
> by position within the linear data stream. In the case of XML, the position
> of tags within the sequence of tokens describes hierarchy through
> containership. If <A> comes before <B>, then <B> is a child of <A>. EDI does
> much the same thing, but uses loops, which imply hierarchy in a very
> different way, but one still relying on sequence.

To validate either an EDI syntax or an XML syntax you need to be able to check that Information Sets conform to a pre-agreed sequence/hierarchy.

> As you have seen, the released W3C Schema Rec (hooray, it's finally here!)
> does not contain some of the features that many expected/hoped for. I doubt
> that it is the final word, and believe it will undergo some refinement as it
> is implemented - that is the stated purpose of this draft.

Wrong: the latest incomplete spec claims to be "feature complete" when it is patently not so. In fact this week's version is a disaster waiting to happen. It is so self-contradictory as to be almost unusable. If you don't believe me, compare the description of the all element in the new Part 0 with the XML definition given in Part 1. Part 0 says that this element can only be used at the topmost level, and cannot contain nested groups. Part 1 says that all elements can be nested within choice, all or element definitions. Henry Thompson claims that the restrictions defined in Part 0 are stated in the text of Part 1, but this is so obscurely written that I have been unable to identify where, or to work out how the constraints can possibly be squared with the data models provided for the XML representation of Schemas. For the time being no-one can rely on being able to use Schemas to validate messages!

> I think ebXML can
> provide significant requirements as a result of what we find necessary to
> implement what we build (hence the value of a reference implementation).

What I am trying to do is to identify a small (but hopefully "safe from future attack") subset of the Schema specs that could be buried within the definitions, using transformations to "build a schema" from the definitions. [I'm not conformant with this week's draft if the Part 0 definition of All is correct :-( ]

> Regardless, Schema gives us one syntax for describing applications of ebXML
> models - XML. We cannot rely on the stability of any single meta-language,
> however, and should not design our models to match the limitations of the
> implementation tools. In the case of EDI message syntax, we have no formal
> meta-language whatsoever.

Agreed - this is why I was trying to come up with a segmented meta-language that would work with any schema syntax. (Though I don't claim to have rigorously tested this at this stage. What I am hoping to do is to provoke comment from experts in this field.)

> Syntaxes will determine some meaningful sequencing within a message as
> described in that syntax. This will vary, at the level of the message and
> below. Process descriptions - because they model movement in time - are
> tightly bound up with sequence.

Agreed wholeheartedly, which is why sequences are required at message level, but not at Information Set level. Note, however, that there needs to be some formal mechanism for describing messages that both computers and humans can understand. This should be a cross between a commented DTD and a MIG. This is what my Message definitions are designed to provide. (The lack of a computer-interpretable format for MIGs has been a problem for the EDI industry for years.)

> Data descriptions, because they are not
> bound up by time (other than indirectly, by referencing a sequence
> description as part of identifying their context), should be left
> sequence-free.

Agreed wholeheartedly, which is why sequences are not allowed at this level (though nesting of Information Sets to create hierarchies is).
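
The distinction agreed on above (unordered Information Sets that may nest, carried by Messages in which order matters) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the InformationSet class, the render helper, and all sample values are hypothetical and are not taken from the Message definitions discussed in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class InformationSet:
    """Hypothetical model: a named set of properties with no inherent order.

    Property values are either plain strings or nested InformationSets,
    so hierarchy is captured by containment rather than by position.
    """
    name: str
    properties: Dict[str, Union[str, "InformationSet"]] = field(default_factory=dict)


def render(info_set: InformationSet, sequence: List[str]) -> List[str]:
    """Apply an application-chosen sequence to the leaf properties.

    The sequence belongs to the caller (a syntax or a message definition),
    not to the InformationSet itself.
    """
    return [
        info_set.properties[key]
        for key in sequence
        if isinstance(info_set.properties.get(key), str)
    ]


# Two addresses with the same properties supplied in a different order.
address_a = InformationSet(
    "Address",
    {"StreetAddress": "1 Example Street", "City": "Exampletown", "Country": "XX"},
)
address_b = InformationSet(
    "Address",
    {"Country": "XX", "City": "Exampletown", "StreetAddress": "1 Example Street"},
)

# Dictionary comparison ignores insertion order, so the two sets are equal.
same_information = address_a == address_b

# Nesting creates hierarchy without introducing any sequence.
party = InformationSet("Party", {"IndividualID": "P-001", "Address": address_a})
order = InformationSet("Order", {"OrderID": "O-42"})

# Each application picks its own sequence over the same Information Set.
envelope_lines = render(address_a, ["StreetAddress", "City", "Country"])
sort_key = render(address_a, ["Country", "City", "StreetAddress"])

# A message is an ordered list of Information Sets, so order is significant.
message_1 = [party, order]
message_2 = [order, party]
messages_differ = message_1 != message_2
```

Here the envelope layout and the sort key are two sequencings of one Address, while swapping the Information Sets inside a message produces a different message.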

> Syntaxes need to be free to use sequence as a way of
> describing our semantic models for a particular implementation.

Models used to validate the use of information within a specific instance need this, but this information does not need to be part of the definitions of the semantics.

> All we need
> capture are the hierarchical relationships inherent in the semantics of the
> data structures we are describing, and the name-value pairs that constitute
> their children, in a fashion that allows both to be strongly typed.

Agreed. I added such a mechanism to my paper yesterday. (I even managed to add a mechanism for typed property attributes that I think will be compatible with both DTDs and Schemas. See the last two or three screens of http://www.sgml.u-net.com/neutral.html for details.)

> We don't
> need sequence in our models to do this, and we benefit by not requiring a
> particular sequence at any point before syntax-specific implementation.

Again I agree with you, if by models you mean reusable information sets of the type being defined by the Core Components group, rather than the Messages that need to be defined for particular business processes.

> I hope this clarifies my earlier statements.

It does, and thanks for taking the time to make the clarification.

Martin