Subject: Re: Reliable Messaging Spec v0-078
Here are my comments on RM v0-078.

2.2.2 Message Header - RM Info

154: Please change "temporarily persistent" to "persistent" everywhere. "Temporarily persistent" is a contradiction in terms. Somewhere in the spec there can be a non-normative discussion of when a message can be discarded from persistent storage. In fact, unless the message is retained in persistent storage by at least one of the parties until it has been completely processed by the higher level, it can be irretrievably lost under failures such as software or node failure.

155: Please replace "requests unreliable messaging semantics" by something like "indicates that the RM service is not to be used". The parties may have an alternative way of obtaining reliability such as by using a reliable transport protocol.

162: See my comments on section 7.

167: I agree that "unspecified" is unclear but so is "best effort" (see comments to sect. 7). We need a keyword which simply says that this RM service is not to be used.

2.2.3 Routing Header

173-176: What new function does MessageServiceId provide that isn't provided by SenderId and ReceiverId? These implicitly define a specific instance of the MessagingService.

184: (Table 2.2):
  3rd line and first bullet: Please replace "transport" by "messaging service instance".
  3rd bullet: "a long time" is ill defined. One partner may choose a different time to reset the sequence number than the other, thus leading to spurious error indications. We should either define an event which is visible to both parties that causes reset of the sequence number or require that the sequence number be preserved "forever". Even "for the life of the messaging service instance" is ill-defined.

2.3 Message Transfer Sequence

187: Please state that the message shall be stored in persistent storage before sending the acknowledgment message. Even though this is described later, this is a key point that should be mentioned up front to avoid confusion.

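To illustrate the ordering asked for in the comment on line 187 (store the message persistently, and only then acknowledge it), here is a minimal Python sketch. It is not taken from the spec: the names PersistentStore, receive and send_ack are hypothetical, the one-file-per-message layout is just one possible choice, and sender_id is assumed to be safe to use in a file name.

```python
import json
import os


class PersistentStore:
    """Hypothetical file-backed store: one file per (sender_id, seq)."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, sender_id, seq):
        # Assumption: sender_id contains only filename-safe characters.
        return os.path.join(self.directory, f"{sender_id}-{seq}.json")

    def save(self, sender_id, seq, payload):
        """Write the message and force it to stable storage before returning."""
        path = self._path(sender_id, seq)
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump({"sender_id": sender_id, "seq": seq, "payload": payload}, f)
            f.flush()
            os.fsync(f.fileno())  # data reaches the disk, not just OS buffers
        os.replace(tmp, path)  # atomic rename: the file is either whole or absent

    def contains(self, sender_id, seq):
        return os.path.exists(self._path(sender_id, seq))


def receive(store, sender_id, seq, payload, send_ack):
    """Persist first, acknowledge second.

    If the node fails after save() but before send_ack(), the sender times
    out and retries, and the message is still on disk. The reverse order
    could acknowledge a message that is then lost.
    """
    store.save(sender_id, seq, payload)
    send_ack(sender_id, seq)
```

In this sketch send_ack is any callable supplied by the caller; by the time it runs, store.contains(sender_id, seq) is already true, which is the property the comment asks the spec to state up front.
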
190: State that the next message cannot be sent until the acknowledgment to the previous message is received.

193: Fig. 2.2 does not (and should) make it clear that the receiver must put the message in persistent storage before sending the acknowledgment.

205-206: This could be interpreted to read that the message is "processed appropriately" before step 4 takes place. Please delete "and processes the message appropriately". An informative note could be added that it is permitted to pass the message to the application for processing concurrently with steps 4 and 5.

214: Please add "or later failure recovery".

2.5 Detection of Repeated Messages

225-226: It may be worth repeating here that the sequence number is actually qualified by the senderId.

234: Yes, a duplicate ACK shall be sent. The sender may be (probably is) sending a duplicate because it didn't receive the first ACK. Do we need to require that a timer be set on the ACK?

235: (note following this line)
  (a) See my comment above regarding "for a long time".
  (b) This is precisely why "for a long time" is not a proper specification (see comment above).

273: (3) Since this algorithm specifically uses the sequence number, case 3 is by definition identical to case 2. Only the sender can know that the second message is not a duplicate even though it has the same sequence number. This cannot happen if we agree to my foregoing comments about "for a long time".

2.6 Reliable Messaging Acknowledgment

2.6.1 General

246: At this time, the messaging service cannot handle communications protocol function since it is, by definition, blind to what is going on at the transport level. See my comments to section 2.6.3.

248: (3) Please change "timeout" to "messaging service timeout".

248: (4) We have not provided a separate definition of transient errors. An error cannot be declared transient until it has been recovered. A transient error manifests itself as a timeout for which the retry succeeds.
A repeated-sequence-number error is a messaging error, not a transient error.

2.6.2 Reliable messaging formats

270: Please delete this line. ServiceInterface and Action are present in all messages, whether reliable or not.

2.6.3 Communication Protocol Errors

274-292: In the absence of a set of definitions which specify the information flow between the MS and the transport level, the MS is blind to what goes on in the transport level. Architecturally, the functions defined here are part of the transport level. In practice, an implementation may or may not have a boundary between the messaging service and transport level but that is an implementer's choice. Please remove this section. See my comments to section 3 below.

2.6.5 Timeout

302: Please replace "final" by "previous".

306-307: This is going on in the sender. The sender does not return an ACK or error message.

2.6.6 Transient Errors

314ff: See my comment on transient errors above.

2.6.8 Maximum Number of Retries...

343: Please supply a definition of "retry interval". It is the minimum required time between successive retries. Please remove the keywords RetryInterval and Retries. The TP team has not yet started defining tag names. There are two other such pairs also to be named (transport and business level). Please state that the retry interval and number of retries discussed here are specifically for the reliable messaging service.

348-350: Please delete this statement or make it non-normative. The recovery action following such an error could be quite different, even perhaps involving a reboot.

3. Relationship with Transport Level

353ff: This section appears to be about the MS in general, not about RM. As such, it does not belong in this document. This can be considered the beginning of the work we need to do on tying the MS and the transport level together.
I suggest putting this section and section 2.6.3 into a separate draft proposal and continuing to flesh it out on a time scale appropriate to the deliverables schedule.

5. TPA Considerations

392ff: It should be made clear that the actual tag names are TBD by the TP team.

7. Definitions

Per an editor's note earlier in the document, the semantics terms need reconsideration. Until this point, I understood "atMostOnce" to be what the RM service provides. I think I agree that as currently defined the RM service provides ExactlyOnce. In this section, "atMostOnce" seems to mean that the RM service provides duplicate detection but not retries. That may be so but this spec does not provide that option.

"BestEffort" is out of scope for the MS because it specifies what the parties (i.e. application level) do. Please delete "BestEffort" and the last 2 bullets (party definitions) of "AtMostOnce". From the MS viewpoint, the two are the same.

In addition, we need a choice that specifies "RM service not used". For this option, there is neither duplicate detection nor retries.

Regards,
Marty

*************************************************************************************
Martin W. Sachs
IBM T. J. Watson Research Center
P. O. B. 704
Yorktown Hts, NY 10598
914-784-7287; IBM tie line 863-7287
Notes address: Martin W Sachs/Watson/IBM
Internet address: mwsachs @ us.ibm.com
*************************************************************************************

Jim Hughes <jfh@fs.fujitsu.com> on 09/23/2000 06:59:43 PM
To: ebxml-transport@lists.ebxml.org
cc:
Subject: Reliable Messaging Spec v0-078

Attached are Word and PDF copies of the latest version of the Reliable Messaging Spec, for discussion on Wednesday in the F2F. Shimamura-san has added much more detail on reliability issues, and of course the earlier use of Message Groups is now deleted for simplicity.
Since there were many changes, I deliberately did not mark changes in this document, but you can easily see them by using Word's compare utility against the previous version.

[Ralph, could you have some copies of the document available on Tuesday, for those that might not have access to a printer before travelling to Dallas?]

Jim