Subject: Re: Comments on Reliable Messaging Specification, Aug. 11, 2000
Jim,

I am generally pleased with your responses to my comments. I do have a few rejoinders, embedded in the following extracts from your posting.

Regards,
Marty

*************************************************************************************
IBM T. J. Watson Research Center
P. O. B. 704
Yorktown Hts, NY 10598
914-784-7287; IBM tie line 863-7287
Notes address: Martin W Sachs/Watson/IBM
Internet address: mwsachs @ us.ibm.com
*************************************************************************************

Jim Hughes <jfh@fs.fujitsu.com> on 08/16/2000 01:02:36 AM

To: Martin W Sachs/Watson/IBM@IBMUS
cc: ebxml-transport@lists.ebxml.org
Subject: Re: Comments on Reliable Messaging Specification, Aug. 11, 2000

Marty,

Inserted below are my comments on your email, especially how I resolved them in the latest version of the RM spec. Thanks for the comments...

Jim

> NOTE WELL: because each item in the window is a complete
>application-level message, any implementation limit on the window size sets
>a limit on the maximum application-level message size, which may be
>unacceptable. We must be very careful about imposing message size limits
>on the application. The application design may prevent splitting one
>message into smaller messages; hence window size limits could prevent
>support of some applications. Reliable transport protocols deal with this
>issue by segmenting the messages underneath the application and windowing
>the segments. Think about IP underneath TCP and the sliding window
>protocols in HDLC and the LLC layer of the LANs.

Again, we are not covering logical message splitting in this RM spec.

MWS: I agree with not covering logical message splitting. My concern is that implementation limits on the message size or total storage capacity of the RM-group may arise, in which case the spec will have to at least provide guidance.
The suggestion that maximum message size may have to be specified in the TPA troubles me because, as I said, that becomes a matter of what applications cannot be supported. I do not have a good solution other than using transparent segmentation of messages such that the total message size remains an application matter. Perhaps an editor's note warning of a possible message size issue might be appropriate in order to get people thinking.

>
>Line 123, item 7: Observation: The usual sliding window protocols are full
>duplex with regards to messages and ACKs, and there is a pause only on
>detection of a lost message. The protocol specified in this document is
>not a sliding window at all; it is more like a "jumping window"
>protocol - it is half duplex and there is a pause on every window. That is
>a serious degradation of message latency and throughput compared to sliding
>window protocols.

Another reason why I changed the name to "RM-Group".

MWS: If "sliding" is also gone, I am content except for the latency question.

>Line 136, item 9: Please replace "For only the last message..." by "To
>detect loss of the last message..." The statement in the specification is
>an implementation statement. For example, the sender could choose to set a
>deadline for each message and slide the deadline forward until the last
>message of the window. This would enable early detection of "hard"
>failures. My suggested change avoids stating a requirement that the
>timeout may only be set on the last message.

Change made at beginning of the sentence. The reason for saying that a timeout is specified for *only* the last message of an RM-Group is to avoid having timeouts for *all* messages in the RM-Group. The Sender finds out that messages (other than the last) in an RM-Group never arrived by getting an error message in response to the last message. The Sender recovers from non-delivery of the last message by using the timeout.
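
[Editorial illustration] The recovery behavior described in the paragraph above can be sketched as a small simulation. This is only a sketch under stated assumptions, not the RM specification's algorithm: the names `Receiver`, `send_group`, and `lost_once` are hypothetical, the "ACK"/"ERROR" reply shapes are invented for illustration, and re-sending the last message after re-sending the missing ones (to solicit a fresh reply) is an assumption the thread does not confirm.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiving MSH: replies only to the last message of a group."""
    group_size: int
    received: set = field(default_factory=set)

    def deliver(self, seq):
        self.received.add(seq)
        if seq != self.group_size - 1:
            return None  # no per-message reply for earlier messages
        missing = sorted(set(range(self.group_size)) - self.received)
        return ("ERROR", missing) if missing else ("ACK", [])


def send_group(receiver, group_size, lost_once):
    """Send one RM-Group; each sequence number in `lost_once` is dropped
    on its first transmission only. Returns a log of events."""
    lost = set(lost_once)
    log = []
    last = group_size - 1

    def transmit(seq):
        if seq in lost:
            lost.discard(seq)
            log.append(("lost", seq))
            return None
        log.append(("sent", seq))
        return receiver.deliver(seq)

    pending = list(range(group_size))  # always ends with the last message
    while True:
        reply = None
        for seq in pending:
            reply = transmit(seq)
        if reply is None:
            # Only the last message carries a timeout: no reply means it was lost.
            log.append(("timeout", last))
            pending = [last]
            continue
        kind, missing = reply
        if kind == "ACK":
            return log
        # The error reply to the last message names the earlier messages that
        # never arrived. Assumption: re-send them, then the last message again.
        pending = missing + [last]


if __name__ == "__main__":
    rx = Receiver(group_size=4)
    for event in send_group(rx, 4, lost_once={1, 3}):
        print(event)
```

In this sketch a lost earlier message is discovered only through the error reply to the last message, and a lost last message is discovered only through the timeout, which mirrors the division of labor Jim describes.
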
MWS: My concern is about appearing to constrain the application. If the change eliminated the word "only", then I am satisfied.

>Line 137, item 9 ("information from the TPA"): It is not obvious that a
>separate timeout is needed for reliable messaging. The existing
>transport-level timeout as defined in tpaML section 2.6.4 may serve the
>purpose. However, this point requires considerably more thought. As it
>stands, it is not clear to me that the complexity of the window timeout is
>worth the value added. A much simpler solution for this 1-out-of-N case
>(loss of the last message) is to rely on the normal transport-level timeout
>(e.g. the time to the HTTP response). Simply terminate the window. The
>messaging service will simply time out at the transport level and re-send
>the message, starting a new window. This, however, leads to the following
>considerations:

One of the major rationales of this proposal is to make *no* assumptions on the underlying transport (the "carrier pigeon model"). Thus, we don't introduce the concept of a "normal transport-level timeout". If we lift this assumption, then obviously other solutions are possible...

MWS: But ACKs are a fact of life for at least some of the transports, including HTTP. Some discussion of possible interactions between the reliable messaging protocol and the underlying transport is needed. A clarification is needed, for example, as to whether, when an RM-group size of 1 is used, there will be both an RM ACK and the HTTP ACK. A recommendation is needed about whether the transport ACK should be suppressed when reliable messaging is used, for protocols which permit suppressing the transport ACK.

>In this protocol, there seem to be two possibilities regarding the timeout:
>
> The normal per-message transport-level timeout is not used with reliable
> messaging - but this extends the time to retry a lost message to the
> time to fill the window.

Yes, you are correct.
However, the Sender MSH can minimize the number of messages in an RM-Group if this is a problem (or even turn off RM functions in the MSH layer if he wants to just use known transport layer functions and not expect any kind of RM-layer ACK/error message from the receiving MSH). I would expect that scenario if the transport is inherently reliable.

MWS: Some discussion of this should be in the specification. Perhaps an Editor's note on the need to add this discussion in the future would be advisable.

> The per-message transport-level timeout is still used on top of the
> reliable messaging protocol. In this case, the reliable messaging
> protocol must NEVER retransmit a message in the window if it was
> successfully received since the upper level already knows that the
> message was successfully received. (Perhaps discarding the duplicate is
> sufficient; I am not certain of this.)
>

I haven't formed a firm opinion on the TPA and its use in ebXML transactions, but I am troubled by its size and complexity. How do we implement things such as "it is strongly recommended that the framework implement an end-to-end acknowledgment" (Note, section 2.6.7.3)? Especially, it seems to me that the TPA is present to describe the profiles of two parties, and there is no TPA mandate that the parties SHALL implement some kind of reliability function or other protocol... that's the function of other documents.

MWS: "Strongly recommended" is an informative (non-normative) statement. It may indeed be that a lot of the text in the tpaML proposal really belongs in other documents. Given the scope of ebXML, a document which provides guidance to implementers of the messaging service would be a very valuable document. I view the RosettaNet Implementation Framework document as an example of such a document. I felt it important to capture all these points in my proposal until I could eventually determine where they belong.
As to TPA size and complexity, our experience in IBM Research was that all these elements are needed for B2B between large enterprises. The TP team will need to determine how to structure the specification to not be forbidding to SMEs (e.g. by making virtually all elements optional in the XML sense). In addition, part of the complexity problem can be addressed by a tpa-aware authoring tool which guides the tpa writer. My research team prototyped such a tool.

MWS: Incidentally, Reliable Messaging probably makes it unnecessary to implement the tpaML "strongly recommended" end to end ACK with SMTP. SMTP is one case where reliable messaging is a clear win.

If both MSHs operate on a "persist and ACK" each message, as you describe, then you just need to define if the ACK is a transport-ACK or an RM-ACK. In the latter case, we would use RM functions and set the RM-Group size to 1. Does this make sense?

MWS: The discussion in tpaML relates to a supposed implementation that does not have the reliable messaging function that we are defining. Your proposal sounds good. Some explanatory words would be useful.

>
>Line 173, editor note 12: As discussed earlier, the window count should
>not be visible to the parties. It must be established and managed by the
>message service handlers.

This is not entirely true. The From-Party (see Figure 1) may have valid reasons to tell the Sending MSH that a group of messages must be sent reliably, and it would have nothing to do with the characteristics of the underlying transport. Quite possibly the From-Party is interested to know only when the group of messages was reliably sent. We need to define the interface to the From-Party to lock this down.

MWS: I agree with "send this message reliably". I would prefer that the applications not have to deal with the RM-Group count which, as noted above, I view as a function of the characteristics of the underlying transport and perhaps implementation factors.
I view "send reliably" as something to indicate for each message via the as-yet-undefined BP to TRP service interface. The first message without "send reliably" would terminate the final RM-group without error. That way, the number of messages to send reliably is not dependent on knowing the RM-group size. Saying "send the next N messages reliably" is a problem since after saying "send the next N reliably", the application could take different paths with different numbers of messages. Another alternative would be to turn on reliable messaging once and keep sending until an explicit turning off of reliable messaging. However there is still the need to deal with a short final RM-group.

>2.3.3 Routing Header
>
>Line 179, Editor Note 13: If it is intended that the messages in a single
>window can be from various TPAs and various conversations, then the message
>service instance must be identified. Be careful, however, because the
>latency created by such a window affects all TPAs and conversations,
>especially when retries are performed. If there is a separate message
>service instance for each conversation, then the window can be smaller and
>retries in one window need not delay other conversations. In this case,
>the conversation ID is sufficient to identify the message service instance.

I'm not sure what you are proposing here. RM doesn't know about conversations and other items identified in the Header.

MWS: I agree that RM as currently specified doesn't know about conversations, which is why I am concerned that the RM-group latency affects all conversations. It appears, from the current spec, that once reliable messaging is turned on by one application, it is applied to all concurrently running conversations over that transport channel in all applications, whether they want it or not.

>Regards,
>Marty
>*************************************************************************************
>
>IBM T. J. Watson Research Center
>P. O. B. 704
>Yorktown Hts, NY 10598
>914-784-7287; IBM tie line 863-7287
>Notes address: Martin W Sachs/Watson/IBM
>Internet address: mwsachs @ us.ibm.com
>*************************************************************************************
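
[Editorial illustration] The per-message "send reliably" indication suggested in the rejoinder to editor note 12 can be sketched as follows. This is a hedged illustration only: the BP to TRP service interface is described in the thread as not yet defined, so the function name `group_messages`, the `(payload, send_reliably)` tuple shape, the `max_group_size` parameter, and the "rm-group"/"unreliable" labels are all hypothetical. Flushing a partial group at the end of the input is likewise an assumption.

```python
def group_messages(messages, max_group_size):
    """Partition (payload, send_reliably) pairs into RM-Groups.

    The application never sees the group size: the handler closes a group
    when it is full, or when the first message without the flag arrives,
    which ends a short final group without error.
    """
    batches = []
    current = []
    for payload, send_reliably in messages:
        if send_reliably:
            current.append(payload)
            if len(current) == max_group_size:
                batches.append(("rm-group", current))
                current = []
        else:
            if current:
                # Short final RM-Group: closed early, not treated as an error.
                batches.append(("rm-group", current))
                current = []
            batches.append(("unreliable", [payload]))
    if current:
        # Assumption: end of input also closes any partially filled group.
        batches.append(("rm-group", current))
    return batches


if __name__ == "__main__":
    stream = [("a", True), ("b", True), ("c", True), ("d", False), ("e", True)]
    for batch in group_messages(stream, max_group_size=2):
        print(batch)
```

With this shape, the number of messages sent reliably does not depend on the application knowing the RM-Group size, which is the property argued for above.
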