ebxml-transport message

Subject: Comments on Reliable Messaging Spec, v0-74, 16 Aug. 2000
From: mwsachs@us.ibm.com
To: Jim Hughes <jfh@fs.fujitsu.com>
Date: Fri, 25 Aug 2000 02:00:38 -0400

2.1 Basic Concepts

Line 91, Fig. 2-1:  The sender and receiver balloons are in locations which
make it unclear what they refer to.  Please move them either below the MSH
boxes or to the outside edges of the MSH boxes and have the balloons point
directly to the message service handler boxes.

Line 98:  Please delete "globally unique".  The uniqueness is discussed in
the main body of the Messaging Service specification, to which I submitted
comments on its meaning.  Whatever the meaning, there is no need to repeat
the phrase here.

Line 102:  Please specify the string to be used for "not reliable"
transmission.

Line 112, Editor Note 3:  I would prefer that we prescribe that the RM
group size is determined by the Messaging Service based on the properties
of the underlying transport and is transparent to the From and To parties.
In my previous comments, I suggested a procedure for the From parties to
turn reliable messaging on and off.

   NOTE:  Since there may be multiple From parties sending messages into
   the same RM group, there also have to be rules on what happens if the
   parties have incompatible off/on requests.  The problem here is that
   resolving those incompatibilities requires understanding the
   conversation states;  otherwise if party A turns reliable messaging on
   and party B turns reliable messaging off, there is no way of knowing
   which will send further messages.  This suggests that the proper way is
   simply to indicate reliable or unreliable for each message.  Only the
   ones marked reliable go into the RM group at the sender and receiver.
   However a warning is also needed that if reliable and unreliable messages
   are mixed within the same conversation, the unreliable messages may be
   (probably will be) delivered out of sequence since they don't wait for
   completion of a group.
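
To illustrate the warning above, here is a minimal Python sketch of a
receiver that holds reliable messages in an RM group and passes unreliable
ones straight through.  The Message class, the per-message "reliable" flag
and the receive() function are hypothetical names chosen for illustration;
none of them come from the specification.

```python
from dataclasses import dataclass


@dataclass
class Message:
    conversation: str
    seq: int          # position within its conversation
    reliable: bool    # hypothetical per-message reliability indicator


def receive(messages, group_size):
    """Return messages in the order they would reach the To parties."""
    delivered, group = [], []
    for msg in messages:
        if msg.reliable:
            group.append(msg)            # held until the RM group completes
            if len(group) == group_size:
                delivered.extend(group)
                group = []
        else:
            delivered.append(msg)        # bypasses the group, delivered at once
    return delivered


sent = [Message("A", 1, True), Message("A", 2, False), Message("B", 1, True)]
order = [(m.conversation, m.seq) for m in receive(sent, group_size=2)]
print(order)
```

In this run the unreliable second message of conversation A reaches the To
party before the reliable first message, which is the out-of-sequence
delivery described in the note.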

Line 137:  Please delete "globally unique".  See comment to line 98.

Line 140:  The Receiver shall not deliver a message to the higher
processing level until it has received the RM-group acknowledgment.
Reasons are given in comments below, such as the comment to line 207.

Lines 145-148:  If the final message is lost, the receiver won't send the
RM-group acknowledgment or error message since it won't receive a message
with the RM-group count greater than zero.  The sender must therefore
re-send the entire RM group.

Line 14, Editor Note 6:  I have added to my TPA suggestions list the need
to specify the number of retries and retry interval for the RM-group ACK,
just as is done for the transport-level ACK and business-level replies.

Line 152, Editor Note 7:  I suggest deleting this note unless some
information to be inquired about has been identified.

Line 157:  Editor Note 8:  There should be no need to report completion of
a group to the From party.  When the group is complete, the To parties will
receive the messages and the business-level response to each message will
eventually be sent.

URGENT NOTE:  This protocol has the strong possibility of a deadlock.  In
most cases, the business-level protocol within a conversation is
request-response, with the next message not sent until the previous
response has been received.  Since, as discussed in various comments in
this posting, the messages in the group cannot be passed upward until the
group is complete, there can be at most one message from each conversation
using that path in any group.  The reason is that the conversations cannot
advance until the From parties have received the application-level response
messages.   But no response messages will be forthcoming until the RM group
is completed.  It will never be completed if the number of conversations is
less than the group size. If, for example, the group size is 20 but there
are only 10 conversations in progress, there can be at most 10 messages
outstanding from the From system at any moment.  Once the 10th message is
received into the RM group, all 10 conversations are waiting for responses.
Therefore, the sender will pause waiting for more messages from the From
parties but no more messages will come from the From parties.  The receiver
will not receive the final message of the group and will wait forever since
it is the sender that times out.
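
The deadlock can be reproduced with a small simulation.  This is only a
sketch under the assumptions stated in the note: every conversation is
strictly request-response, and no response is produced until the RM group
completes.  The function name simulate() and its bounded loop are
illustrative choices, not part of the specification.

```python
def simulate(group_size, conversations):
    """Return (groups_completed, deadlocked) for strict request-response
    conversations whose responses arrive only after the RM group completes."""
    waiting = set()      # conversations blocked waiting for a response
    in_group = 0         # messages in the current, incomplete RM group
    completed = 0
    for _ in range(3 * group_size):           # bounded number of send attempts
        ready = [c for c in range(conversations) if c not in waiting]
        if not ready:
            return completed, True            # nobody can send: deadlock
        waiting.add(ready[0])                 # that conversation sends one message
        in_group += 1
        if in_group == group_size:            # group complete: deliver and respond
            completed += 1
            in_group = 0
            waiting.clear()
    return completed, False


print(simulate(group_size=20, conversations=10))   # the example in the note
print(simulate(group_size=20, conversations=20))
```

With a group size of 20 and 10 conversations, the simulation stalls after
the 10th message without completing a single group.  With 20 conversations
the groups complete normally.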

2.2.1 Message Header - Message Data Element

Line 161:  missing period.

2.2.3  Routing Header

Line 174, Editor Note 11:  We should either identify good reasons to have
this information or delete the note.  Since the Reliable Messaging function
does not cross group boundaries, I can see no reason for the group ID.  It
should be stated that the sender shall not start the next group until it
has received the RM group complete acknowledgment from the previous group.

Lines 177-181.  Tables 2-1 and 2-2:  Since Sequence Number and RM-Group
Count are mandatory in all messages, these two tags are mandatory and not
"additional".   Please combine the two tables into one.

2.3 Message Transfer Sequence

Line 185-186:  Please delete the last sentence of the paragraph.  The
messages in a group shall not be passed upward until the group is complete.
Since the check for missing messages is made only when the last message of
the group is received, passing the received messages upward before the
group is complete would cause the retried messages to be out of sequence as
seen by the To parties.  That should be unacceptable.  An implementation
might choose to store the messages in a higher level store until the group
is complete but the To parties must not be notified of their presence
until the group is complete.

   NOTE:  Per the above "urgent note", the out of sequence situation can
   only occur if reliable messages which have no business-level responses
   are being sent since that is the only case where more than one message
   from the same conversation could be in the same RM-group.  The out of
   sequence condition is significant only if it causes messages within the
   same conversation to be out of sequence. However it is best not to ask for
   trouble.

Lines 206-213, Acknowledgment by receiver:  This rule makes it clear that
messages shall not be passed to the To parties until the group is complete
since it states that the test for missing messages is made at the end of
the group.

NOTE:  I am saying "Parties" and not "Party" because the messages in the
group may belong to more than one party, i.e. more than one TPA.

Line 211:  Please replace "Sender; otherwise..." by "Sender and all
messages received into the group are passed upward to the To parties;
otherwise..."

Lines 215-216:  please replace "is the appropriate...messages," by
"indicates that all the messages in the group were received,".
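
The receiver behavior argued for in these comments (buffer every message,
test for missing messages when the final message arrives, and pass
everything upward only when the group is complete) could be sketched as
follows.  This is an illustration, not the specified algorithm.  It assumes
that sequence numbers start at 1 and that only the final message carries an
RM-group count greater than zero; the class and method names are
hypothetical.

```python
class GroupReceiver:
    def __init__(self):
        self.buffer = {}        # sequence number -> payload, not yet delivered
        self.expected = None    # group size, known once the final message arrives

    def on_message(self, seq, payload, group_count=0):
        """Return ("ack", payloads), ("error", missing_seqs), or None."""
        self.buffer[seq] = payload
        if group_count > 0:                  # final message of the group
            self.expected = group_count
        if self.expected is None:
            return None                      # keep buffering, deliver nothing
        missing = [n for n in range(1, self.expected + 1)
                   if n not in self.buffer]
        if missing:
            return ("error", missing)        # ask the sender to re-send these
        delivered = [self.buffer[n] for n in range(1, self.expected + 1)]
        self.buffer, self.expected = {}, None
        return ("ack", delivered)            # pass upward in sequence order


rx = GroupReceiver()
first = rx.on_message(1, "a")
second = rx.on_message(3, "c", group_count=3)   # message 2 was lost
third = rx.on_message(2, "b")                   # re-sent message arrives
print(first, second, third)
```

Note that the acknowledgment is returned only after the re-sent message
arrives, and that the To parties see the payloads in sequence order.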

Line 229-230, Editor Note 12:  The answer is that the sender must time out
and resend the entire group.  I believe that this point is covered by other
comments in the posting and no changes are needed except for addressing
those comments.
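
A minimal sketch of the sender side, assuming the number of retries is
agreed in the TPA: if no RM-group acknowledgment arrives before the
timeout, the whole group is sent again.  The transmit callback is a
hypothetical stand-in for the real transport plus timeout; it returns True
when the acknowledgment is received in time.

```python
def send_group(group, transmit, max_retries):
    """Send the entire group, re-sending all of it after each timeout.
    Returns the number of transmissions used."""
    for attempt in range(1 + max_retries):
        if transmit(group):          # True means the RM-group ack arrived in time
            return attempt + 1
    raise TimeoutError("no RM-group acknowledgment after all retries")


# Simulated transport: the first two transmissions time out.
outcomes = iter([False, False, True])
attempts = send_group(["m1", "m2"], lambda group: next(outcomes), max_retries=3)
print(attempts)
```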

2.5  Detection of Repeated Messages by the Receiver

Line 231-249:  Please identify this section as informative (non-normative).

Line 232-233:  Please replace "using Message...is implementation dependent.
However, an effective..." by "may be by using Message Identifiers or
Sequence Numbers.  An effective..."
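
For illustration only, duplicate detection keyed on the Message Identifier
can be as simple as remembering the identifiers already accepted.  The
class name is hypothetical, and a real implementation would also need some
rule for discarding old identifiers.

```python
class DuplicateFilter:
    def __init__(self):
        self.seen = set()            # Message Identifiers already accepted

    def accept(self, message_id):
        """Return True for a new message, False for a repeat."""
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        return True


dup = DuplicateFilter()
results = [dup.accept("msg-1"), dup.accept("msg-2"), dup.accept("msg-1")]
print(results)
```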

2.6 Reliable Messaging Acknowledgment and Error Messages

2.6.1  General

Line 258-259:  The number of retries can be prescribed in the TPA or
otherwise by agreement between the parties.

Following line 260:  Informative text is needed which gives guidance as to
how the use of the reliable messaging protocol may depend on the transport
protocol. For example, it could be recommended not to specify reliable
messaging if the transport protocol itself has a reliable messaging
function.  An editorial note stating intent to supply this text should be
sufficient for now.

   With a TPA, reliable messaging could be specified in the TPA and an
   authoring or installation tool could check this specification for
   consistency with the selected transport protocol.
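
The kind of consistency check such a tool might perform is sketched below.
Everything here is hypothetical: the dictionary keys, the check_tpa()
function and the set of transports treated as already reliable are
placeholders, not names from the TPA or the messaging service
specification.

```python
# Hypothetical list of transports that already provide reliable delivery.
RELIABLE_TRANSPORTS = {"example-reliable-queue"}


def check_tpa(tpa):
    """Return a list of warnings for a TPA given as a plain dictionary."""
    warnings = []
    if tpa.get("reliable_messaging") and tpa.get("transport") in RELIABLE_TRANSPORTS:
        warnings.append(
            "reliable messaging is specified, but the selected transport "
            "already has a reliable messaging function"
        )
    return warnings


redundant = check_tpa({"transport": "example-reliable-queue",
                       "reliable_messaging": True})
clean = check_tpa({"transport": "http", "reliable_messaging": True})
print(redundant, clean)
```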

2.6.2 Reliable Messaging Formats

Following line 265:  The reliable messaging specification should eventually
be incorporated into the messaging service specification.  Since the
reliable messaging function shall be provided by all implementations,
keeping the two specifications separate will be confusing.

Line 277, TPAId and ConversationId: since messages from multiple TPAs and
conversations may be mixed in one group, there is no value to prescribing
specific values for these tags.  Please state that the values of these tags
are not significant.  A single character should satisfy the parser.

Line 278, ServiceInterface and Action:  These tags are not optional - see
the schema document in the messaging service spec. (The DTD in that spec is
missing the TPAInfo tag).

2.6.3 Error: Missing Message(s)...

Line 283-284:  The payload should be a proper XML document.

Line 295-298:  Please add that the receiver must send a RM-group
acknowledgment after all the re-sent messages are successfully received.

Lines 314-319, Editor Note 15:  I believe that this is covered by comments
above and the note can be deleted.

4.1 Reliability in Routing...

Lines 331-333:  Good question.  I suspect that reliable messaging should be
hop by hop, not end to end.  The rules would be mostly the same as now in
the spec except that for an intermediate node, the messages in a group are
sent to the next node when the group is complete rather than being passed
upward to the application.
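
A rough sketch of the hop-by-hop idea: each node collects a group, and when
the group is complete an intermediate node starts a new group toward the
next node, while the final node passes the messages upward.  The Node class
and the is_last flag are hypothetical simplifications.  Missing-message
handling and acknowledgments are omitted.

```python
class Node:
    def __init__(self, name, next_node=None):
        self.name = name
        self.next_node = next_node   # None means this is the final node
        self.pending = []            # messages of the current, incomplete group
        self.delivered = []          # passed upward (final node only)

    def receive(self, payload, is_last):
        self.pending.append(payload)
        if not is_last:
            return                   # group not complete: hold everything
        group, self.pending = self.pending, []
        if self.next_node is None:
            self.delivered.extend(group)         # pass upward to the application
        else:
            for i, item in enumerate(group):     # new group on the next hop
                self.next_node.receive(item, i == len(group) - 1)


end = Node("end")
mid = Node("mid", next_node=end)
mid.receive("a", is_last=False)
held_at_end = list(end.pending)      # nothing forwarded yet
mid.receive("b", is_last=True)
print(held_at_end, end.delivered)
```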

Regards,
Marty


*************************************************************************************

Martin W. Sachs
IBM T. J. Watson Research Center
P. O. B. 704
Yorktown Hts, NY 10598
914-784-7287;  IBM tie line 863-7287
Notes address:  Martin W Sachs/Watson/IBM
Internet address:  mwsachs @ us.ibm.com
*************************************************************************************