Subject: RE: TRP Error Handling Spec Draft
Marty

We do want to keep it simple; however, the S/390 environment, where everything was under IBM's control, is different from the ebXML environment, where the operating environment will be much more anarchical. So let's look at some of the issues ...

1. Development by many different vendors - if a vendor is not told what errors his software is generating, he won't fix them at all, or not until much later.
2. Used by SMEs and the mega-corporation - this means that we need a way for SMEs to easily report their problems and get them fixed.

We actually thought about this when developing IOTP, and one of the things we included was an "ErrorLogNetLocn" (see the extract from the spec below). The idea was that you could specify a URL to which errors found in the messages sent by a company could be reported, so that the company could fix them.

In theory, this log location should only be used during testing, as all software, as we know, is bug free in production ;)

David

==============
Extract from IOTP spec ...

ErrorLogNetLocn

Optional. This contains the net location that Consumers should send IOTP Messages that contain Error Blocks with an Error Component with the Severity attribute set to either:
  o HardError,
  o Warning but the Consumer decides to not continue with the transaction
  o TransientError and the transaction has subsequently timed out.

This attribute:
  o must not be present when TradingRole is set to Consumer role,
  o must be present when TradingRole is set to Merchant, PaymentHandler or DeliveryHandler.

The content of this attribute is dependent on the Transport Mechanism see the Transport Mechanism Supplement.

The ErrorLogNetLocn can be used to send error messages to the software company or some other Organisation responsible for fixing problems in the software which sent the incoming message. See section 7.21.1 Error Processing Guidelines for more details.
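
As an illustration only, the two rules in the extract above could be sketched in Python roughly as follows. The function names, the parameter names, and the treatment of trading roles or severities not named in the extract are assumptions made for this sketch; they are not part of the IOTP spec.

```python
from typing import Optional

# Roles for which the extract says ErrorLogNetLocn must be present.
ROLES_REQUIRING_ERROR_LOG = {"Merchant", "PaymentHandler", "DeliveryHandler"}


def error_log_net_locn_is_valid(trading_role: str,
                                error_log_net_locn: Optional[str]) -> bool:
    """Check presence/absence of ErrorLogNetLocn against the quoted rule."""
    if trading_role == "Consumer":
        # Must not be present for the Consumer role.
        return error_log_net_locn is None
    if trading_role in ROLES_REQUIRING_ERROR_LOG:
        # Must be present for Merchant, PaymentHandler, DeliveryHandler.
        return error_log_net_locn is not None
    # Assumption: the extract is silent on any other role, so accept either.
    return True


def should_send_to_error_log(severity: str,
                             consumer_continues: bool = True,
                             timed_out: bool = False) -> bool:
    """Decide whether a Consumer should forward the error message."""
    if severity == "HardError":
        return True
    if severity == "Warning":
        # Only when the Consumer decides not to continue the transaction.
        return not consumer_continues
    if severity == "TransientError":
        # Only when the transaction has subsequently timed out.
        return timed_out
    # Assumption: any other severity value is not forwarded.
    return False
```

For example, should_send_to_error_log("Warning", consumer_continues=False) evaluates to True, while a TransientError that has not timed out is not forwarded.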
-----Original Message-----
From: mwsachs@us.ibm.com [mailto:mwsachs@us.ibm.com]
Sent: Sunday, August 27, 2000 3:40 PM
To: David Burdett
Cc: ebXML Transport (E-mail)
Subject: Re: TRP Error Handling Spec Draft

This proposal is well thought-out and well described. My concerns mainly relate to the KISS principle. Is this a level of complexity and detail that is really needed? An alternative at the far other end of the spectrum is to simply log everything and rely on out-of-band communications for the messaging service instance which received the message in error to notify the messaging service instance which sent it.

When I was working with the IBM Poughkeepsie architecture team on the S/390 fiber optic channels, we accumulated a long list of errors which we wanted reported in a response message. The I/O engineers then informed us that they were not going to analyze all this information. Rather, they would log the error condition and take the most drastic recovery action (called connection recovery) whatever the error condition. The recovery action is simply to terminate the logical and physical connections between the channel and I/O device and let software retry. The principle they were operating on was KISS. They said that the recovery function is the most difficult to test and debug.

2.1.2 Types of errors

Line 21 (2nd bullet, 3rd subbullet): Please add "of a previously sent message" to the end of the line.

Following line 21: What about a category of "an error in a document which is reporting an error"?

2.1.3 When to Generate Error Messages

Line 30 (1st para.): Please change "message in error" to "received message".

Line 30 (1st para.): Please change "a URL to send the message" to "an address to send the message". If the communication protocol is other than HTTP, the address won't be a URL.

NOTE: The IBM tpaML proposal covers only retries of transport timeout errors and business-level errors.
Business-level errors are reported to the sending application using <ExceptionResponse> and a corresponding action at the other partner which receives the exception response.

NOTE: It is not clear where the errors covered in the error-handling proposal should be reported to. They are conditions presumably caused by the sending message service instance and detected by the receiving message service instance. We will need to define some kind of "catcher" for the error messages and define what the catcher is supposed to do.

2.2.1 Communication Protocol Errors

Line 55 (Editor's note): The KISS principle applies here. I suggest retrying communication-level errors at the communication level and reporting all unrecoverable conditions to one place rather than defining a unique recovery for each communication-level error. Aside from simplifying the implementation, it will avoid the need to track the other standards for additional error codes. I speculate that the function to which these conditions are reported will simply log them and process all of them the same way.

2.2.2.1 Identifying the Error Reporting Location

1st paragraph, line 63: Please change "a URL" to "a communication address". The communication protocol will not necessarily be HTTP.

Line 65, by reference: The TPAId is always in the ebXML message header.

Line 66, by value: Please replace "the URL" by "the communication address".

Line 66, by value: A header error will make the value of such a tag untrustworthy. I suggest that we limit ourselves to "by reference" and "implicitly".

Lines 74-76 ("Even if the message in error..."): Determining the error reporting location is not a problem with either "by reference" or "implicitly".

2.2.2.2 Error Reporting Location is Identified

Lines 80-82 (Editor note): For communication errors, replacing an acknowledgment message by an error message is OK (example: HTTP). If the message is sent at the messaging service level, it is not clear where the error message should go.
Possibilities are:
  o All the way to the application (e.g. the exception response message in tpaML).
  o Some messaging service management function that we haven't yet considered. I believe that this is the preferable approach for errors in the header and envelope.

2.2.2.3 Error Reporting Location not Identified

Lines 84-86 (1st para.): This is reporting the error at the transport level.

Line 87, editor's note: Specifying rules to apply to each transport requires defining the layer of function that invokes the transport protocol.

Lines 90-91 (If a suitable return address is not identified): This is my preferred approach for error processing in general. If we define a messaging service management function, then the error information could be sent to the management function at the sender and logged there, as well as being logged at the recipient messaging service management function.

2.2.3 Transient Errors

Line 92 (title): Please change the title to "Recoverable Errors" and use the term "Recoverable" instead of "transient" throughout. "Recoverable" means that retry is recommended. "Transient" means that the error did not recur, which can only be known after the retry succeeds.

Lines 105-106 ("If the ebXML messaging Services is unable to respond..."): If the messaging service is unable to respond within the timeout period, it is probably unable to send the message which says it is unable to respond.

Line 107 (determining the timeout period): The TPA is one place to specify the timeouts.

2.3.2 ebXML Error Document

Lines 136-137 (Editor's note, list item 2): Since correcting the errors will require human intervention, logging the error condition should be sufficient. As noted above, the sender needs to be notified of the error condition. This notification could be outside the scope of the specification.
Alternatively, the error message structure defined here could be used for that purpose as long as it is sent to a management function at the sender and does not require executing a protocol to evaluate the error message as part of the exchange of messages between the parties.

Lines 139-140 (list item 4): I completely agree.

2.3.5 ebXML Error Document Type Definition

Lines 256-258 (Editor's note): Standard practice that I am familiar with is to terminate the operation on the first error detected and report only that error.

2.4.2 Non-XML Document Errors

Lines 330 (MsgTooLarge), 332 (MIMEError), 333 (MimePartMiss), and 336 (MimePartUnx): These conditions appear to be application-level errors. These could be reported directly to the sending application using a mechanism like <ExceptionResponse> as defined in the tpaML proposal.

Regards,
Marty

*************************************************************************************
Martin W. Sachs
IBM T. J. Watson Research Center
P. O. B. 704
Yorktown Hts, NY 10598
914-784-7287; IBM tie line 863-7287
Notes address: Martin W Sachs/Watson/IBM
Internet address: mwsachs @ us.ibm.com
*************************************************************************************

David Burdett <david.burdett@commerceone.com> on 08/18/2000 08:17:19 PM

To: "ebXML Transport (E-mail)" <ebxml-transport@lists.ebxml.org>
cc:
Subject: TRP Error Handling Spec Draft

Folks

I attach some light reading for the weekend. Alternatively called, the TRP Error Handling draft spec version 01 ;)

David

<<ebXML TRP Error Handling draft 01.doc>> <<ebXML TRP Error Handling draft 01.pdf>>

Product Management, CommerceOne
4400 Rosewood Drive 3rd Fl, Bldg 4, Pleasanton, CA 94588, USA
Tel: +1 (925) 520 4422 (also voicemail); Pager: +1 (888) 936 9599
mailto:david.burdett@commerceone.com; Web: http://www.commerceone.com