perfsonar-dev - [pS-dev] Conveying Errors - From SOAP/HTTP, perfSONAR, or Both?

Subject: perfsonar development work


From: Jason Zurawski <>
To: "" <>
Subject: [pS-dev] Conveying Errors - From SOAP/HTTP, perfSONAR, or Both?
Date: Thu, 26 Aug 2010 08:10:39 -0400
Organization: Internet2

All;

Since this list rarely has traffic anymore, it's time to use it to discuss a hard problem that is not yet solved uniformly across the different perfSONAR implementations: the proper way to bubble an error up from the service level.

The motivation for this is that an outside developer from the LHC community was using libraries from one effort (perfSONAR-PS) to talk to a service from another (HADES), and saw some unexpected results in the case of an error (e.g. the client software was expecting to get XML, but the service returned text/html content and an HTTP status code).

I think this situation is to be expected, and does not indicate a failure on anyone's part - it just calls out for some more communication and coordination on BCP for service and client design to ensure interoperability. For this particular problem there needs to be:

1) A uniform system of error codes, either through adoption of existing standards or creation of new guidelines (something that is being explored in the OGF).

2) A uniform way to implement error handling in all services.

I will address only #2 here for now. If anyone wants to be involved in discussing #1 I would encourage you to look into the OGF group (https://forge.gridforum.org/projects/nmc-wg). There are 2 mechanisms in place to return information and eventually handle the different situations:

1) The HTTP layer has a well-known way to expose status and a rich set of error codes (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html) that are known to all who implement the protocol. SOAP has a built-in way to convey errors (http://www.w3.org/TR/soap12-part1/#soapfault) as well.

The purpose of these should be to return a message from a server to a client, and relay the outcome of the original request message. E.g. if things are functioning normally, the standard response is '200 OK'. If the server is in a crisis, '500 Internal Server Error' may come back.

2) perfSONAR Messages convey status and human-readable messages in the event of errors. The current array of errors (https://forge.gridforum.org/sf/wiki/do/viewAttachment/projects.nmc-wg/wiki/20100527Notes/proposal-of-mappings-20100527v1.txt) is being converted into a more uniform system (https://forge.gridforum.org/sf/sfmain/do/downloadAttachment/projects.nmc-wg/wiki/20100415Notes?id=atch4885).

The original intent here was to convey messages from the service itself using the well-known message structure, for instance when a message is ill-formed or not supported ("error.common.action_not_supported"), or when the underlying database cannot be contacted ("error.ma.storage"). This simplified client design - a client could always expect to get back well-formed XML for parsing.

I will lay out two strawman proposals; comments from service and client developers weighing the strengths and weaknesses of both are *required* if we are to make progress. This group must converge on appropriate action soon if we are to further adoption and expansion.

1) 'Ignore' the HTTP/SOAP errors and rely solely on the perfSONAR protocols. This would force all services to always return the '200 OK' response at the HTTP level, not fill in additional SOAP error codes, and *always* return a perfSONAR formatted XML message indicating some form of error.

a) This is good for service developers because it should be a straightforward way to relay information - all error conditions are alike (perfSONAR XML is returned), only the codes change. It may be challenging to ensure that a success code is always sent back from the HTTP layer though, especially when using third-party applications (e.g. web containers like jetty/tomcat or libraries like libwww).

b) This is good for client developers since they can always expect a '200 OK' status code, and always parse the subsequent response as a perfSONAR XML message. Without this they would need to guess what the resulting content may be (e.g. text/html perhaps instead of text/xml) based on HTTP codes.
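To make the client side of strawman 1 a little more concrete, here is a minimal Python sketch of a client that treats every response body as perfSONAR XML and looks only at the event types inside it. The namespace URI, the sample message layout, and the function names are my assumptions for illustration and not taken from any particular implementation; only the "error.ma.storage" code comes from the discussion above.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumed NMWG base namespace; adjust to whatever the service really emits.
NMWG = "http://ggf.org/ns/nmwg/base/2.0/"


def extract_event_types(body: str) -> List[str]:
    """Return the text of every eventType element found in the response body."""
    root = ET.fromstring(body)  # raises ET.ParseError if the body is not XML
    tag = "{%s}eventType" % NMWG
    return [el.text.strip() for el in root.iter(tag) if el.text]


def find_errors(body: str) -> List[str]:
    """Return only the event types that use the 'error.' prefix."""
    return [e for e in extract_event_types(body) if e.startswith("error.")]


# Hypothetical, heavily simplified error response for illustration only.
SAMPLE = """<nmwg:message xmlns:nmwg="http://ggf.org/ns/nmwg/base/2.0/">
  <nmwg:metadata id="m1">
    <nmwg:eventType>error.ma.storage</nmwg:eventType>
  </nmwg:metadata>
</nmwg:message>"""

if __name__ == "__main__":
    print(find_errors(SAMPLE))
```

The point of the sketch is that the client never branches on the HTTP status; the only failure mode outside the perfSONAR codes is a body that does not parse as XML at all, which is exactly the case the outside developer ran into.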

2) Use both. A "503 Service Unavailable" is really the same as "error.ma.storage Database Not Found". There would need to be a very tight coupling between the two sets of messages, however. We may also need requirements on whether XML content must always be returned for given result codes, or whether it would ever be ok to not return a perfSONAR XML message (e.g. in the case of catastrophic failures).

a) This would cause more work for service developers, who would need to map the two sets of codes to each other, but does allow for supporting third-party HTTP containers.

b) This can be positive for clients in that the 'good' status codes will be linked to parsing routines for perfSONAR XML. 'Bad' status codes can simply be sent to logging and not parsed for content at all. Complications may arise around what content to expect, e.g. is it correct to assume a '5XX' code would contain perfSONAR XML? Ideally the developers should ensure this, but due to third-party HTTP containers this may not be possible.
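As a rough sketch of what strawman 2 might involve on both sides, the Python below shows a service-side mapping table and a client-side dispatch. Only the pairing of "error.ma.storage" with 503 comes from the text above; the other table entry, the fallback to 500, the list of XML content types, and all function names are assumptions made for illustration.

```python
from typing import Tuple

# Service side: hypothetical mapping from perfSONAR event types to HTTP statuses.
EVENT_TO_HTTP = {
    "error.ma.storage": 503,                   # pairing suggested in the text
    "error.common.action_not_supported": 400,  # illustrative guess only
}

# Content types a client might accept as perfSONAR XML (an assumption).
XML_TYPES = ("text/xml", "application/xml", "application/soap+xml")


def http_status_for(event_type: str) -> int:
    """Pick an HTTP status for an event type; unmapped errors fall back to 500."""
    if not event_type.startswith("error."):
        return 200
    return EVENT_TO_HTTP.get(event_type, 500)


def handle_response(status: int, content_type: str, body: str) -> Tuple[str, str]:
    """Client side: return ('parse', body) or ('log', message).

    The body is parsed only when the content type says it is XML, so an
    HTML error page from a third-party container is logged, not parsed.
    """
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in XML_TYPES:
        return ("parse", body)
    return ("log", "HTTP %d with non-XML body (%s)" % (status, media_type))


if __name__ == "__main__":
    print(http_status_for("error.ma.storage"))
    print(handle_response(500, "text/html", "<html>Server Error</html>"))
```

Keying the client decision on the content type rather than on the status code alone is one way to sidestep the '5XX' question; whether that is acceptable is part of what needs to be decided here.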

Comments welcome, thanks;

-jason
