perfsonar-dev - Re: Lookup Service performance tests (1)

Subject: perfsonar development work

From: "Jeff W. Boote"
To: Maciej Glowiak
Cc: Roman Lapacz, Loukik Kudarimoti, Vedrin Jeliazkov, Martin Swany, Jason Zurawski, Nicolas Simar, Szymon Trocha, Eric Boyd
Subject: Re: Lookup Service performance tests (1)
Date: Tue, 11 Jul 2006 14:49:54 -0600

Maciej Glowiak wrote:
---------------------------------------------------------------------

Conclusions:

1. Queries that produce a lot of result data took more time (which is
obvious, of course), so it is better to ask the LS for a smaller set
of data.

I suspect that once you are querying the LS from across the network, you will find that there is a 'sweet spot' size that is in fact probably a little larger than what you currently call a 'smaller set'. That size will also be at least somewhat dependent on the RTT between the client and the server.

Basically, the propagation delay for the TCP handshake could end up consuming a large enough fraction of the overall request/response time, from the client's point of view, that you will want the response to be large enough that you don't have to do TCP setup again. For example, if jumbo packets are being used, I would expect all results of up to about 9K to take about the same amount of time from the client's perspective.
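To make the argument above concrete, here is a rough back-of-the-envelope sketch. It is a deliberately simplified model, not a measurement: it assumes one RTT for the TCP handshake, one RTT for the request/response exchange, and a fixed link bandwidth, and it ignores slow start, packet loss and server processing time. The function name and the sample numbers (100 ms RTT, 1 Gbit/s) are hypothetical.

```python
def estimated_response_time(rtt_s, payload_bytes, bandwidth_bps, new_connection=True):
    """Rough client-side time for one request/response, in seconds.

    Assumption: one RTT for the TCP handshake (only when a new connection
    is opened), one RTT for the request/response exchange, plus the time
    needed to push the payload through the link. Slow start, loss and
    server processing time are ignored.
    """
    round_trips = 2 if new_connection else 1
    return round_trips * rtt_s + (payload_bytes * 8) / bandwidth_bps


if __name__ == "__main__":
    rtt = 0.100        # 100 ms round trip, hypothetical
    bandwidth = 1e9    # 1 Gbit/s link, hypothetical
    for size in (1_000, 9_000, 1_000_000):
        t = estimated_response_time(rtt, size, bandwidth)
        print(f"{size:>9} bytes: {t * 1000:.3f} ms")
```

Under these assumed numbers, a 1 KB and a 9 KB response differ by well under a millisecond, while the round trips alone cost 200 ms, which is why asking for very small result sets across a wide-area link may not pay off.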

2. For smaller queries, one of the most significant costs is the
conversion to NMWG. It's unnecessary overhead in this case, because
LSQuery only needs to extract the XQuery expression from the request.
I don't want to discuss here whether we should remove NMWG or not, but
the fact is that the time taken by this conversion is significant.

In point 1, you state that large queries take more time. Unless the conversion to NMWG takes more than linear time (and I would expect it to be better than linear), I think concentrating on it here is misplaced. If you read the previous threads, you will see that we have discussed this topic several times.

If I'm not mistaken, we all agree that for the LS case, conversion to nmwg is less efficient. However, there are many other cases where it is more efficient (especially if you don't convert to DOM at all). Unless you have a more compelling reason than LS performance (and you have already said that the cost is pretty much fixed for that case), my opinion is that it is easier to use the nmwg classes.

If we need to change anything here, my view is that we should change the parsing model rather than the representation of the marshaled objects. With an on-demand parsing model, it would be possible to detect that a sequence of XML elements is being parsed within an LS message type, and that its contents should simply be handled as a string for XPath/XQuery parsing. This would eliminate the nmwg overhead for LS message types.

Then, for other message types, the nmwg classes could still be used, because for many of them they are more efficient and much easier to use and understand.
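The on-demand idea can be sketched roughly as follows. This is only an illustration in Python using the standard `xml.sax` module, not the perfSONAR code: the message type name `LSQueryRequest`, the `subject` element that carries the query, and the `urn:example:*` namespaces are all assumptions made up for the example. The handler looks at the root element's `type` attribute first; for an LS message it collects the query text as a plain string and builds no objects, and for anything else it stops parsing immediately so that the message can be handed to the full (nmwg-style) parser.

```python
import xml.sax

LS_MESSAGE_TYPES = {"LSQueryRequest"}   # hypothetical type name
QUERY_ELEMENT = "subject"               # hypothetical element holding the XQuery


class NotLookupMessage(Exception):
    """Raised to abort the cheap pass when the message is not an LS message."""


class OnDemandHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.message_type = None
        self._depth = 0
        self._in_query = False
        self._chunks = []

    def startElement(self, name, attrs):
        self._depth += 1
        if self._depth == 1:
            # Root element: decide how the rest of the message is handled.
            self.message_type = attrs.get("type")
            if self.message_type not in LS_MESSAGE_TYPES:
                raise NotLookupMessage()
        elif name.split(":")[-1] == QUERY_ELEMENT:
            self._in_query = True

    def endElement(self, name):
        if name.split(":")[-1] == QUERY_ELEMENT:
            self._in_query = False
        self._depth -= 1

    def characters(self, content):
        # Keep the query as raw text; no object model is built for it.
        if self._in_query:
            self._chunks.append(content)

    @property
    def query(self):
        return "".join(self._chunks).strip()


def dispatch(xml_bytes):
    """Return (strategy, message_type, query_string_or_None)."""
    handler = OnDemandHandler()
    try:
        xml.sax.parseString(xml_bytes, handler)
    except NotLookupMessage:
        return ("full-parse", handler.message_type, None)
    return ("string-only", handler.message_type, handler.query)


LS_MESSAGE = (
    b'<nmwg:message xmlns:nmwg="urn:example:nmwg" type="LSQueryRequest">'
    b'<nmwg:metadata id="q1">'
    b'<xquery:subject xmlns:xquery="urn:example:xquery">'
    b'for $m in //nmwg:metadata return $m'
    b'</xquery:subject></nmwg:metadata></nmwg:message>'
)
OTHER_MESSAGE = (
    b'<nmwg:message xmlns:nmwg="urn:example:nmwg" type="SetupDataRequest">'
    b'<nmwg:metadata id="m1"/></nmwg:message>'
)

if __name__ == "__main__":
    print(dispatch(LS_MESSAGE))
    print(dispatch(OTHER_MESSAGE))
```

The point of the design is that the decision is made on the first start tag, so the representation of the marshaled objects does not have to change; only the moment at which they are (or are not) built does.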

3. Another significant time (which wasn't measured here directly, sorry)
is initialization of eXist DB XML StorageManager. In Tomcat this
StorageManager is initialized only once.

4. The most significant time is spent querying the eXist DB server. We
can't speed it up with the old StorageManager, which uses XML:DB for
communication (the default way of communicating with the DB server
from Java), so I wrote a new HTTP access layer for the DB server,
which is much faster for smaller queries. Additional tests will be
provided soon, but the code is already in the SVN repository.

Is there any downside? What functionality is provided by XML:DB that is not provided by your HTTP access? Do we care about any of that functionality? This sounds like really good, useful work to me. But, I would like to know what (if anything) we are giving up.
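For readers unfamiliar with what "HTTP access" to the database could look like, here is a minimal sketch. It assumes the REST-style interface of eXist, where an XQuery is passed in a `_query` URL parameter and `_howmany` limits the number of returned items. Whether the new StorageManager actually uses this interface is not stated in this thread; the host, port, collection path and function name below are hypothetical. The sketch only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_query_url(base_url, collection, xquery, how_many=10):
    """Build a GET URL for an XQuery against a collection.

    Assumption: the server accepts `_query` and `_howmany` as URL
    parameters on its REST endpoint; base_url and collection are
    placeholders chosen for illustration.
    """
    params = {"_query": xquery, "_howmany": how_many}
    return f"{base_url.rstrip('/')}/{collection.strip('/')}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url("http://localhost:8080/exist/rest", "/db/ls", "//nmwg:metadata"))
```

A plain HTTP GET like this gives up whatever the XML:DB API offers beyond simple queries (for example its collection and resource management abstractions), which is the kind of trade-off the questions above are asking about.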

--------------------------------------------------------------------------

Bottlenecks:

The two main bottlenecks (apart from Axis/Tomcat, which wasn't tested this time) are communication with eXist DB through the XML:DB API (see conclusions 3 and 4) and conversion to NMWG (see conclusion 2).

--------------------------------------------------------------------------

Required improvements:

1. StorageManager (use new HTTP Storage Manager)

I'd like a few more details, but this sounds absolutely reasonable.

2. NMWG (?)

As I said above, I don't believe the problem is the nmwg classes specifically, but rather the parsing model we are using.

jeff

Lookup Service performance tests (1), Maciej Glowiak, 07/11/2006
- Re: Lookup Service performance tests (1), Jeff W. Boote, 07/11/2006
  - Re: Lookup Service performance tests (1), Maciej Glowiak, 07/12/2006
    - Re: Lookup Service performance tests (1), Roman Lapacz, 07/12/2006
      - Re: [pS-dev] Re: Lookup Service performance tests (1), Maciej Glowiak, 07/12/2006
