perfsonar-dev - Re: [pS-dev] Re: Topics for the next meetign in Berkeley

Subject: perfsonar development work


From: "Jeff W. Boote" <>
To: Nicolas Simar <>
Cc: Jochen Reinwand <>, Peter Holleczek <>, "Eric L. Boyd" <>, Szymon Trocha <>, Loukik Kudarimoti <>, Matthias Hamm <>, Mark Yampolskiy <>, Klaus Ullmann <>
Subject: Re: [pS-dev] Re: Topics for the next meetign in Berkeley
Date: Mon, 13 Aug 2007 19:14:04 -0600

Nicolas Simar wrote:

2. For Hades a lot of things, and I really mean a LOT of things need to be done if some special requirements have to be met for the LHC commitment. But at the moment there is quite a lot of obscurity about the actual requirements. Time for choosing the "right" developments to be done is running out! I rather would start development before the Berkeley meeting, not afterwards ;-)

Seems that the current solution suggested to LHC is the one of an appliance. (Hades for the time being).

Why would an appliance imply "Hades for the time being"? I'm not saying that it shouldn't be Hades, but that seems like a jump without justification.

As far as I remember from the LHCOPN meeting there was discussion about the fact that a LOT depended upon what software was actually on the host. And there seemed to be a bit of concern about how much local control would be possible.

I believe Loukik left with an action item to come up with a more detailed description about what would be offered in an appliance. (Loukik, am I remembering correctly?) I specifically talked to Loukik about the possibility of including the pingerMA and possibly even some of our SNMP MP/MA code (because we have some SNMP collection code) on the appliance.

jeff

P.S. I'm going to go into some specific details here regarding the Hades/AMI-OWAMP solutions for one-way delay. For those of you more interested in the big-picture than the details, you might want to tune out now. :)

Generally, I agree with what Jochen said. Integration of the two systems does not make a whole lot of sense. They are really set up with very different assumptions and goals.

To be honest - I believe the easiest solution for LHC is to include both. It would not be too difficult to include two interfaces on the host and run both AMI/owamp and Hades. (The AMI code is already part of the LHC bundle to support the bwctl regularly scheduled tests anyway. And, owamp is almost certainly the easiest solution for the on-demand tests. And, if it is running on a different interface - I suspect Hades would be much happier.)

Hades and AMI/owamp really do give you different data. And, I don't see how you could run them both on the same interface unless Hades was more forgiving of other traffic. I saw that Jochen thought they might be forgiving of *some* additional traffic for on-demand, but I'm fairly certain they would not be as forgiving of the amount of traffic that AMI/owamp generally produces.

AMI/owamp runs continuous streams of 10 packets per second from every sender to every receiver (with exponentially distributed gaps between packets). There is no real 'schedule' because it is continuous. Basically, I don't see a reasonable way to come up with a 'schedule' to accommodate both testing methodologies on a single interface.
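To make the "continuous, no schedule" point concrete, here is a minimal sketch of how send times with exponentially distributed gaps could be generated at an average of 10 packets per second. The function name and the 60-second window are made up for illustration; this is not the owamp implementation, only the general shape of such a stream.

```python
import random


def poisson_send_times(rate_pps, duration_s, seed=None):
    """Return send offsets in seconds whose gaps are exponentially distributed.

    rate_pps is the average packet rate; the gaps average 1 / rate_pps seconds.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate_pps)  # first gap
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_pps)  # next gap, independent of the last
    return times


# Hypothetical example: one minute of a 10 packets/second stream.
schedule = poisson_send_times(rate_pps=10.0, duration_s=60.0, seed=1)
print(len(schedule), "packets in 60 s")
print("first gaps (s):", [round(b - a, 3) for a, b in zip(schedule, schedule[1:6])])
```

Because every gap is drawn independently, there is no fixed slot in which another tool could be guaranteed a quiet interface.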

I am going to attempt to outline the differences I see in the methodologies. I absolutely admit that I am biased here, and I invite Roland, Jochen, Stephen or anyone else from DFN to provide the alternate view. I'm also going to say from the beginning that I don't think either methodology is incorrect, or generally better than the other. It is more a question of what pathologies you are attempting to see and what assumptions you make about how much centralized vs. local control you want.

AMI/owamp was tuned for trying to see IPDV issues. We are not as concerned with the actual 'real' delay between hosts. AMI/owamp does measure that, but the way we collect the data is more tuned toward looking for changes at as fine a time resolution as we can possibly manage without hitting false positives for IPDV. This is why we test continuously. (For example, routing flaps are often very short-lived events, especially if they are on lower layers, say, a SONET reroute. These are the kinds of events we are looking for.)

Additionally, we are looking for congestion (queuing events). In this, I believe Hades is probably more accurate (when it is actually testing). Since they send a 'train' of packets, they will get a more accurate view of the distribution caused by queuing. However, because they send those trains fairly infrequently, they are unlikely to see many of the queuing events that AMI/owamp will see (although AMI/owamp sees them with less precision).
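A back-of-the-envelope sketch of that trade-off, under simplifying assumptions: an event of fixed length starts at a uniformly random time, and "seeing" it just means at least one probe packet is sent while it lasts. The train length and train period below are hypothetical numbers picked for illustration; they are not Hades' actual parameters.

```python
import math


def p_detect_poisson(rate_pps, event_s):
    """Chance that a continuous Poisson stream sends >= 1 packet during the event."""
    return 1.0 - math.exp(-rate_pps * event_s)


def p_detect_trains(train_s, period_s, event_s):
    """Chance that an event overlaps a packet train sent once every period_s.

    The event overlaps a train when it starts anywhere in a window of
    train_s + event_s seconds out of each period.
    """
    return min(1.0, (train_s + event_s) / period_s)


event = 0.5  # hypothetical half-second reroute or queuing event
print("continuous 10 pps stream:", p_detect_poisson(10.0, event))
print("0.1 s train every 60 s:  ", p_detect_trains(0.1, 60.0, event))
```

The model says nothing about how well each method characterizes an event once it is caught, which is where the packet train has the advantage.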

For AMI/owamp to give you good data (IPDV as I indicate above) - what you really need is a stable clock. It does not actually need to be extremely precise. It needs to be reasonably precise because you want to bound the drift between systems. (NTP to 4 nearby peers is typically sufficient.)
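A small sketch of why stability matters more than absolute accuracy here: IPDV is the difference between consecutive one-way delays, so a constant clock offset between sender and receiver cancels out, while a drifting offset would not. The timestamps and the 4 ms offset are invented for illustration.

```python
def one_way_delays(send_ts, recv_ts):
    """Per-packet one-way delay: receive timestamp minus send timestamp."""
    return [r - s for s, r in zip(send_ts, recv_ts)]


def ipdv(delays):
    """Delay variation: difference between each delay and the previous one."""
    return [b - a for a, b in zip(delays, delays[1:])]


# Hypothetical data: four packets, the third one is queued for an extra 15 ms.
send = [0.0, 0.1, 0.2, 0.3]
true_delay = [0.020, 0.020, 0.035, 0.020]
offset = 0.004  # assumed constant receiver clock error, in seconds

recv_true = [s + d for s, d in zip(send, true_delay)]
recv_skewed = [r + offset for r in recv_true]

ipdv_true = ipdv(one_way_delays(send, recv_true))
ipdv_skewed = ipdv(one_way_delays(send, recv_skewed))

# The measured delays are all wrong by the offset, but the IPDV values agree.
print("IPDV with perfect clocks:", [round(v, 6) for v in ipdv_true])
print("IPDV with offset clocks: ", [round(v, 6) for v in ipdv_skewed])
```

If the offset changed between packets, the change would show up directly in the IPDV values, which is why the drift needs to be bounded.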

AMI/owamp has a bit more of a distributed model with regard to deciding what test peers to run with, and where to send the data. There is no global scheduling required, and multiple meshes can co-exist with each other. For example, each Tier-1 could additionally run an AMI/owamp mesh with several Tier-2 centers without interfering with the Tier-0/Tier-1 mesh. And, no coordination would need to take place.

I believe Hades is superior at seeing the actual real delay between points on the network. The boxes are very well tuned for this. Part of the way Hades does this is to tightly control the schedule of when packets will be sent/received at a particular host to ensure there is no contention. Additionally, Hades is almost certainly more precise with regard to IPDV during the packet-train.

The price for this precision is the granularity of the test between any two test peers. You won't see the events that happen in between tests. The other price for this precision is the need for scheduling at a global level. On the other hand, if you want to have a single operations entity looking at the data with NOC alarms and such, I suspect Hades is more mature in that area. (I don't know, but it sounds like it.)

As far as distributing the code, I believe Hades and AMI are in similar shape. Previously, I have used AMI to run measurements on Abilene and I have helped a few other domains install/run the code. I do not yet have a real distribution available. (Although, I have promised to do that in September...)
