
perfsonar-dev - Re: [pS-dev] Re: Topics for the next meetign in Berkeley

Subject: perfsonar development work


  • From: "Jeff W. Boote"
  • To: Jochen Reinwand
  • Cc: Nicolas Simar, Peter Holleczek, "Eric L. Boyd", Szymon Trocha, Loukik Kudarimoti, Matthias Hamm, Mark Yampolskiy, Klaus Ullmann
  • Subject: Re: [pS-dev] Re: Topics for the next meetign in Berkeley
  • Date: Tue, 14 Aug 2007 13:16:52 -0600

Jochen Reinwand wrote:
On Tuesday 14 August 2007 03:14, Jeff W. Boote wrote:
P.S. I'm going to go into some specific details here regarding the
Hades/AMI-OWAMP solutions for one-way delay. For those of you more
interested in the big-picture than the details, you might want to tune out
now. :)

Thanks for providing such a detailed view on the issue!

I likewise appreciate your response and additional details. I think this is the only way we can move forward here...

Hades utilises the POSIX real-time capabilities of the operating system, so on the one hand it is more sensitive to slightly degraded performance, but on the other hand it might preempt services that do not run with elevated priority.
But using two interfaces also brings in operating system related issues. Perhaps Hades measurements are affected by AMI measurements and/or vice versa, although they are running on different interfaces. We have no safe knowledge about this issue and it is most likely both hardware and operating system dependent.

I agree, there are definitely system dependent issues. We won't know for sure if both systems can coexist on a host until we try some of these things.

There is perhaps another issue, because owamp (like bwctl/iperf) can only use the default routes with the lowest metrics afaik. Because of this we are using bwctl on the default interface of the GEANT machines and Hades on the secondary interface. If bwctl and owamp do not provide a means to override the regular routing tables, measurements for those two systems will have to be carefully scheduled so that they do not overlap. Alternatively, there might have to be separate measurement boxes for BWCTL and OWAMP, while Hades could run concurrently with BWCTL or OWAMP measurements within the required accuracy of the LHC project (in the case of running in parallel with OWAMP maybe even better).

bwctl/iperf/owamp all allow you to bind to a specific address. In fact, they each specifically bind to the address that was requested - it is not an option. So... if the host routing configuration allows source-based routing to be defined (and Linux at least claims to) the default route option is not required.
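
As a rough illustration of what binding to a requested address means at the socket level, the sketch below binds a UDP socket to an explicit local address before sending. This is a generic example, not code taken from bwctl, iperf, or owamp; the helper name `send_from` and the loopback addresses are assumptions made for the demo. On a multi-homed host, the kernel would still need a source-based routing rule for the packet to leave through the matching interface.

```python
import socket


def send_from(local_addr: str, dest: tuple[str, int], payload: bytes) -> tuple[str, int]:
    """Send one UDP datagram with an explicitly chosen source address.

    Returns the (address, port) pair the sending socket was bound to.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Port 0 lets the OS pick a free port; the address is fixed by the caller.
        sock.bind((local_addr, 0))
        sock.sendto(payload, dest)
        return sock.getsockname()


if __name__ == "__main__":
    # Hypothetical loopback demo: the receiver checks the source address it sees.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        bound = send_from("127.0.0.1", receiver.getsockname(), b"probe")
        data, source = receiver.recvfrom(1024)
        print(source == bound, data)
```

The receiver sees the bound address as the packet's source, which is the property a source-based routing rule would key on.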

We have a software distribution available for the "client stuff" (software running on measurement points) for Fedora. But, as Jeff mentioned, for Hades we need this one(!) central server. Installing this server and/or creating some sort of distribution so that others are able to install it, is far more complex. Also the clients must be configured correctly so that they can be used. This includes configuring the interface(s), opening the firewall, enabling ssh access and so on. This is, of course, nothing a software package should do. But I believe Jeff has more or less the same problems building a software distribution for AMI ;-)

Yes - the client distribution is much easier than the data collection part. This part of AMI is already done as well. :)

Regarding the frequency of testing, this is a quite freely configurable parameter of Hades. In the LHC scenario with a strong hierarchy of tiers, it is possible to go away from the fully meshed paradigm and e.g. set up packet trains every second on each measurement path from tier 0 to tier 1. This will give a much higher frequency of measurements.

Hmm... Likewise, the specific schedule of packets is configurable for AMI. So, it would be possible to indicate that packet pairs or even trains of packets be sent at a given frequency. So, AMI could provide very close to the same data. I had not considered this before. (Thanks for the idea. :) )
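
To make the idea of sending packet trains at a given frequency concrete, here is a minimal sketch that generates send offsets for such a schedule. It does not reflect the actual configuration format or scheduler of either Hades or AMI; the function `train_schedule` and all of its parameters are hypothetical names chosen for illustration.

```python
def train_schedule(period_s: float, train_len: int, gap_s: float,
                   duration_s: float) -> list[float]:
    """Return send offsets (in seconds) for packet trains started every period_s.

    Each train holds train_len packets spaced gap_s apart.
    """
    if train_len * gap_s > period_s:
        raise ValueError("a train does not fit inside one period")
    times = []
    n_trains = int(duration_s // period_s)
    for k in range(n_trains):
        start = k * period_s
        for i in range(train_len):
            times.append(start + i * gap_s)
    return times


if __name__ == "__main__":
    # Assumed values: one 3-packet train per second, 10 ms apart, for 2 seconds.
    print(train_schedule(period_s=1.0, train_len=3, gap_s=0.01, duration_s=2.0))
```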

Regarding the utility factor of OWD and IPDV, it should be pointed out that IPDV is for the most part a function of OWD. Oversimplifying, we could consider IPDV to be the derivative of OWD. So having a highly precise OWD measurement eliminates many false positives which otherwise have to be eliminated by applying statistical methods. It is our strong belief that statistical methods such as percentiles would eliminate phenomena that would otherwise be visible, because they cannot be distinguished from, e.g., colliding measurement packets. Such colliding packets exhibit basically the same properties as queueing events and the resulting IPDV.

IPDV does not depend upon the absolute values of OWD, only on the relative stability of OWD. That was the only point I was trying to make. This is equally true for Hades and AMI/owamp. I was simply trying to address the GPS requirement - if you are more interested in IPDV, and not as concerned with the absolute value of OWD, you can get by with clocks that are not as precisely synchronized. They do, however, need to be stable.
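
The point about clock offsets can be checked with a small calculation. The sketch below treats IPDV as the difference between consecutive one-way delays and shows that a constant clock offset between sender and receiver cancels out. The delay values and the offset are made-up numbers, and `ipdv` is a name chosen for this example rather than a function from Hades or owamp.

```python
def ipdv(owd: list[float]) -> list[float]:
    """IPDV as the difference between consecutive one-way delay samples."""
    return [later - earlier for earlier, later in zip(owd, owd[1:])]


if __name__ == "__main__":
    true_owd_ms = [10.0, 10.5, 10.25, 12.0]  # made-up one-way delays
    clock_offset_ms = 4.0                    # assumed constant receiver clock error
    measured_owd_ms = [d + clock_offset_ms for d in true_owd_ms]

    # The absolute OWD values differ by the offset, the IPDV values do not.
    print(ipdv(true_owd_ms))
    print(ipdv(measured_owd_ms))
```

If the offset drifts between samples instead of staying constant, the drift no longer cancels, which is why the clocks still need to be stable.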

I think it would be very interesting to compare the Hades data with the AMI data across the same paths. AMI does rely on the statistical methods as you state. It was our view that since owamp is a user-space application it would require some statistical analysis to be useful no matter how tightly we controlled things (to deal simply with UNIX scheduling issues). Additionally, we didn't want to tightly control things because we were trying for wide deployments; we did not have an appliance as a goal.

Just as Hades can adjust the schedule to test more frequently, AMI can choose higher percentiles (and other statistical methods) to see more of the variability. If run on otherwise quiescent hosts, these statistical methods could be much more tightly configured than we normally use. For AMI, it really comes down to choosing a statistical method that can filter out host phenomena and leave in network phenomena. I agree that our current statistical filters could be removing some of the more subtle network phenomena.
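
As a hedged illustration of how the choice of percentile trades off host noise against visible variability, the sketch below applies a simple nearest-rank percentile to a window of delay samples. This is not AMI's actual filter; the `percentile` helper, the sample values, and the interpretation of the single outlier as a host scheduling spike are all assumptions for the example.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile of samples, for 0 < p <= 100."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up delays in ms; 25.0 stands in for a host scheduling spike.
    window = [10.1, 10.0, 10.2, 10.1, 25.0, 10.0, 10.3, 10.1, 10.2, 10.0]
    for p in (50, 90, 100):
        print(p, percentile(window, p))
```

A lower percentile hides the spike entirely, while a higher one lets more of the variability through, whether it comes from the host or the network.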

Hades filters out host phenomena by tightly controlling the schedule. AMI filters it out using statistical methods. We have each chosen our blind spots. :)

I strongly suspect the only way we could really determine which filters more appropriately is to run them side-by-side. And we could probably both improve our methodologies by seeing that comparison.

jeff


