
perfsonar-user - Re: [perfsonar-user] meshconfig bandwidth tests default interval

Subject: perfSONAR User Q&A and Other Discussion

  • From: Andreas Haupt
  • To: Shawn McKee
  • Cc: Marian Babik, perfsonar-user
  • Subject: Re: [perfsonar-user] meshconfig bandwidth tests default interval
  • Date: Mon, 07 Mar 2016 14:30:46 +0100
  • Organization: DESY

Hi Shawn,

(just added perfsonar-user list again)

On Mon, 2016-03-07 at 07:09 -0500, Shawn McKee wrote:
> Hi Andreas,
>
> I think this is all in regards to the OSG/WLCG perfSONAR bandwidth mesh? I
> assume your nodes are using the OSG auto-mesh URL as described in
> https://twiki.opensciencegrid.org/bin/view/Documentation/InstallUpdatePS ?

Exactly!

> We have a problem in that we cannot run a full-mesh of bandwidth tests
> between all WLCG sites on a short time-scale. We set 28 days as the
> testing interval to provide some measurements while we determine the best
> way forward. We understand this doesn't provide very frequent results and
> in fact has highlighted a shortcoming of the perfSONAR scheduled testing.
> The problem is that the default maintenance cron job on the toolkit
> restarts all services every 24 hours. This "resets" the testing plans and
> any plans with a testing interval greater than 24 hours break.

Indeed, with the current setup the bandwidth node does not produce any
useful results (at least for us as site admins ...). Plots are
completely broken here.

> We are running full-mesh latency testing, which can provide more insight
> into the quality of the various network paths. We are also trying to use
> the loss data to infer the maximum achievable bandwidth for each path.
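
For illustration only: the thread does not say which method is used to infer bandwidth from loss data. One widely cited approximation is the Mathis et al. model for steady-state TCP throughput, rate ≈ (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2). The sketch below assumes that model; the function name and the example values (MSS, RTT, loss rate) are hypothetical and not taken from any perfSONAR API or measurement.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_seconds: float, loss_rate: float) -> float:
    """Upper-bound estimate of TCP throughput in bits per second.

    Assumes the Mathis et al. approximation; this is NOT necessarily the
    method used by the WLCG/OSG team mentioned in the thread.
    """
    if mss_bytes <= 0 or rtt_seconds <= 0:
        raise ValueError("mss_bytes and rtt_seconds must be positive")
    if not 0.0 < loss_rate <= 1.0:
        # The model is undefined for zero loss (it would predict infinite rate).
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3.0 / 2.0)  # constant from the Mathis model
    return (mss_bytes * 8.0 / rtt_seconds) * c / math.sqrt(loss_rate)


if __name__ == "__main__":
    # Hypothetical path: 1460-byte MSS, 100 ms RTT, 0.01% packet loss.
    estimate = mathis_throughput_bps(1460, 0.100, 1e-4)
    print(f"estimated ceiling: {estimate / 1e6:.1f} Mbit/s")
```

Under this model the estimate scales with 1/sqrt(p), so quadrupling the loss rate halves the predicted ceiling. It is only a rough bound for a single loss-limited TCP stream.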

Do I understand correctly that the current WLCG mesh is actually too
big? So a site's external network would be largely saturated by these
tests if a default interval of, e.g., 6 hours were used?
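
A rough back-of-the-envelope sketch of that question, under stated assumptions: every host tests against every other host in both directions, a host runs only one bandwidth test at a time, and each test takes a fixed duration. The site count (100) and test duration (30 s) below are hypothetical placeholders, not the real WLCG mesh parameters.

```python
def busy_fraction(num_sites: int, test_seconds: float, interval_seconds: float) -> float:
    """Fraction of each interval a single host spends in bandwidth tests.

    Assumes a full mesh where each host takes part in 2 * (N - 1) tests per
    interval (once as sender and once as receiver for every other host).
    """
    if num_sites < 2:
        raise ValueError("a mesh needs at least two sites")
    if test_seconds <= 0 or interval_seconds <= 0:
        raise ValueError("durations must be positive")
    tests_per_host = 2 * (num_sites - 1)
    return tests_per_host * test_seconds / interval_seconds


if __name__ == "__main__":
    six_hours = 6 * 3600
    twenty_eight_days = 28 * 24 * 3600
    # Hypothetical mesh: 100 sites, 30-second tests.
    for label, interval in (("6 hours", six_hours), ("28 days", twenty_eight_days)):
        share = busy_fraction(100, 30, interval)
        print(f"interval {label}: host busy {share:.2%} of the time")
```

With these placeholder numbers, a 6-hour interval would keep each host's link under test for 27.5% of the time, while a 28-day interval keeps it well under 1%. The real figures depend on the actual mesh size and test duration.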

> If I misunderstand your use-case or problem please let me know.

No, you got it completely right! :-)

> We may
> need to go back to smaller targeted (regional?) meshes to provide a
> reasonable level of bandwidth results.

Would it be possible to mention such deficits/issues on the WLCG PS
deployment website? I just assumed another misconfiguration on our side
and wasn't aware of that "design flaw" and its workaround.

Unfortunately, the perfSONAR service has become quite maintenance-intensive
over the last few months. Additionally, it's not so easy to draw the line
between problems with the "vanilla perfSONAR product" and those that
arise from the "WLCG configuration".

Cheers,
Andreas
--
| Andreas Haupt | E-Mail:
| DESY Zeuthen | WWW: http://www-zeuthen.desy.de/~ahaupt
| Platanenallee 6 | Phone: +49/33762/7-7359
| D-15738 Zeuthen | Fax: +49/33762/7-7216




