
perfsonar-user - Re: [perfsonar-user] meshconfig bandwidth tests default interval

Subject: perfSONAR User Q&A and Other Discussion

  • From: Shawn McKee <>
  • To: Andreas Haupt <>
  • Cc: Marian Babik <>, perfsonar-user <>
  • Subject: Re: [perfsonar-user] meshconfig bandwidth tests default interval
  • Date: Mon, 7 Mar 2016 11:16:40 -0500

Hi Andreas,

Yes, it is a good suggestion to add some text to the deployment page about our current configuration.  I have updated the page at https://twiki.opensciencegrid.org/bin/view/Documentation/InstallUpdatePS#NoteOnBandwidthMesh

Let me know if you have suggested edits there.

I would also note that this information is presented and discussed at our WLCG Network and Transfer Metrics WG meetings. The change to the bandwidth meshes was discussed at a couple of different meetings and formally included in the Sep 30, 2015 meeting ( see https://indico.cern.ch/event/400643/attachments/1162923/1675228/metrics_wg_30th_Sept.pdf ). See https://twiki.cern.ch/twiki/bin/view/LCG/NetworkTransferMetrics for details on the group. We discuss all the policy issues, plans and changes during those meetings, and they are open to anyone who wants to attend. (I know attending yet another meeting is what everyone wants to do!)

I would like to get more details on your last paragraph:

"Unfortunately the PerfSonar service became quite maintenance intensive
over the last months. Additionally, it's not so easy to find the border
between "vanilla PerfSonar product"-related problems and those that
arise with the "WLCG configuration". "

One of our goals is to minimize the amount of maintenance the deployed WLCG/OSG toolkits require.   Can you let us know in what ways things have changed regarding maintenance at your site?   I would like to make sure we work on addressing problems so that other sites don't have to run into the same problems.

Thanks for your feedback.

Shawn



On Mon, Mar 7, 2016 at 8:30 AM, Andreas Haupt <> wrote:
Hi Shawn,

(just added perfsonar-user list again)

On Mon, 2016-03-07 at 07:09 -0500, Shawn McKee wrote:
> Hi Andreas,
>
> I think this is all in regards to the OSG/WLCG perfSONAR bandwidth mesh?  I
> assume your nodes are using the OSG auto-mesh URL as described in
> https://twiki.opensciencegrid.org/bin/view/Documentation/InstallUpdatePS ?

Exactly!

> We have a problem in that we cannot run a full-mesh of bandwidth tests
> between all WLCG sites on a short time-scale.   We set 28 days as the
> testing interval to provide some measurements while we determine the best
> way forward.  We understand this doesn't provide very frequent results and
> in fact has highlighted a shortcoming of the perfSONAR scheduled testing.
>  The problem is that the default maintenance cron job on the toolkit
> restarts all services every 24 hours.  This "resets" the testing plans and
> any plans with a testing interval greater than 24 hours break.

Indeed, with the current setup the bandwidth node does not produce any
useful results (at least for us as site admins ...). Plots are
completely broken here.

> We are running a full-mesh latency testing which can provide more insight
> into the quality of the various network paths.  We are also trying to use
> the loss-data to infer the maximum achievable bandwidth for each path.

Do I understand correctly that the current WLCG mesh is actually too
big? So, a site's external network would be mostly saturated by these
tests if a default interval of e.g. 6 hours were used?
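
The scaling behind this question can be sketched with some rough arithmetic. The snippet below is only an illustration under stated assumptions, not the WLCG configuration: the host count (100) and test duration (30 seconds) are hypothetical placeholders, each direction of a host pair is assumed to be tested separately, and a host is assumed to take part in only one bandwidth test at a time. Only the 6-hour and 28-day intervals come from the thread.

```python
def full_mesh_load(num_hosts: int, test_seconds: float, interval_hours: float) -> dict:
    """Estimate the bandwidth-test load of a full mesh.

    Assumptions (not taken from the thread): every ordered pair of hosts
    is tested once per interval, and a host runs one test at a time.
    """
    # One test per ordered (source, destination) pair.
    total_tests = num_hosts * (num_hosts - 1)
    # Each host is the sender in (N - 1) tests and the receiver in (N - 1) tests.
    tests_per_host = 2 * (num_hosts - 1)
    # Time per interval that a single host spends in bandwidth tests.
    busy_seconds = tests_per_host * test_seconds
    busy_fraction = busy_seconds / (interval_hours * 3600)
    return {
        "total_tests": total_tests,
        "tests_per_host": tests_per_host,
        "busy_fraction": busy_fraction,
    }


if __name__ == "__main__":
    # Hypothetical mesh size and test length; intervals as mentioned in the thread.
    for label, hours in (("6 hours", 6), ("28 days", 28 * 24)):
        load = full_mesh_load(num_hosts=100, test_seconds=30, interval_hours=hours)
        print(f"{label}: {load['total_tests']} tests in the mesh, "
              f"each host busy {load['busy_fraction']:.1%} of the time")
```

The per-host test count grows linearly with the number of hosts and the mesh-wide count grows quadratically, so a short interval that is fine for a small regional mesh can occupy a large share of each host's time in a big one.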

> If I misunderstand your use-case or problem please let me know.

No, you got it completely right! :-)

>     We may
> need to go back to smaller targeted (regional?) meshes to provide a
> reasonable level of bandwidth results.

Would it be possible to mention such deficits/issues on the WLCG PS
deployment website? I just assumed another misconfiguration on our side
and wasn't aware of that "design flaw" and its workaround.

Unfortunately the PerfSonar service became quite maintenance intensive
over the last months. Additionally, it's not so easy to find the border
between "vanilla PerfSonar product"-related problems and those that
arise with the "WLCG configuration".

Cheers,
Andreas
--
| Andreas Haupt             | E-Mail:
|  DESY Zeuthen             | WWW:    http://www-zeuthen.desy.de/~ahaupt
|  Platanenallee 6          | Phone:  +49/33762/7-7359
|  D-15738 Zeuthen          | Fax:    +49/33762/7-7216





