perfsonar-user - RE: [perfsonar-user] owamp/traceroute results on ps-PS 3.4.1 ?

Subject: perfSONAR User Q&A and Other Discussion

  • From: "Cottrell, Les" <>
  • To: Alan Whinery <>, "" <>
  • Subject: RE: [perfsonar-user] owamp/traceroute results on ps-PS 3.4.1 ?
  • Date: Fri, 9 Jan 2015 00:35:14 +0000
  • Accept-language: en-US

We have PingER data going back to 2005, which I have mined for DUPs. The input data has one line per set of pings made from SLAC to a remote host, and each line indicates whether there were DUPs.

Each line covers one remote host monitored from SLAC. Every 30 minutes a set of 100-byte pings is sent to that host, aiming for up to 10 successful pings with a cutoff at 30 tries.
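
As an illustration of how a per-set DUP flag could be derived, here is a minimal Python sketch. It is not the PingER code, which is not shown in this thread; it only assumes Linux ping output in the form quoted later in this thread, where duplicate replies end in "(DUP!)" and the summary line reports "+N duplicates". The function name `count_dups` is hypothetical.

```python
import re

# Summary line form seen in the quoted ping run: "... 5 received, +10 duplicates, ..."
SUMMARY_RE = re.compile(r"\+(\d+) duplicates")

def count_dups(ping_output):
    """Return the number of duplicate replies in the text of one ping run."""
    m = SUMMARY_RE.search(ping_output)
    if m:
        return int(m.group(1))
    # No summary line (for example, truncated output): count per-reply markers.
    return sum(1 for line in ping_output.splitlines()
               if line.rstrip().endswith("(DUP!)"))

run = """64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=1 ttl=117 time=9.55 ms
64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=1 ttl=117 time=9.58 ms (DUP!)
64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=1 ttl=117 time=9.58 ms (DUP!)
"""
had_dups = count_dups(run) > 0  # the yes/no flag recorded per set of pings
```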

 

The monitoring host (pinger.slac.stanford.edu) is not using multicast, and its interfaces are not bonded. It is running Linux:

Linux pinger 2.6.32-279.19.1.el6.i686 #1 SMP Sat Nov 24 14:42:18 EST 2012 i686 i686 i386 GNU/Linux

The current number of countries with remote hosts being monitored is  171.

 

The major contributor since 2006 has been www.cern.ch. Each year we see about 60 different hosts responding with DUPs.

 

A list of the hosts responding with DUP pings in 2014 is below, followed by a summary of the annual DUP pings.

 

There is some rough documentation at https://confluence.slac.stanford.edu/display/IEPM/Duplicate+packets

 

I do not claim to know the cause and would be delighted to be educated.   Maybe somebody at CERN can cast light on their web host.

 

2014: cbinet.bi seen duping 35 times

2014: internet.fo seen duping 6 times

2014: ms01.linea.gov.br seen duping 1 times

2014: ns.conacyt.gob.sv seen duping 3 times

2014: ping.cern.ch seen duping 450 times

2014: pinger.daffodilvarsity.edu.bd seen duping 2 times

2014: pinger.ictp.it seen duping 1 times

2014: pinger.numl.edu.pk seen duping 96 times

2014: pingerqta.pern.edu.pk seen duping 1 times

2014: www.acgrc.am seen duping 2 times

2014: www.afribone.net.gn seen duping 2 times

2014: www.afrinet.cd seen duping 8 times

2014: www.aic.ac.nz seen duping 1 times

2014: www.alard.ps seen duping 1 times

2014: www.bonesha.bi seen duping 73 times

2014: www.boz.zm seen duping 4 times

2014: www.cern.ch seen duping 14026 times

2014: www.cnrst.bf seen duping 43 times

2014: www.cyfronet.krakow.pl seen duping 1 times

2014: www.drtvnet.cg seen duping 12 times

2014: www.eritel.com.er seen duping 373 times

2014: www.gov.bw seen duping 635 times

2014: www.granma.cu seen duping 9 times

2014: www.hraparak.am seen duping 1 times

2014: www.ihep.su seen duping 1 times

2014: www.kcn.unima.mw seen duping 118 times

2014: www.lanl.gov seen duping 1 times

2014: www.lonab.bf seen duping 394 times

2014: www.lsx.com.la seen duping 1 times

2014: www.minzdrav.uz seen duping 12 times

2014: www.ml.refer.org seen duping 1 times

2014: www.nomad.mu seen duping 1 times

2014: www.rmutsv.ac.th seen duping 4 times

2014: www.rub.edu.bt seen duping 1 times

2014: www.stmaryuniversitycollege.edu.et seen duping 1 times

2014: www.uniswafoundation.org.sz seen duping 1 times

2014: www.univ-koudougou.bf seen duping 632 times

2014: www.univ-ouaga.bf seen duping 7 times

2014: www.uns.ac.id seen duping 2 times

2014: www.vnu.edu.vn seen duping 26 times

2014: www.vnuhcm.edu.vn seen duping 1 times
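
A listing in this format is easy to tally by script. The following Python sketch, which assumes only the "YEAR: host seen duping N times" line format shown above, totals the counts per year and ranks the hosts. The names `LINE_RE` and `tally_dups` are illustrative, not part of PingER.

```python
import re
from collections import Counter

# Matches lines such as "2014: www.cern.ch seen duping 14026 times"
LINE_RE = re.compile(r"^(\d{4}): (\S+) seen duping (\d+) times$")

def tally_dups(lines):
    """Return {year: Counter({host: dup_count})} for lines in the listing format."""
    per_year = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unrelated lines
        year, host, count = int(m.group(1)), m.group(2), int(m.group(3))
        per_year.setdefault(year, Counter())[host] += count
    return per_year

sample = [
    "2014: ping.cern.ch seen duping 450 times",
    "",
    "2014: www.cern.ch seen duping 14026 times",
    "2014: www.gov.bw seen duping 635 times",
]
tally = tally_dups(sample)
for host, count in tally[2014].most_common():
    print(f"{host:20s} {count}")
```

Run over the full 2014 listing, the per-year sum and the number of distinct hosts can be compared against the DUPs and Hosts DUPing columns of the annual summary.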

 

Year  DUPs   Hosts DUPing  Hosts monitored  Samples   %        CERN   Diff   % CERN
2005  93     27            481              11408092  0.0008%  0      93     0.00%
2006  9228   40            514              13715929  0.0673%  5751   3477   62.32%
2007  35673  42            541              16315320  0.2186%  34721  952    97.33%
2008  39262  57            592              19680482  0.1995%  34249  5013   87.23%
2009  42356  52            663              17889767  0.2368%  27469  14887  64.85%
2010  74638  51            623              19862304  0.3758%  19693  54945  26.38%
2011  30769  79            659              22889278  0.1344%  22518  8251   73.18%
2012  85217  50            797              23786399  0.3583%  34402  50815  40.37%
2013  74128  76            774              25475771  0.2910%  34916  39212  47.10%
2014  16990  41            836              29933696  0.0568%  14026  2964   82.55%
2015  164    4             -                514773    0.0319%  104    60     63.41%
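
The derived columns in the summary appear to follow from the raw counts: % is DUPs divided by Samples, Diff is DUPs minus CERN, and % CERN is CERN divided by DUPs. That reading is inferred from the numbers rather than stated in the message, so the Python sketch below is an assumption-labelled check using the 2014 row. The function name `derived_columns` is hypothetical.

```python
def derived_columns(dups, samples, cern):
    """Recompute the derived summary columns from the raw counts.

    Assumed relationships (inferred from the table, not stated in the source):
      %      = 100 * DUPs / Samples
      Diff   = DUPs - CERN
      % CERN = 100 * CERN / DUPs
    """
    pct = 100.0 * dups / samples
    diff = dups - cern
    pct_cern = 100.0 * cern / dups if dups else 0.0
    return pct, diff, pct_cern

# 2014 row: DUPs = 16990, Samples = 29933696, CERN = 14026
pct, diff, pct_cern = derived_columns(dups=16990, samples=29933696, cern=14026)
print(f"{pct:.4f}%  {diff}  {pct_cern:.2f}%")  # prints "0.0568%  2964  82.55%"
```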


-----Original Message-----
From: [mailto:] On Behalf Of Alan Whinery
Sent: Wednesday, January 07, 2015 9:57 AM
To:
Subject: Re: [perfsonar-user] owamp/traceroute results on ps-PS 3.4.1 ?

 

I would say that there are fewer dups in the world than there used to be. I once found the root cause as a bug in a Proteon FDDI driver, which was causing its router to both forward and issue a spurious ICMP redirect.  Mayhem ensued.

 

Maybe Les's data has the reach to comment on my "fewer dups than there used to be" assertion. Of course, maybe I just don't ping as much as I once did. Duplicates have always been a bit of a "mystic lore" sort of topic, probably because there aren't enough of them to become a priority, or the bugs get found and fixed without any fanfare.

 

On 1/7/2015 7:38 AM, Cottrell, Les wrote:

> This has been going on for web.cern.ch for years. It does not happen for other CERN nodes such as pinger.cern.ch. The earliest reference I can find is 12/30/2006. It was reported to CERN network folks at the time. I have used it as an example in some courses I have given. Several possibilities have been proposed and some shot down; however, I never heard a definite reason given.

> 

> -----Original Message-----

> From:

> [] On Behalf Of Eli Dart

> Sent: Wednesday, January 07, 2015 9:18 AM

> To: SCHAER Frederic

> Cc:

> Subject: Re: [perfsonar-user] owamp/traceroute results on ps-PS 3.4.1 ?

> 

> Hi Frederic,

> 

> I don't know if this is expected behavior or not - I'm not a load balancer expert.  It is possible that I'm wrong about this too - we'd have to ask someone at CERN to be sure.

> 

> Still, I would say that this behavior is less than optimal.  It may be something that is easy to fix, or it may be a known consequence of engineering decisions made to achieve the best aggregate outcome.  Hard to say for sure from my current perspective.

> 

> Eli

> 

> 

> On Wed, Jan 7, 2015 at 9:01 AM, SCHAER Frederic <> wrote:

> 

> 

>             Hi Eli,

> 

>              

> 

>             Oh… if you’re right then I’m also wondering whether that behaviour is to be expected or not…

> 

>              

> 

>             From: Eli Dart []

>             Sent: Wednesday, 7 January 2015 17:57

>             To: SCHAER Frederic

>             Cc:

>             Subject: Re: [perfsonar-user] owamp/traceroute results on ps-PS 3.4.1 ?

> 

>              

> 

>             Hi Frederic,

> 

>              

> 

>             I see the duplicate ping replies from CERN as well (using two different hosts on two different networks).

> 

>              

> 

>             Given that www.cern.ch is an alias for webrlb02.cern.ch, I'm guessing that CERN is using a load balancer appliance (I expect that the lb02 in the hostname means Load Balancer Number Two).  I've seen this behavior in some load balancers before.

> 

>              

> 

>             This doesn't answer your perfSONAR question, of course...

> 

>              

> 

>             Eli

> 

>              

> 

>              

> 

>             On Wed, Jan 7, 2015 at 8:49 AM, SCHAER Frederic <> wrote:

> 

>                             Hi,

> 

>                              

> 

>                             I found an issue, and I’d like to see whether the perfSONAR owamp tools have detected it and when it started. But I can’t find any result data, even though we have regional/inter-site meshes set up.

> 

>                             The tool named traceroute graphs/“psTracerouteViewer v2” attempts to query the perfsonar host on http://localhost/esmond/perfsonar/archive/ and finds nothing.

> 

>                             Attempting to use the full hostname instead of localhost does not help, but going to http://full.hostname/esmond/perfsonar/archive/?format=json displays stuff…

> 

>                              

> 

>                             Any idea how I could see owamp/traceroute results?

> 

>                              

> 

>                             For information, the issue I stumbled on is that I see duplicate ping replies when pinging CERN:

> 

>                              

> 

>                             [fschaer@node02 irfu]$ ping www.cern.ch
>                             PING webrlb02.cern.ch (188.184.9.235) 56(84) bytes of data.
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=1 ttl=117 time=9.55 ms
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=1 ttl=117 time=9.58 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=1 ttl=117 time=9.58 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=2 ttl=117 time=8.91 ms
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=2 ttl=117 time=8.93 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=2 ttl=117 time=8.94 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=3 ttl=117 time=9.01 ms
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=3 ttl=117 time=9.03 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=3 ttl=117 time=9.04 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=4 ttl=117 time=8.89 ms
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=4 ttl=117 time=8.90 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=4 ttl=117 time=8.90 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=5 ttl=117 time=9.00 ms
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=5 ttl=117 time=9.04 ms (DUP!)
>                             64 bytes from webrlb02.cern.ch (188.184.9.235): icmp_seq=5 ttl=117 time=9.04 ms (DUP!)
>                             ^C
>                             --- webrlb02.cern.ch ping statistics ---
>                             5 packets transmitted, 5 received, +10 duplicates, 0% packet loss, time 4420ms

> 

>                              

> 

>                             Note: I don’t think CERN is part of the tests/meshes, but the question remains: maybe CERN isn’t the only route affected by this issue on our side.

> 

>                              

> 

>                             Thanks

> 

>            

>            

>            

> 

>              

> 

>             --

> 

>             Eli Dart, Network Engineer                          NOC: (510) 486-7600

>             ESnet Office of the CTO (AS293)                          (800) 333-7638

>             Lawrence Berkeley National Laboratory

>             PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3

> 

> 

> 

> 

 

 



