perfsonar-user - Re: [perfsonar-user] Throughput suddenly unidirectional

Subject: perfSONAR User Q&A and Other Discussion

  • From: Jason Zurawski
  • To: Daniel Schmidt
  • Subject: Re: [perfsonar-user] Throughput suddenly unidirectional
  • Date: Tue, 2 Dec 2014 12:36:58 -0500

Hey Dan;

Looking through the logs, the only suspicious thing I see is lines like
this:

> Dec 2 10:05:01 localhost bwctld[12565]: FILE=endpoint.c, LINE=1314,
> PeerAgent: Peer cancelled test before expected
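
If the log is long, a small script can show how often and when these
cancellations occur. The following is only a rough Python sketch: it assumes
each entry sits on a single line in the syslog-style layout quoted above (the
quote is wrapped by the mail client), and the name `cancelled_per_hour` and
the sample list are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the quoted entry:
#   "Dec 2 10:05:01 localhost bwctld[12565]: FILE=..., LINE=..., <message>"
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"\S+\s+bwctld\[\d+\]:\s+(?P<msg>.*)$"
)


def cancelled_per_hour(lines):
    """Count 'Peer cancelled test' messages per (month, day, hour)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "Peer cancelled test" in m.group("msg"):
            key = (m.group("month"), int(m.group("day")), int(m.group("hour")))
            counts[key] += 1
    return counts


# Hypothetical input: the quoted entry joined back onto one line.
sample = [
    "Dec 2 10:05:01 localhost bwctld[12565]: FILE=endpoint.c, LINE=1314, "
    "PeerAgent: Peer cancelled test before expected",
]

for (month, day, hour), n in sorted(cancelled_per_hour(sample).items()):
    print(month, day, f"{hour:02d}:00", n)
```

To run it on a real log, pass an open file object in place of `sample`.
Comparing the busy hours against the gaps in the throughput graph may show
whether the cancellations line up with the outages.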

Unfortunately that tells us the ‘what’ but not the ‘how’. Could you also
send the logs from the other host you are using? That host may have more
details about what is going on. A couple of other things that came to mind:

>> * No firewall between A & B

iptables may be enabled on both hosts; a quick and dirty test would be to
disable it temporarily and see if that helps.

>> * I'm not familiar with "slots." There are few throughput tests running
>> though. (Tests running 33% of time)

Ok, this won’t be the issue I was thinking of.

>> * I assumed packet loss was an issue. So, I set up smokeping on both
>> sides, 5 every 30 seconds, 1472 MTU. However, I'm not getting loss.

Do you have OWAMP going between the two hosts? If you don’t, I would suggest
setting up that test too. OWAMP uses UDP packets which may give a different
clue than the ICMP that smokeping would use.

>> * PsPerformance comes with ntp on - appears to be running, they have the
>> same time & these machines are not behind any firewalls.

Could you send the output of ‘ntpq -p -c rv’ for both?

>> * I am not seeing the issue on command line bwctl. Strange.

Could you try the reverse direction as well, i.e. swap the hosts for the -c
and -s flags? Also try using ‘iperf’ and ‘nuttcp’ as the tool instead of
‘iperf3’.
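
To avoid missing a combination, the full set of by-hand tests can be generated
rather than typed. This is a minimal Python sketch that only prints the
command lines: the flags are copied from the bwctl command quoted further down
in this thread, HOST1 and HOST2 are placeholders, and the function name
`bwctl_commands` is hypothetical. Whether every flag behaves the same with
each tool has not been checked here.

```python
from itertools import product


def bwctl_commands(host_a, host_b, tools=("iperf3", "iperf", "nuttcp")):
    """Return one bwctl argument list per (direction, tool) combination."""
    # Both directions: the hosts given to -c and -s are swapped in the second pair.
    directions = [(host_a, host_b), (host_b, host_a)]
    commands = []
    for (c_host, s_host), tool in product(directions, tools):
        commands.append([
            "bwctl", "-f", "m", "-x", "-T", tool,
            "-t", "30", "-i", "1",
            "-c", c_host, "-s", s_host,
        ])
    return commands


# Print the six command lines so they can be reviewed and run one at a time.
for cmd in bwctl_commands("HOST1", "HOST2"):
    print(" ".join(cmd))
```

Running them one at a time and noting which direction and tool fail should
narrow down whether the problem follows a host, a direction, or iperf3 itself.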

Thanks;

-jason

On Dec 2, 2014, at 12:16 PM, Daniel Schmidt wrote:

> Thank you kindly for your reply. Some short responses:
>
> * No firewall between A & B
> * I'm not familiar with "slots." There are few throughput tests running
> though. (Tests running 33% of time)
> * I assumed packet loss was an issue. So, I set up smokeping on both sides,
> 5 every 30 seconds, 1472 MTU. However, I'm not getting loss.
> * PsPerformance comes with ntp on - appears to be running, they have the
> same time & these machines are not behind any firewalls.
> * I am not seeing the issue on command line bwctl. Strange.
> * Cacti minute graphs don't show any strange usage on the ICX switch.
>
> I would suspect hardware, but the boxes ran a solid line for hours on my
> bench test. Please forgive me, but I'm reluctant to give the IPs as I
> haven't really figured out how I would prevent hackers from using these
> machines to DoS me. (Does anybody have to mitigate this issue? Sorry -
> off topic question) However, I'd be happy to privately give you root on
> the box.
>
> I have attached a png of what I see. You can see the lines vary greatly,
> and around 9:30 the throughput suddenly decided to start working
> again. I have also attached the log, substituting 1.1.1.1 for the local
> address and 2.2.2.2 for the remote.
>
> Many thanks,
> -Dan
>
> On Mon, Dec 1, 2014 at 4:24 PM, Jason Zurawski wrote:
> Hey Daniel;
>
> Would you be able to provide a link to your node, or send along a
> screenshot, to give us a better idea of what you are seeing?
>
> Off the top of my head, here are a couple of typical reasons that tests
> could fail:
>
> - Firewalls in the path denying access to ports, or not enough
> ports available for the number of tests that are running
>
> - Lack of testing ‘slots’ available on one side or the other
>
> - NTP synchronization issues
>
> - Packet loss that prevents the test from starting or finishing.
>
> If you send along your /var/log/perfsonar/owamp_bwctl.log file, we can have
> a look to see what may be menacing your node. The other thing you can try
> is some by-hand tests, something like:
>
> bwctl -f m -x -T iperf3 -t 30 -i 1 -c HOST1 -s HOST2
>
> Thanks;
>
> -jason
>
> On Dec 1, 2014, at 5:43 PM, Daniel Schmidt wrote:
>
> > I've noticed strange behavior on our throughput tests at one site.
> > Sometimes the graphs turn unidirectional, i.e. one way stops working.
> > Sometimes both ways stop working. The times are random. The site is
> > verified up by ping and passes traffic, but the throughput graphs
> > vary greatly. (We believe this is due to issues with this circuit.)
> >
> > I've only seen it do this in this one case. It's almost like it gets
> > angry that the speed varies vastly and gives up.
> >
> > Has anybody else encountered this? Any ideas greatly appreciated.
