
perfsonar-user - Re: [perfsonar-user] iperf not usable from perfsonar box, works from CentOS non-PS box

  • From: Trey Dockendorf <>
  • To: Joe Breen <>
  • Cc: Shawn McKee <>, Eli Dart <>, Aaron Brown <>, Brian Tierney <>, perfsonar-user <>
  • Subject: Re: [perfsonar-user] iperf not usable from perfsonar box, works from CentOS non-PS box
  • Date: Tue, 27 Jan 2015 14:37:40 -0600

The switch's ports were all set to MTU 9216 except the two connected to the perfsonar boxes (figures, right?).  I've updated the MTU on those two ports and now ping works for large packet sizes.  I have a feeling most of the recent issues I've had with this host can be attributed to that oversight.  I can now run iperf tests between the two boxes and am seeing ~9.6 Gbps.
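For reference, the per-port change involved would be along these lines, as a sketch assuming Dell/Force10 FTOS-style syntax and a hypothetical port number (the actual switch configuration is not shown in this thread):

interface TenGigabitEthernet 0/10
 mtu 9216       ! match the 9216-byte MTU already set on the other ports
!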

Thanks for all the support and suggestions!

- Trey

=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Phone: (979)458-2396 
Email:  
Jabber:

On Tue, Jan 27, 2015 at 2:28 PM, Joe Breen <> wrote:
Trey,

Though both of your systems are at a 9000-byte MTU, you might still double-check the MTU along the full path with the ping command that Azher suggests (if you have not already).  I have seen MPLS headers and other tunnel mechanisms in the middle of paths cause issues.  Sometimes the switches that join VLANs together do not all have jumbo frames configured at the same size, so even though both ends have the correct MTU, the full path does not.  Not all intermediate devices deal with this gracefully.
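For reference, on Linux a do-not-fragment ping sized for a 9000-byte MTU looks roughly like the following (8972 bytes of ICMP payload plus 28 bytes of IP/ICMP headers; the exact command Azher suggested is not quoted in this thread, and the target host here is just an example):

# ping -M do -s 8972 -c 4 psonar-owamp.brazos.tamu.edu   # -M do forbids fragmentation; 8972 = 9000 - 28

If any device along the path cannot pass 9000-byte frames, these pings fail while a default-sized ping still works.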

  --Joe

On Tue, Jan 27, 2015 at 12:35 PM, Trey Dockendorf <> wrote:
Both ends are at MTU 9000:

[root@psonar-bwctl ~]# ip link show p1p1
6: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether 90:e2:ba:2e:eb:50 brd ff:ff:ff:ff:ff:ff

[root@psonar-owamp ~]# ip link show p1p1
6: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether 90:e2:ba:2e:ea:04 brd ff:ff:ff:ff:ff:ff

As a test this afternoon I will be moving both of these systems off my local Force10 switch (which uplinks to the Science DMZ) and connecting them directly to our Science DMZ equipment.  This will at least let our local networking experts better assist in debugging the problem and rule out a misconfiguration on my local Force10 switch.

Thanks,
- Trey

=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Phone: (979)458-2396 
Email:  
Jabber:

On Tue, Jan 27, 2015 at 1:29 PM, Shawn McKee <> wrote:
Is there an MTU mismatch between the hosts?   

Sender at 9000 and receiver at 1500, and PMTU discovery fails? The initial negotiation will use packets smaller than 1500 bytes, but the data packets would be larger than 1500, and if fragmentation is not allowed those packets are dropped.

Just a thought since you said 'The transfer started then scp reports "stalled".'
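One quick way to see where along the path the MTU drops is something like the following (note: a layer-2 switch that silently drops jumbo frames will not necessarily show up here, so the do-not-fragment ping is the more direct test; the target host is just an example):

$ tracepath -n psonar-bwctl.brazos.tamu.edu   # -n skips DNS lookups; watch the reported pmtu values per hop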

Shawn

On Tue, Jan 27, 2015 at 2:15 PM, Eli Dart <> wrote:
Hi Trey,

If you have root on the suspect box, run tcpdump during a test that fails and see what's going on.

Measurement tools are wonderful and helpful and valuable, but if things are busted enough that the tools can't run, sometimes you just have to watch the packets to figure out what's going on....
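A capture along those lines might look like the following; the interface name and filter are guesses based on the outputs later in this thread, so adjust them to the test that is actually failing:

# tcpdump -i p1p1 -n -w /tmp/iperf-fail.pcap 'host psonar-owamp.brazos.tamu.edu and tcp port 5001'   # -n: no DNS, -w: save packets to a file

The saved pcap can then be read back with tcpdump -r (or opened in a GUI analyzer) to look for retransmissions, ICMP fragmentation-needed messages, or one-sided traffic.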

Eli



On Tue, Jan 27, 2015 at 11:01 AM, Trey Dockendorf <> wrote:
Transferring a 4.3GB file fails... very bizarre.  The transfer starts, then scp reports "stalled".

The failure is between the 2 perfsonar boxes.  Transferring to a perfsonar box from a host on our campus LAN and not on our science DMZ works at the expected 1Gbps rate.

Transferring from another science DMZ host (stock CentOS) to the perfsonar box fails.

Transferring from science DMZ to science DMZ, both stock CentOS boxes, works.  So it's only the interactions with the perfsonar hosts that fail.

I suspect something local is broken and it just so happens to be affecting interactions to/from my perfsonar boxes.

If there are any suggestions on how to debug this I'd be glad to hear them, but it seems like something is broken on our local network.

Thanks,
- Trey

=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Phone: (979)458-2396 
Email:  
Jabber:

On Tue, Jan 27, 2015 at 12:46 PM, Aaron Brown <> wrote:
Hey Trey,

That is bizarre. Could you try scp’ing a large file between the two hosts?
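A minimal version of that test, with a hypothetical file name and size, might be:

$ dd if=/dev/zero of=/tmp/scp-test.bin bs=1M count=4096   # create a ~4 GB test file
$ scp /tmp/scp-test.bin psonar-owamp.brazos.tamu.edu:/tmp/   # run from psonar-bwctl, then repeat in the other direction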

Cheers,
Aaron

On Jan 27, 2015, at 1:00 PM, Trey Dockendorf <> wrote:

Yes, it occurs in both directions between psonar-bwctl.brazos.tamu.edu and psonar-owamp.brazos.tamu.edu.

If I run the server side on either of those boxes and try to run the client from a plain CentOS host, the tests also don't work.

These systems are both stock net installs of PS 3.4.

- Trey

=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Phone: (979)458-2396 
Email:  
Jabber:

On Tue, Jan 27, 2015 at 8:45 AM, Aaron Brown <> wrote:
Hey Trey,

Does this happen in both directions?

Cheers,
Aaron

On Jan 26, 2015, at 7:46 PM, Trey Dockendorf <> wrote:

As the remote end was not set up by me I can't test against that particular host, but I've instead tried iperf between my latency PS host and bandwidth PS host and got the same results.  Below are results using iperf3 and nuttcp.  With iperf the connection seemed to stall at first and then produce 0 bits/sec messages, but with iperf3 it rapidly printed the interval lines.

psonar-bwctl:
# iperf3 -p 5001 -s

psonar-owamp:
# iperf3 -c psonar-bwctl.brazos.tamu.edu -p 5001 -i 1
Connecting to host psonar-bwctl.brazos.tamu.edu, port 5001
[  4] local 165.91.55.4 port 60629 connected to 165.91.55.6 port 5001
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  87.4 KBytes   715 Kbits/sec    2   26.2 KBytes
[  4]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   26.2 KBytes
[  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   26.2 KBytes
[  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    1   26.2 KBytes
[  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0   26.2 KBytes
[  4]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    0   26.2 KBytes
[  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   26.2 KBytes
[  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0   26.2 KBytes
[  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   26.2 KBytes
[  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   26.2 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  87.4 KBytes  71.6 Kbits/sec    5             sender
[  4]   0.00-10.00  sec  0.00 Bytes  0.00 bits/sec                  receiver


Couldn't quite make nuttcp work:

psonar-bwctl:
# nuttcp -1

psonar-owamp:
nuttcp_mread: Bad file descriptor
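For reference, a typical nuttcp invocation pair would be roughly the following (a sketch based on nuttcp's usual defaults of control port 5000 and data port 5001):

psonar-bwctl:
# nuttcp -1   # one-shot server

psonar-owamp:
# nuttcp -i1 -T10 psonar-bwctl.brazos.tamu.edu   # report every second for a 10-second test

The "Bad file descriptor" above is consistent with the data connection itself failing rather than the command being malformed, which matches the path-MTU problem identified at the top of this thread.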

=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Phone: (979)458-2396 
Email:  
Jabber:

On Mon, Jan 26, 2015 at 6:33 PM, Brian Tierney <> wrote:

I'm not aware of anything that might explain that. Can you try iperf3 and nuttcp to see if they behave the same?



On Mon, Jan 26, 2015 at 12:22 PM, Trey Dockendorf <> wrote:
Right now most of the endpoints I'm testing against don't seem to work with bwctl, so to help some colleagues I've been trying to use iperf by itself.  From my perfsonar boxes the iperf tests seem to do nothing, while a non-perfsonar host with iperf installed from EPEL gives the expected output.  The results are below.  I've removed the remote information, as these tests were not against a perfsonar box but against a remote site's cluster login node.

Is there something about iperf on a PS host that could cause this issue?  The two systems below are on the same network and the same core switch.

PERFSONAR:
# iperf -c <REMOTE HOST> -p 50100 -t 20 -i 1
------------------------------------------------------------
Client connecting to <REMOTE HOST>, TCP port 50100
TCP window size: 92.6 KByte (default)
------------------------------------------------------------
[  3] local 165.91.55.6 port 33135 connected with <REMOTE IP> port 50100
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   175 KBytes  1.43 Mbits/sec
[  3]  1.0- 2.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  2.0- 3.0 sec  0.00 Bytes  0.00 bits/sec
<Repeated 100s of times with 0.00 bits/sec>
[  3] 931.0-932.0 sec  0.00 Bytes  0.00 bits/sec

NON-PERFSONAR:

$ iperf -c <REMOTE HOST> -p 50100 -t 20 -i 1
------------------------------------------------------------
Client connecting to <REMOTE HOST>, TCP port 50100
TCP window size: 92.6 KByte (default)
------------------------------------------------------------
[  3] local 165.91.55.28 port 51252 connected with <REMOTE IP> port 50100
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  60.5 MBytes   508 Mbits/sec
[  3]  1.0- 2.0 sec  30.5 MBytes   256 Mbits/sec
[  3]  2.0- 3.0 sec  32.9 MBytes   276 Mbits/sec
[  3]  3.0- 4.0 sec  36.2 MBytes   304 Mbits/sec
[  3]  4.0- 5.0 sec  36.6 MBytes   307 Mbits/sec
[  3]  5.0- 6.0 sec  25.1 MBytes   211 Mbits/sec
[  3]  6.0- 7.0 sec  27.2 MBytes   229 Mbits/sec
[  3]  7.0- 8.0 sec  33.5 MBytes   281 Mbits/sec
[  3]  8.0- 9.0 sec  32.2 MBytes   271 Mbits/sec
[  3]  9.0-10.0 sec  31.9 MBytes   267 Mbits/sec
[  3] 10.0-11.0 sec  24.9 MBytes   209 Mbits/sec
[  3] 11.0-12.0 sec  29.9 MBytes   251 Mbits/sec
[  3] 12.0-13.0 sec  36.8 MBytes   308 Mbits/sec
[  3] 13.0-14.0 sec  33.0 MBytes   277 Mbits/sec
[  3] 14.0-15.0 sec  21.6 MBytes   181 Mbits/sec
[  3] 15.0-16.0 sec  16.6 MBytes   139 Mbits/sec
[  3] 16.0-17.0 sec  22.0 MBytes   185 Mbits/sec
[  3] 17.0-18.0 sec  23.6 MBytes   198 Mbits/sec
[  3] 18.0-19.0 sec  23.5 MBytes   197 Mbits/sec
[  3] 19.0-20.0 sec  19.2 MBytes   161 Mbits/sec
[  3]  0.0-20.0 sec   598 MBytes   250 Mbits/sec

Thanks,
- Trey

=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Email:  
Jabber:



--
Brian Tierney, http://www.es.net/tierney
Energy Sciences Network (ESnet), Berkeley National Lab
http://fasterdata.es.net

--
Eli Dart, Network Engineer                          NOC: (510) 486-7600
ESnet Office of the CTO (AS293)                          (800) 333-7638
Lawrence Berkeley National Laboratory 
PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3