perfsonar-user - Re: [perfsonar-user] iperf not usable from perfsonar box, works from CentOS non-PS box
- From: Trey Dockendorf <>
- To: Matthew J Zekauskas <>
- Cc: perfsonar-user <>
- Subject: Re: [perfsonar-user] iperf not usable from perfsonar box, works from CentOS non-PS box
- Date: Tue, 27 Jan 2015 14:12:45 -0600
The test suggested by Azher fails.
[root@psonar-bwctl ~]# ping -Mdo -s 8972 165.91.55.4
PING 165.91.55.4 (165.91.55.4) 8972(9000) bytes of data.
<hang>
[root@psonar-bwctl ~]# ping -Mdo -s 1500 165.91.55.4
PING 165.91.55.4 (165.91.55.4) 1500(1528) bytes of data.
1508 bytes from 165.91.55.4: icmp_seq=1 ttl=64 time=0.189 ms
Both of these hosts are connected to the same Force10 switch. Could this MTU black hole be the result of a switch that is not correctly configured to handle an MTU of 9000?
Thanks,
- Trey
=============================
Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email:
Jabber:
On Tue, Jan 27, 2015 at 1:58 PM, Matthew J Zekauskas <> wrote:
[popping up head out of sand momentarily]
Without having read the whole thread in detail, this feels like an MTU black hole. If both ends think that the MTU is 9000 and successfully negotiate a large MSS, but some element in the middle drops large packets, then a session will start but hang as soon as a large packet is sent: it is retransmitted repeatedly and dropped every time.
The test Azher suggests (ping with large packets) is an easy way to see if large packets make it through successfully. It should be sufficient to do something large but not full sized (e.g. 8192) to see. However, if there is a tunnel in the middle, then the tunnel can also push a full size packet over the MTU, so if the large packet succeeds it probably still pays to try to do the math (or just test using some sort of search strategy) to send a full size packet and see if it makes it through.
I suppose using something like tracepath could also show this (although if there is a black hole then the trace would just fail in the middle somewhere instead of adjusting MTU lower).
--Matt
[back to sand]
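A minimal sketch of the "search strategy" mentioned above, assuming Linux iputils ping (the same `-M do` and `-s` options used elsewhere in this thread). The function names `probe_df` and `find_max_payload` are hypothetical, and the search assumes that if a given payload size gets through, every smaller size does too. For IPv4 the on-wire packet size is the payload plus 28 bytes (20 IP + 8 ICMP), which is why 8972 corresponds to 9000.

```shell
#!/usr/bin/env bash

# probe_df SIZE HOST
# Succeeds if one ping with SIZE payload bytes and the DF bit set gets a reply.
probe_df() {
    ping -M do -c 1 -W 2 -s "$1" "$2" >/dev/null 2>&1
}

# find_max_payload LO HI PROBE [PROBE_ARGS...]
# Binary-searches [LO, HI] for the largest size where "PROBE size PROBE_ARGS"
# succeeds. Prints that size, or -1 if no size in the range succeeds.
find_max_payload() {
    local lo=$1 hi=$2 probe=$3
    shift 3
    local best=-1 mid
    while [ "$lo" -le "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$probe" "$mid" "$@"; then
            best=$mid            # this size passed, try larger
            lo=$(( mid + 1 ))
        else
            hi=$(( mid - 1 ))    # this size was dropped, try smaller
        fi
    done
    echo "$best"
}

# Example (hypothetical invocation, needs network access):
#   payload=$(find_max_payload 0 8972 probe_df 165.91.55.4)
#   echo "largest DF payload: $payload, IPv4 packet size: $(( payload + 28 ))"
```

Passing the probe in as an argument keeps the search logic separate from ping, so it can be exercised with a stand-in probe before it is pointed at a real host.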
On 1/27/15 2:35 PM, Trey Dockendorf wrote:
Both ends are 9000 MTU
[root@psonar-bwctl ~]# ip link show p1p1
6: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether 90:e2:ba:2e:eb:50 brd ff:ff:ff:ff:ff:ff
[root@psonar-owamp ~]# ip link show p1p1
6: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether 90:e2:ba:2e:ea:04 brd ff:ff:ff:ff:ff:ff
As a test case this afternoon I will be moving both of these systems off my local Force10 switch, which links to the Science DMZ, and connecting them directly to our Science DMZ equipment. This will at least allow our local networking experts to better assist in debugging the problem and rule out a misconfiguration on my local Force10 switch.
Thanks,
- Trey
On Tue, Jan 27, 2015 at 1:29 PM, Shawn McKee <> wrote:
Is there an MTU mismatch between the hosts?
Sender at 9000, receiver at 1500, and PMTU discovery fails? The initial negotiation will use packets < 1500 bytes, but data packets would be > 1500 bytes, and if fragmentation is not allowed those packets are dropped.
Just a thought since you said 'The transfer started then scp reports "stalled".'
Shawn
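The sizes involved can be worked out with a little arithmetic. This is a sketch that assumes IPv4 with a 20-byte IP header and a 20-byte TCP header, no options; `mss_for_mtu` is a hypothetical helper name, not part of any tool discussed here.

```shell
#!/usr/bin/env bash

# mss_for_mtu MTU
# Prints the TCP MSS a host would derive from an interface MTU,
# assuming IPv4 and no IP or TCP options (20 + 20 bytes of headers).
mss_for_mtu() {
    local mtu=$1
    echo $(( mtu - 20 - 20 ))
}

mss_for_mtu 9000   # jumbo-frame interface: prints 8960
mss_for_mtu 1500   # standard Ethernet interface: prints 1460
```

Handshake packets carry no payload, so they fit through any hop. Only full-sized data segments reach the 9000-byte packet size, which would be consistent with a connection that opens normally and then stalls.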
On Tue, Jan 27, 2015 at 2:15 PM, Eli Dart <> wrote:
Hi Trey,
If you have root on the suspect box, run tcpdump during a test that fails and see what's going on.
Measurement tools are wonderful and helpful and valuable, but if things are busted enough that the tools can't run, sometimes you just have to watch the packets to figure out what's going on....
Eli
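One possible way to narrow such a capture, sketched under assumptions: the helper `mtu_debug_filter` is hypothetical, the interface name `p1p1` and port 5001 are taken from elsewhere in this thread, and the expression uses standard pcap filter syntax. It matches the test's TCP traffic plus any ICMP "fragmentation needed" messages (destination unreachable, code 4) exchanged with the peer.

```shell
#!/usr/bin/env bash

# mtu_debug_filter PEER PORT
# Prints a pcap filter expression matching TCP traffic on PORT to or from
# PEER, plus ICMP destination-unreachable / fragmentation-needed messages.
mtu_debug_filter() {
    local peer=$1 port=$2
    echo "host ${peer} and (tcp port ${port} or (icmp[icmptype] == icmp-unreach and icmp[icmpcode] == 4))"
}

# Example (hypothetical invocation, needs root):
#   tcpdump -n -i p1p1 "$(mtu_debug_filter 165.91.55.4 5001)"
```

With a true black hole no ICMP message would be expected at all. The capture would more likely show the same large segment being retransmitted with no acknowledgement coming back.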
--
On Tue, Jan 27, 2015 at 11:01 AM, Trey Dockendorf <> wrote:
Transferring a 4.3GB file fails...very bizarre. The transfer started then scp reports "stalled".
The failure is between the 2 perfsonar boxes. Transferring to a perfsonar box from a host on our campus LAN and not on our science DMZ works at the expected 1Gbps rate.
Transferring from another science DMZ host (stock CentOS) to the perfsonar box fails.
Transferring from science DMZ to science DMZ, both stock CentOS boxes, works. So it's only the interactions with perfsonar host that fail.
I suspect something local is broken and it just so happens to be affecting traffic to and from my perfsonar boxes.
If there are any suggestions on how to debug this I'd be glad to hear them, but it seems like something is broken on our local network.
Thanks,
- Trey
On Tue, Jan 27, 2015 at 12:46 PM, Aaron Brown <> wrote:
Hey Trey,
That is bizarre. Could you try scp’ing a large file between the two hosts?
Cheers,
Aaron
On Jan 27, 2015, at 1:00 PM, Trey Dockendorf <> wrote:
Yes, it occurs in both directions between psonar-bwctl.brazos.tamu.edu and psonar-owamp.brazos.tamu.edu.
If I run the server side on either of those boxes and try and run the client from a plain CentOS host the tests also don't work.
These systems are both stock net installs of PS 3.4
- Trey
On Tue, Jan 27, 2015 at 8:45 AM, Aaron Brown <> wrote:
Hey Trey,
Does this happen in both directions?
Cheers,
Aaron
On Jan 26, 2015, at 7:46 PM, Trey Dockendorf <> wrote:
As the remote end was not set up by me I can't test this particular host, but instead I've tried iperf between my latency PS host and bandwidth PS host and got the same results. Below are results using iperf3 and nuttcp. With iperf the connection seemed to stall at first and then produce 0 bits/sec messages, but with iperf3 it rapidly printed the interval lines.
psonar-bwctl:# iperf3 -p 5001 -s
psonar-owamp:# iperf3 -c psonar-bwctl.brazos.tamu.edu -p 5001 -i 1
Connecting to host psonar-bwctl.brazos.tamu.edu, port 5001
[ 4] local 165.91.55.4 port 60629 connected to 165.91.55.6 port 5001
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 87.4 KBytes 715 Kbits/sec 2 26.2 KBytes
[ 4] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec 1 26.2 KBytes
[ 4] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 0 26.2 KBytes
[ 4] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec 1 26.2 KBytes
[ 4] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec 0 26.2 KBytes
[ 4] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec 0 26.2 KBytes
[ 4] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec 1 26.2 KBytes
[ 4] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec 0 26.2 KBytes
[ 4] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec 0 26.2 KBytes
[ 4] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec 0 26.2 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 87.4 KBytes 71.6 Kbits/sec 5 sender
[ 4] 0.00-10.00 sec 0.00 Bytes 0.00 bits/sec receiver
Couldn't quite make nuttcp work:
psonar-bwctl:# nuttcp -1
psonar-owamp:# nuttcp -b psonar-bwctl.brazos.tamu.edu
nuttcp_mread: Bad file descriptor
On Mon, Jan 26, 2015 at 6:33 PM, Brian Tierney <> wrote:
I'm not aware of anything that might explain that. Can you try iperf3 and nuttcp to see if they behave the same?
--
On Mon, Jan 26, 2015 at 12:22 PM, Trey Dockendorf <> wrote:
Right now most of the endpoints I'm testing against don't seem to work with bwctl, so to help some colleagues I've been trying to use iperf by itself. From my perfsonar boxes the iperf tests seem to do nothing, while a non-perfsonar host with iperf installed from EPEL gives the expected output. The results are below. I've removed the remote information, as these tests were not against a perfsonar box but against a remote site's cluster login node.
Is there something about iperf on a PS host that could cause this issue? The 2 systems below are on the same network and the same core switch.
PERFSONAR:
# iperf -c <REMOTE HOST> -p 50100 -t 20 -i 1
------------------------------------------------------------
Client connecting to <REMOTE HOST>, TCP port 50100
TCP window size: 92.6 KByte (default)
------------------------------------------------------------
[ 3] local 165.91.55.6 port 33135 connected with <REMOTE IP> port 50100
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 175 KBytes 1.43 Mbits/sec
[ 3] 1.0- 2.0 sec 0.00 Bytes 0.00 bits/sec
[ 3] 2.0- 3.0 sec 0.00 Bytes 0.00 bits/sec
<Repeated 100s of times with 0.00 bits/sec>
[ 3] 931.0-932.0 sec 0.00 Bytes 0.00 bits/sec
NON-PERFSONAR:
$ iperf -c <REMOTE HOST> -p 50100 -t 20 -i 1
------------------------------------------------------------
Client connecting to <REMOTE HOST>, TCP port 50100
TCP window size: 92.6 KByte (default)
------------------------------------------------------------
[ 3] local 165.91.55.28 port 51252 connected with <REMOTE IP> port 50100
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 60.5 MBytes 508 Mbits/sec
[ 3] 1.0- 2.0 sec 30.5 MBytes 256 Mbits/sec
[ 3] 2.0- 3.0 sec 32.9 MBytes 276 Mbits/sec
[ 3] 3.0- 4.0 sec 36.2 MBytes 304 Mbits/sec
[ 3] 4.0- 5.0 sec 36.6 MBytes 307 Mbits/sec
[ 3] 5.0- 6.0 sec 25.1 MBytes 211 Mbits/sec
[ 3] 6.0- 7.0 sec 27.2 MBytes 229 Mbits/sec
[ 3] 7.0- 8.0 sec 33.5 MBytes 281 Mbits/sec
[ 3] 8.0- 9.0 sec 32.2 MBytes 271 Mbits/sec
[ 3] 9.0-10.0 sec 31.9 MBytes 267 Mbits/sec
[ 3] 10.0-11.0 sec 24.9 MBytes 209 Mbits/sec
[ 3] 11.0-12.0 sec 29.9 MBytes 251 Mbits/sec
[ 3] 12.0-13.0 sec 36.8 MBytes 308 Mbits/sec
[ 3] 13.0-14.0 sec 33.0 MBytes 277 Mbits/sec
[ 3] 14.0-15.0 sec 21.6 MBytes 181 Mbits/sec
[ 3] 15.0-16.0 sec 16.6 MBytes 139 Mbits/sec
[ 3] 16.0-17.0 sec 22.0 MBytes 185 Mbits/sec
[ 3] 17.0-18.0 sec 23.6 MBytes 198 Mbits/sec
[ 3] 18.0-19.0 sec 23.5 MBytes 197 Mbits/sec
[ 3] 19.0-20.0 sec 19.2 MBytes 161 Mbits/sec
[ 3] 0.0-20.0 sec 598 MBytes 250 Mbits/sec
Thanks,
- Trey
Brian Tierney, http://www.es.net/tierney
Energy Sciences Network (ESnet), Berkeley National Lab
http://fasterdata.es.net
Eli Dart, Network Engineer NOC: (510) 486-7600
ESnet Office of the CTO (AS293) (800) 333-7638
Lawrence Berkeley National Laboratory
PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3