perfsonar-user - [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?
Subject: perfSONAR User Q&A and Other Discussion
- From: SCHAER Frederic <>
- To: perfsonar-user <>
- Subject: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?
- Date: Wed, 5 Oct 2016 13:56:19 +0000
Hi,

I’m trying to determine whether we can really use all the available bandwidth on our paths (and whether the v6 bandwidth is equivalent to the v4 bandwidth, but never mind that for now).

I tried to run some transfers between my site and a few others (and I tried third-party transfers too), using the LHCONE network. According to the network traffic graphs here, the links are far from overloaded: of the 20 Gbit/s available, 6 to 8 Gbit/s are in use. My own connection is 10 Gbit/s only.

I set up a perfSONAR host with a 10 Gbit/s network card and plugged it into a Force10 switch, which is directly connected by 2x40 Gbit/s links to the main switch, itself connected to the (dedicated) router. So, that’s:

  PERFSONAR 10Gbits => SWITCH => 80Gbits => SWITCH => 10Gbits/s => Router => LHCONE+internet

With this setup, and with a relatively free network, I usually cannot reach more than 3 or 4 Gbit/s with a bwctl/iperf3 test, even when using as many as 80 parallel transfers. With a single transfer, bandwidth can be as low as 700 Mbit/s, and I’m seeing TCP retransmits in all cases.

A summary of this is the iperf3 output:

  [ 71]   0.00-30.00  sec   504 MBytes   141 Mbits/sec   334    sender
  [ 71]   0.00-30.00  sec   504 MBytes   141 Mbits/sec          receiver
  [ 73]   0.00-30.00  sec   514 MBytes   144 Mbits/sec   373    sender
  [ 73]   0.00-30.00  sec   513 MBytes   144 Mbits/sec          receiver
  [SUM]   0.00-30.00  sec  16.0 GBytes  4570 Mbits/sec  11105   sender
  [SUM]   0.00-30.00  sec  15.9 GBytes  4566 Mbits/sec          receiver
  CPU Utilization: local/sender 85.2% (4.3%u/80.9%s), remote/receiver 71.1% (2.9%u/68.1%s)

As you can see, there were 11K+ retransmits during the 30 s transfer. The command was:

  bwctl -4 -v -r -s <source> -c <destination> -t 30 -i 1 -T iperf3 -P 30

(in that case, the source was my host)

I’m therefore wondering where I could possibly be going wrong. I tried to optimize the kernel parameters according to the ESnet tuning guides, but this did not change much.

The destination host seems quite close thanks to LHCONE:

  rtt min/avg/max/mdev = 6.361/6.384/6.409/0.015 ms

The sysctl params are:

  net.core.rmem_max=134217728
  net.core.wmem_max=134217728
  net.ipv4.tcp_rmem=4096 87380 67108864
  net.ipv4.tcp_wmem=4096 65536 67108864
  net.core.netdev_max_backlog=250000
  net.ipv4.tcp_no_metrics_save=1
  net.ipv4.tcp_congestion_control=htcp
  net.ipv4.conf.all.arp_ignore=1
  net.ipv4.conf.all.arp_announce=2
  net.ipv4.conf.default.arp_filter=1
  net.ipv4.conf.all.arp_filter=1
  net.ipv4.tcp_max_syn_backlog=30000
  net.ipv4.conf.all.accept_redirects=0
  net.ipv4.udp_rmem_min=8192
  net.ipv4.tcp_tw_recycle=1
  net.core.rmem_default=67108864
  net.ipv4.tcp_tw_reuse=1
  net.core.optmem_max=134217728
  net.ipv4.tcp_slow_start_after_idle=0
  net.core.wmem_default=67108864
  net.ipv4.conf.all.send_redirects=0
  net.ipv4.conf.all.accept_source_route=0
  net.ipv4.tcp_mtu_probing=1
  net.core.somaxconn=1024
  net.ipv4.tcp_max_tw_buckets=2000000
  vm.vfs_cache_pressure=1
  net.ipv4.tcp_fin_timeout=10
  net.ipv4.udp_wmem_min=8192

Any idea why the iperf3 transfers do not reach high bandwidth, even with a single thread?

Of course, I don’t know where the destination perfSONAR hosts sit behind their routers (far or near, behind loaded networks or not…), and I know that the 20 Gbit/s link is a 2x10. But even with that in mind, shouldn’t a 2-thread transfer be able to use a lot of bandwidth, not just 4 Gbit/s in the best case?

Also, why are there TCP retransmits when the links aren’t loaded (according to the network graphs, I don’t have access to the NOC interfaces counters ;) )?

Ideas?

Regards
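
As a rough sanity check on the buffer sizes quoted above, the bandwidth-delay product (BDP) of the path can be compared with the TCP buffer maximum. The sketch below uses only figures from the message (10 Gbit/s NIC, 6.384 ms average RTT, 67108864-byte tcp_rmem/tcp_wmem maximum, 4570 Mbit/s over 30 streams). The function names are illustrative and not part of any tool, and the usable TCP window on Linux is somewhat smaller than the raw buffer size, so the ratio is an upper bound.

```python
# Sketch: compare the path's bandwidth-delay product with the TCP buffer limit.
# All constants come from the message; function names are hypothetical.

LINK_RATE_BPS = 10e9            # 10 Gbit/s network card
RTT_S = 6.384e-3                # average RTT reported by ping
TCP_BUF_MAX_BYTES = 67108864    # third field of net.ipv4.tcp_rmem / tcp_wmem
SUM_RATE_MBPS = 4570            # [SUM] sender rate from the iperf3 output
STREAMS = 30                    # -P 30 in the bwctl command


def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a path of this rate and RTT full."""
    return rate_bps * rtt_s / 8


def window_limited_rate_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on one TCP stream's throughput for a given window size."""
    return window_bytes * 8 / rtt_s


bdp = bdp_bytes(LINK_RATE_BPS, RTT_S)
print(f"BDP at 10 Gbit/s: {bdp / 1e6:.2f} MB")
print(f"Buffer max / BDP: {TCP_BUF_MAX_BYTES / bdp:.1f}x")
print(f"Window-limited ceiling with max buffer: "
      f"{window_limited_rate_bps(TCP_BUF_MAX_BYTES, RTT_S) / 1e9:.1f} Gbit/s")
print(f"Average per-stream rate observed: {SUM_RATE_MBPS / STREAMS:.0f} Mbit/s")
```

The BDP comes out at roughly 8 MB, well under the 64 MiB buffer maximum, so under these assumptions the configured buffer limits alone would not explain a 700 Mbit/s single stream. The retransmits and the high system CPU share in the iperf3 summary look like more promising places to investigate.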
- [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, SCHAER Frederic, 10/05/2016
- Re: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, Shawn McKee, 10/05/2016
- Re: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, Philip Papadopoulos, 10/05/2016
- RE: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, SCHAER Frederic, 10/05/2016
- RE: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, SCHAER Frederic, 10/05/2016
- Re: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, Shawn McKee, 10/05/2016
- Re: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, Philip Papadopoulos, 10/05/2016
- Re: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result between 10gbits hosts ?, Jason Zurawski, 10/05/2016
- Re: [perfsonar-user] 10gbits perfsonar: how to interpret this iperf3 result bewteen 10gbits hosts ?, Shawn McKee, 10/05/2016