perfsonar-user - [perfsonar-user] 100G testing configuration
Subject: perfSONAR User Q&A and Other Discussion
- From: "ELGER, NATHAN" <>
- To: "" <>
- Subject: [perfsonar-user] 100G testing configuration
- Date: Tue, 16 Jun 2020 18:23:01 +0000
Hello there,
We've recently purchased a few 100G hosts to use for perfSONAR testing across our campus 100G network, as well as against external perfSONAR hosts in the R&E space. These are high-spec Dell machines using Mellanox ConnectX-6 adapters configured for Ethernet. I have loaded the latest perfSONAR images and configured the servers with the ESnet-recommended tuning parameters for 100G testing. I have also applied the BIOS tuning for HPC workloads recommended by Dell, and updated the kernel to change the congestion control algorithm to BBR.
My issue is that with any scheduled throughput test to a 100G host, such as TACC, we see only around 20 Gbit/s in the forward direction, with very asymmetrical speeds back to us of around 5 Gbit/s. Some tests to 40 Gbit/s hosts at other institutions show even worse asymmetry: a test to Georgia Tech gets about 200 Mbit/s back to us, and we should both be going over I2. Even with a scheduled throughput test between my two perfSONAR hosts on the same switch, I see only ~15 Gbit/s between them. I have scripted 8 parallel iperf3 tests and, although there are a lot of retransmissions, I do see a total of around 50-60 Gbit/s over the 30-second test period.
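For illustration only, the parallel iperf3 approach described above could be sketched roughly as follows. This is not the script used in the tests reported here. The function names, the default base port of 5201, and the one-port-per-stream layout are assumptions, and the sketch presumes the far end already runs one `iperf3 -s` listener per port, since a single iperf3 server handles one test at a time.

```python
import json
import subprocess


def build_commands(server, streams=8, base_port=5201, seconds=30):
    """Build one iperf3 client command per stream, each on its own port."""
    return [
        ["iperf3", "-c", server, "-p", str(base_port + i),
         "-t", str(seconds), "-J"]  # -J asks iperf3 for a JSON report
        for i in range(streams)
    ]


def aggregate(reports):
    """Sum receiver-side throughput (Gbit/s) and sender retransmits."""
    total_bps = sum(r["end"]["sum_received"]["bits_per_second"]
                    for r in reports)
    retransmits = sum(r["end"]["sum_sent"].get("retransmits", 0)
                      for r in reports)
    return total_bps / 1e9, retransmits


def run_parallel(server, **kwargs):
    """Start all clients at once, then collect and aggregate their reports."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for cmd in build_commands(server, **kwargs)
    ]
    reports = [json.loads(p.communicate()[0]) for p in procs]
    return aggregate(reports)
```

Calling `run_parallel("remote.example.org")` (a placeholder hostname) would return the combined throughput in Gbit/s and the total retransmit count, which makes it easier to compare the aggregate against a single scheduled stream.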
Are these expected numbers for a 100G test over pScheduler? Is there some configuration I might be missing, or a better method for testing against external 100G hosts? Our previous perfSONAR machines were 10 Gbit/s and were used just to log network health between our HPC clusters and a few regional hosts, so I am somewhat of a novice at really pushing this 100G network.
I really appreciate any pointers!
Nathan Elger
UofSC Research Computing
HPC systems architect
1301 Gervais St. ste 750
- [perfsonar-user] 100G testing configuration, ELGER, NATHAN, 06/16/2020
- Re: [perfsonar-user] 100G testing configuration, Curtis, Bruce, 06/19/2020
- <Possible follow-up(s)>
- Re: [perfsonar-user] 100G testing configuration, Mark Feit, 06/16/2020