ndt-users - Re: deciphering NDT report

  • From: Richard Carlson <>
  • To: ,
  • Subject: Re: deciphering NDT report
  • Date: Tue, 11 Apr 2006 09:04:29 -0400

Hi Maxim;

That's part of the bottleneck link detection algorithm. The NDT server quantizes the results of a packet-pair measurement technique to determine the bottleneck link info. The numbers in your report are the results of those tests.

The long answer follows.
The NDT tries to point out why the test returned the reported results. One critical question is "how fast can I transfer data?", and that sometimes depends on the bottleneck link speed. For example, assume you have a GigE NIC and switch in your LAN, but somehow the uplink got connected to a FastE switch port. No matter how hard you try, you won't get over 100 Mbps. Or suppose the test shows you got 8 Mbps: was this because you crossed a 10 Mbps Ethernet link, or for some other reason?

Given these questions, I decided that the NDT server should try to determine what the path bottleneck link is. I also decided that in this environment (R&E networks) it would be uncommon to see fractional links, and that bonded links are all designed to keep each flow on a single segment to prevent TCP reordering problems. That means the NDT server only needs to find full-size links, e.g., T3 (45 Mbps) or FastE (100 Mbps).

The NDT solution is to use packet-pair timing to determine the bottleneck link type. As packets are processed by the libpcap routine, they are timestamped. At the beginning of each throughput test the client opens a new TCP connection back to the server. The server then spawns a child process with a libpcap filter to look for this connection's traffic. Once the filter is in place, the server signals the client to begin the test process.

Every packet, both data and Ack, in each throughput test is examined. Each pair of consecutive packets is used to calculate an instantaneous speed (packet size divided by the time between the two timestamps). That result is then quantized into one of 12 different link type bins (Dialup through 10 Gbps). Since TCP will try to create packet trains (multiple back-to-back packets sent in an RTT window), the chances are high that at some point the bottleneck link will be probed. This means that the NDT usually puts packets in multiple bins, and I've got a very simple algorithm to interpret these results. So the numbers you list below are the quantized bins that encode the link type.
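The packet-pair step described above can be sketched roughly as follows. This is only an illustration, not the code in web100srv.c: the nominal rates in LINK_RATES_MBPS, the "nearest rate on a log scale" rule in quantize(), and all function names are assumptions made for the example. The message does not say how NDT draws its bin boundaries or how it interprets the bin totals.

```python
import math
from collections import Counter

# Nominal full-link capacities in Mbps.
# ASSUMPTION: illustrative values only, not the NDT server's own bin table.
LINK_RATES_MBPS = {
    "Dial-up": 0.056,
    "T1": 1.544,
    "Ethernet": 10.0,
    "T3": 45.0,
    "FastE": 100.0,
    "OC-12": 622.0,
    "GigE": 1000.0,
    "OC-48": 2488.0,
    "10 GigE": 10000.0,
}


def pair_speed_mbps(size_bytes, t_first, t_second):
    """Speed implied by two back-to-back packets (timestamps in seconds).

    size_bytes is the size of the second packet of the pair.
    Returns None when the gap is not positive, since no speed can be derived.
    """
    gap = t_second - t_first
    if gap <= 0:
        return None
    return size_bytes * 8 / gap / 1e6


def quantize(speed_mbps):
    """Return the name of the nominal rate closest to speed_mbps.

    ASSUMPTION: closeness is measured on a log scale; NDT may use fixed ranges.
    """
    return min(
        LINK_RATES_MBPS,
        key=lambda name: abs(math.log(speed_mbps / LINK_RATES_MBPS[name])),
    )


def bin_packets(packets):
    """Count quantized link types over consecutive packet pairs.

    packets is a list of (timestamp_seconds, size_bytes) in capture order.
    """
    bins = Counter()
    for (t0, _), (t1, size) in zip(packets, packets[1:]):
        speed = pair_speed_mbps(size, t0, t1)
        if speed is not None:
            bins[quantize(speed)] += 1
    return bins
```

For instance, five 1500-byte packets timestamped 120 microseconds apart give four pairs at about 100 Mbps each, so bin_packets([(i * 120e-6, 1500) for i in range(5)]) puts all four in the "FastE" bin. A real trace spreads its pairs over several bins, which is why a separate interpretation step is needed.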

Looking at the src/web100srv.c source code you will find my encoding structure. That 'switch' structure shows that 6 = OC-12, 7 = GigE, and 8 = OC-48.
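As a small sketch of how the codes in the report could be read back, the snippet below decodes the two report lines from the original question. Only the three codes named above (6, 7, 8) are filled in; the rest of the table is in src/web100srv.c and is deliberately left out here rather than guessed. The names KNOWN_LINK_CODES, REPORT_RE and decode_report are hypothetical helpers for this example, not part of NDT.

```python
import re

# Only the three codes stated in this message. The full table is in
# src/web100srv.c and is intentionally not reproduced here.
KNOWN_LINK_CODES = {6: "OC-12", 7: "GigE", 8: "OC-48"}

# Matches e.g. "Client Data reports link is ' 7'" and
# "Client Acks report link is ' 7'".
REPORT_RE = re.compile(r"(Client|Server) (Data|Acks) reports? link is '\s*(-?\d+)'")


def decode_report(text):
    """Map each 'Client/Server Data/Acks' entry to a link type name."""
    result = {}
    for side, kind, code in REPORT_RE.findall(text):
        code = int(code)
        result[f"{side} {kind}"] = KNOWN_LINK_CODES.get(code, f"unknown code {code}")
    return result


report = (
    "Client Data reports link is ' 7', Client Acks report link is ' 7'\n"
    "Server Data reports link is ' 8', Server Acks report link is ' 6'"
)
decoded = decode_report(report)
```

With the report quoted in this thread, decoded maps the client data and Ack entries to GigE, the server data entry to OC-48, and the server Ack entry to OC-12.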

There is a problem with this packet-pair measurement technique: the timestamps are put on the packet by the libpcap routines. This means that the timestamps are added when the kernel processes the packet, not when the packet enters or leaves the NIC. This can skew the results. For example, the "Server Data reports ..." entry usually shows a large number (OC-48 in this case). Since libpcap puts the timestamp on the packet when the kernel sees the packet, what gets measured for outgoing server data is the I/O bus speed instead of the network. So the NDT server tends to ignore these results. Another issue arises when a Gigabit Ethernet NIC uses interrupt coalescing to reduce CPU overhead. This means that packets are presented to the libpcap routine in bunches instead of when they arrive. This screws up the packet-pair timing. The solution here is to turn coalescing off.
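To illustrate why coalescing hurts, here is a toy simulation, not a model of any particular NIC or driver. It assumes packets are held until the last packet of a batch arrives and are then handed to the kernel back to back; batch_size and per_packet_cost are made-up parameters chosen for the example.

```python
def kernel_timestamps(arrivals, batch_size, per_packet_cost):
    """Simulate timestamps assigned when the kernel processes each packet.

    ASSUMPTION: the NIC holds packets until the last one of a batch arrives,
    then the kernel processes them per_packet_cost seconds apart.
    batch_size=1 models coalescing turned off.
    """
    stamps = []
    for start in range(0, len(arrivals), batch_size):
        batch = arrivals[start:start + batch_size]
        release = batch[-1]  # interrupt fires when the batch is complete
        stamps.extend(release + i * per_packet_cost for i in range(len(batch)))
    return stamps


def pair_speeds_mbps(stamps, size_bytes=1500):
    """Packet-pair speed in Mbps for each consecutive pair of timestamps."""
    return [
        size_bytes * 8 / (b - a) / 1e6
        for a, b in zip(stamps, stamps[1:])
        if b > a
    ]


# Eight 1500-byte packets arriving 120 microseconds apart (100 Mbps on the wire).
arrivals = [i * 120e-6 for i in range(8)]
without_coalescing = pair_speeds_mbps(kernel_timestamps(arrivals, 1, 2e-6))
with_coalescing = pair_speeds_mbps(kernel_timestamps(arrivals, 4, 2e-6))
```

In this toy case every pair in without_coalescing comes out at about 100 Mbps, while with_coalescing contains pairs of roughly 6000 Mbps inside each batch and one much slower pair across the batch boundary. Neither value reflects the 100 Mbps link, which is the kind of distortion described above.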

In summary, the NDT server uses packet-pair timing techniques to determine the instantaneous speed for each pair of arriving/departing packets. The results of this test are quantized into full link capacity bins (10, 45, 100, 622, ... Mbps). The totals for each quantized bin are then evaluated to determine the bottleneck link type. I'll also note that everything is stored in the log file (/usr/local/net/web100srv.log) so you can review what happened.

Rich

At 02:47 PM 4/10/2006, Maxim Grigoriev wrote:
Hi Rich,
I am trying to figure out what NDT tells me by reporting:

Client Data reports link is ' 7', Client Acks report link is ' 7'
Server Data reports link is ' 8', Server Acks report link is ' 6'

What are the ' 7' , ' 6' ???
Thanks,
Maxim Grigoriev
Fermilab

------------------------------------



Richard A. Carlson e-mail:

Network Engineer phone: (734) 352-7043
Internet2 fax: (734) 913-4255
1000 Oakbrook Dr; Suite 300
Ann Arbor, MI 48104


