ndt-users - Re: using "analyze"

From: Richard Carlson <>
To: ,
Subject: Re: using "analyze"
Date: Wed, 13 Sep 2006 14:35:36 -0400

Hi Michael;

Well, the analyze program is something I wrote for myself to help wade through the log files, so I'm not sure there is anything in it that would interest the typical sys-admin. I included it in the distribution because it was already written and somebody might find it useful. I should also note that I tend to put stuff in and never remove it, so old cruft can get left lying around.

To answer your specific points.
At 01:55 PM 9/13/2006, wrote:

I don't seem to see any support documents for analyze. Specifically, something that would help to understand:

Speed-chk says {5, 5, 5, 5}

The bottleneck link detection scheme uses packet-pair dynamics on both of the TCP speed tests. A child process is forked off to handle this task, and the results are quantized and encoded as a number from -1 to 12. The number 5 translates into a Fast Ethernet bottleneck link. The 4 numbers are the detection guesses for the C2S data flow, the C2S ACKs, the S2C data flow, and the S2C ACKs.
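
As a rough illustration of the ordering described above, the four Speed-chk values could be labeled like this. This is only a sketch and is not taken from the analyze source: the function name and lookup table are hypothetical, and because only code 5 is explained in this message, the table holds just that one entry rather than guessing the rest.

```python
import re

# Order of the four values, as described in the paragraph above.
FLOWS = ("C2S data", "C2S ACKs", "S2C data", "S2C ACKs")

# Only code 5 is explained in this message; the other codes (-1 to 12)
# are deliberately left out rather than guessed.
KNOWN_LINKS = {5: "Fast Ethernet"}


def label_speed_chk(line):
    """Map each value in a 'Speed-chk says {a, b, c, d}' line to its flow."""
    inside = line.split("{", 1)[1]
    values = [int(v) for v in re.findall(r"-?\d+", inside)]
    if len(values) != len(FLOWS):
        raise ValueError("expected four values")
    return {
        flow: KNOWN_LINKS.get(code, f"code {code} (not described here)")
        for flow, code in zip(FLOWS, values)
    }


print(label_speed_chk("Speed-chk says {5, 5, 5, 5}"))
```

Codes other than 5 fall through to a placeholder string, so the sketch never claims a link type that this message does not document.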

Running average = {237.9, 77.5, 98.3, 99.1}
Long tail = {0, 0, 1, 1}
Distribution = {1, 1, 0, 0}

In addition to looking at the packet pairs, I've experimented with several other methods to see if there is a signature for various bottleneck links. The running average is a weighted average for the 4 packet-pair tests (both the forward data flow and the reverse ACK flow for each speed test). The tail and distribution numbers are meant to look at the statistical properties of how the quantized bins are being filled.

time spent {r=3.7% c=38.4% s=57.9%}
Is r = receiver (client), c = congestion and s = server?

Buffers = {r=65330, c=98690, s=16777216}
Transitions/sec = {r=24.2, c=88.0, s=112.2}

All of this comes from the web100 triage counters. These counters keep track of how much time the connection spends in the receiver window limited (r), sender window limited (s), and congestion window limited (c) states. All of these variables are captured when the NDT server is the source (sender). The web100 counters track the time spent and the number of bytes sent in each state. I can add the 3 time fields and come up with percentages.
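
The percentage calculation amounts to dividing each time field by the sum of the three. A minimal sketch, assuming the three times are already available as plain numbers in the same unit; the function name and the input values are hypothetical and were chosen only so that the result matches the percentages quoted above.

```python
def time_shares(rwin_time, cwnd_time, sender_time):
    """Return the percentage of time spent in each limited state."""
    total = rwin_time + cwnd_time + sender_time
    if total == 0:
        raise ValueError("no time recorded in any state")
    return {
        "r": 100 * rwin_time / total,
        "c": 100 * cwnd_time / total,
        "s": 100 * sender_time / total,
    }


# Hypothetical counter values, not taken from a real log.
shares = time_shares(370, 3840, 5790)
print("time spent {" + " ".join(f"{k}={v:.1f}%" for k, v in shares.items()) + "}")
```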

The transitions/second values show how fast the connection is changing state. Again, each one is normalized to a unit of time.

The Buffer values are the TCP RwinRcvd, CWND, and SndBuf values. In this case the client has a 64 KB window, the server has a 16 MB send buffer, and the congestion window maxed out at 96 KB.
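
To see how the raw byte counts relate to the rounded sizes quoted above, a small conversion helps. This is a sketch only; the helper name is hypothetical, and it assumes 1 KB = 1024 bytes and 1 MB = 1024 * 1024 bytes.

```python
def human_size(n_bytes):
    """Format a byte count as KB or MB using powers of 1024."""
    mb = 1024 * 1024
    if n_bytes >= mb:
        return f"{n_bytes / mb:.1f} MB"
    return f"{n_bytes / 1024:.1f} KB"


# Values from the Buffers line quoted above.
buffers = {"r": 65330, "c": 98690, "s": 16777216}
for name, size in buffers.items():
    print(name, human_size(size))
```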

Timeouts/sec = 1.9

Again, the web100 variables track how many times the server entered a timeout state. A rate of 1.9 per second tells me it did this 19 times over a 10-second test. The Statistics page also tells you the percentage of time the connection was idle due to timeouts.
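
Going from a per-second rate back to an event count is just a multiplication by the test duration. A minimal sketch, assuming the 10-second test length mentioned above; the function name is hypothetical.

```python
def event_count(rate_per_sec, duration_sec):
    """Recover a whole-number event count from a per-second rate."""
    return round(rate_per_sec * duration_sec)


# Timeouts/sec = 1.9 over a 10-second test.
print(event_count(1.9, 10))
```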

Thanks for your help. Can't tell you how much we use this tool.

So I created the analyze program, and use it, to post-process the log files and see if my detection signatures need tweaking. Most of the time you wouldn't need to bother with this, but if you do, or if you want to experiment with other signatures, then the analyze program can make it easier to wade through the log files.

That said, I'll attempt to write some documentation about this tool, but it's not a high priority for me right now.

I hope this helps.
Rich

------------------------------------

Richard A. Carlson e-mail:

Network Engineer phone: (734) 352-7043
Internet2 fax: (734) 913-4255
1000 Oakbrook Dr; Suite 300
Ann Arbor, MI 48104

using "analyze", michael . kidwell, 09/13/2006
- Re: using "analyze", Richard Carlson, 09/13/2006
