ndt-users - Re: Duplex mismatch detection not really working
- From: Richard Carlson <>
- To: ,
- Subject: Re: Duplex mismatch detection not really working
- Date: Mon, 30 Oct 2006 14:19:43 -0500
Hi J.
I'm the developer of the NDT software and I can provide you with some details about how things currently work.
First off, let me say that Duplex mismatch detection has proven to be a difficult problem to solve. There are both technical and non-technical problems that need to be overcome before the current detection signature can be improved.
The current detection signature is one that I developed after running a series of tests in a lab about 4 years ago. As reported in the source code, this signature will report a duplex mismatch if the following conditions are true.
More than 90% of the time the connection was CWND limited
The estimated bandwidth was over 2 Mbps
More than 2 packets were retransmitted every second
The connection left SlowStart and entered Congestion Avoidance
The connection was idle more than 1% of the test time
and the client was not behind a DSL or Cable Modem link.
If all 6 of these conditions are true, the Duplex Mismatch message will be written.
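The six conditions above can be sketched as a single predicate. This is only an illustration of the logic as described in this message, not the NDT source code; the `NdtSummary` type and all of its field names are hypothetical stand-ins for values that would be derived from the Web100 counters.

```python
from dataclasses import dataclass


@dataclass
class NdtSummary:
    """Hypothetical per-test summary; the field names are illustrative, not NDT's."""
    cwnd_limited_fraction: float      # share of test time spent CWND limited (0..1)
    estimated_bandwidth_mbps: float   # estimated bandwidth in Mbps
    retransmits_per_second: float     # retransmitted packets per second
    entered_congestion_avoidance: bool
    idle_fraction: float              # share of test time the connection was idle (0..1)
    behind_dsl_or_cable: bool         # client sits behind a DSL or Cable Modem link


def looks_like_duplex_mismatch(t: NdtSummary) -> bool:
    """Return True only when all six conditions listed above hold."""
    return (
        t.cwnd_limited_fraction > 0.90
        and t.estimated_bandwidth_mbps > 2.0
        and t.retransmits_per_second > 2.0
        and t.entered_congestion_avoidance
        and t.idle_fraction > 0.01
        and not t.behind_dsl_or_cable
    )
```

Because every condition must hold, a single miss (for example, a connection that is CWND limited only 85% of the time) suppresses the message, which is one way a real mismatch can go unreported.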
While this is the current signature, there has been more work done in this area. This has led to a realization that not only does a duplex mismatch cause problems, but those problems are asymmetric. This means that the detection signature depends on both the mismatch direction (switch & host settings) and on the data flow direction. The conclusion is that a revised mismatch detection signature must take these conditions into account, and at least 2 different detection signatures must be used to identify both mismatch directions.
Internet2 is currently trying to resolve some non-technical issues that prevent us from updating this detection signature. Once these issues are resolved, I expect to release a new version that will perform better over a wide range of conditions.
In the meantime, there are a few things you can do manually.
1) Look at the measured throughput in both directions. A duplex mismatch will cause an asymmetric throughput. So will a normally operating Cable Modem connection, but you should be able to manually determine the difference.
2) Look at the number of timeouts and retransmissions as reported on the "More Details" page. If the mismatch condition results from the client=full & switch=half state, then ACK packets will be lost. This will usually result in the server being starved for ACKs, leading to a timeout and retransmission. A connection that spends most of its time in the idle state (as reported on the Statistics page) should be investigated for a mismatch condition. If the mismatch condition results from the client=half & switch=full state, then data packets will be lost. The NDT server will receive a large number of duplicate ACKs, causing the server to fast retransmit the missing segment. The successful reception will cause a 'jump' in the receive window as the missing holes are filled in.
3) In both cases the connection will usually be congestion window limited (CurCWND is a small number of segments). This is due to the interaction between TCP/IP packets and Ethernet frames. A collision will only occur if both host & switch have a frame ready to send when the previous frame ends. This typically means that 3-5 back-to-back packets must be sent into the network. Once the loss is detected, the TCP congestion avoidance procedures will take over, reducing the CWND value and limiting the number of packets the server can send. This limit will be slowly raised until another collision occurs and a loss is detected.
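The manual checks above can be roughed out as a small helper that turns the numbers read off the NDT result pages into hints. This is a sketch under stated assumptions, not part of NDT: the function name, its parameters, and both thresholds (`asymmetry_ratio`, `idle_threshold`) are hypothetical and would need tuning against real tests.

```python
def manual_mismatch_hints(c2s_mbps, s2c_mbps, timeouts, fast_retransmits,
                          idle_fraction, asymmetry_ratio=3.0, idle_threshold=0.5):
    """Return a list of hints based on the manual checks described above.

    All thresholds are illustrative assumptions, not values taken from NDT.
    """
    hints = []

    # Check 1: strongly asymmetric throughput (also normal on Cable Modem links).
    slow, fast = sorted((c2s_mbps, s2c_mbps))
    if fast > 0 and (slow == 0 or fast / slow >= asymmetry_ratio):
        hints.append("asymmetric throughput")

    # Check 2: lost ACKs starve the server, so timeouts and idle time dominate;
    # lost data packets produce duplicate ACKs, so fast retransmits dominate.
    if timeouts > fast_retransmits and idle_fraction >= idle_threshold:
        hints.append("timeouts and idle time dominate: check client=full, switch=half")
    elif fast_retransmits > timeouts:
        hints.append("fast retransmits dominate: check client=half, switch=full")

    return hints


# Example: a mostly idle connection with many timeouts and lopsided throughput.
print(manual_mismatch_hints(1.0, 90.0, timeouts=40, fast_retransmits=2,
                            idle_fraction=0.7))
```

An empty list only means none of these rough checks fired; it does not rule out a mismatch.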
In conclusion, the current Duplex Mismatch detection signature is known to be sub-standard. It should work OK in a campus network environment, and it should detect and report mismatches when the switch=full and the client=half (a state that will occur if the switch is nailed down to full duplex and the client is left in auto-negotiate state). Better detection signatures have been developed, but there are non-technical issues preventing them from being deployed. Manually looking at the NDT results, stored on the server in the web100srv.log file or emailed from the client, can be used in the interim. In your test case, try reversing the mismatch condition; there is a better probability of the NDT server correctly detecting that mismatch condition.
I hope this helps.
Regards
Rich
At 03:32 PM 10/26/2006,
wrote:
I have a question about the duplex mismatch detection in NDT. I've yet to see it ever detect a real duplex mismatch. I've tried creating a duplex mismatch on purpose by setting my laptop's eth port to 100/full and leaving the switch port at Auto (resulting in 100/Half on the switch). In this config, an NDT test will report "Packet queuing detected" but nothing about a duplex mismatch, even though I know there is one there. The same thing happens when a customer with a duplex mismatch runs an NDT test.
We are a large colocation/hosting provider and one of the common issues we have here is duplex mismatches between our switches and the customer's equipment. One of the main reasons we installed an NDT server was to help customers troubleshoot duplex mismatch issues themselves.
Can anyone shed a little light on exactly how NDT detects duplex mismatches and if there is anything I can do to make the NDT server detect duplex mismatches? My NDT server is here:
http://ndt.viawest.net/
I'm currently running NDT 3.3.19 (I was previously running ndt-3.3.12, seemed to have the same issue). Other info that might be of use:
Web100 kernel patch version: web100-2.5.11
Web100 userland: 1.5
Kernel version: 2.6.17.11-web100
Linux distro: Fedora Core release 5
100Mb Full Duplex Ethernet connection from the server to switch
GigE datacenter backbone
------------------------------------
Richard A. Carlson e-mail:
Network Engineer phone: (734) 352-7043
Internet2 fax: (734) 913-4255
1000 Oakbrook Dr; Suite 300
Ann Arbor, MI 48104
- Duplex mismatch detection not really working, jpalmer, 10/26/2006
- Re: Duplex mismatch detection not really working, Richard Carlson, 10/30/2006