
ndt-users - Re: Am I seeing the right results?



  • From: Peter Van Epp <>
  • To:
  • Subject: Re: Am I seeing the right results?
  • Date: Sat, 29 Sep 2007 11:01:10 -0700

On Sat, Sep 29, 2007 at 11:14:40AM +0200, Simon Leinen wrote:
> Peter Van Epp writes:
> > The "slowest link is gig" is likely caused by the server NIC
> > card having interrupt reduction (it has a correct name but I don't
> > remember it :-)) on.
>
> (People seem to prefer the fancier names of "interrupt coalescence" or
> "interrupt moderation" :-)
>
> http://kb.pert.geant2.net/PERTKB/InterruptCoalescence
>
> > NDT guesses link speed by packet interarrival time from the NIC if
> > it delivers multiple packets per interrupt that timing is disrupted
> > (this can usually be disabled in the NIC driver although perhaps not
> > easily).
>
> Very interesting, I hadn't known that. But how does NDT measure
> *packet* interarrival times - doesn't it only do TCP (where the
> application only sees a byte stream)?

It uses pcap to get the raw packets from the interface (including
kernel timestamps) for at least some parts of the testing.
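(As an illustration only, not NDT's actual code: the inference described above can be sketched as a packet-pair calculation. Two back-to-back packets of size S bytes leaving a link of bandwidth B bits/s arrive about S*8/B seconds apart, so the measured gap implies a bandwidth. The function names and rate thresholds below are my own invention.)

```python
# Sketch of link-speed inference from packet inter-arrival times,
# in the spirit of what NDT does (this is NOT NDT's actual code).
# Two back-to-back packets of size S bytes sent through a link of
# bandwidth B bits/s are spaced about S*8/B seconds apart.

def estimate_link_bps(packet_size_bytes, interarrival_seconds):
    """Estimate the bottleneck bandwidth from one packet-pair gap."""
    return packet_size_bytes * 8 / interarrival_seconds

def classify_link(bps):
    """Map a raw estimate onto familiar link rates (thresholds invented)."""
    for name, rate in [("10 Mb/s Ethernet", 10e6),
                       ("100 Mb/s Ethernet", 100e6),
                       ("Gigabit Ethernet", 1e9),
                       ("OC-192 / 10 Gb/s", 10e9)]:
        if bps <= rate * 1.5:
            return name
    return "faster than OC-192"

# A 1500-byte packet arriving 120 microseconds after its predecessor
# implies roughly 100 Mb/s.
fast_ethernet = estimate_link_bps(1500, 120e-6)

# If the NIC coalesces interrupts and delivers both packets to the
# kernel at nearly the same instant, the measured gap collapses and
# the estimate explodes -- which is how a gig link can look like OC192.
coalesced = estimate_link_bps(1500, 1e-6)
```

Running the second case through the classifier reproduces the misdiagnosis mentioned earlier in the thread: a vanishing inter-arrival gap on a gig link classifies as OC-192.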

>
> > Before I turned it off on our gig link it used to claim I had an
> > OC192 (which was of course news to me). Throughput looks about right
> > for a well performing 100 meg link though.
>
> Because interrupt coalescence is quickly becoming prevalent (even my
> laptop has it), it would be useful to think about measurement methods
> that are "robust" to it.
>
> In general, I would favour it if everybody used kernel timestamps
> (e.g. SO_TIMESTAMP), and every adapter that performs interrupt
> coalescence would decorate incoming frames with hardware timestamps.
> That wouldn't require much (if any) new hardware on the adapters, just
> a little more logic in the driver to convert hardware timestamps into
> OS-level timestamps.

It is more difficult than this. The interrupt coalescence is built
into the ethernet chips, which have internal buffers and no access to the
kernel time. The answer is Endace DAG cards (www.endace.com), which keep an
on-board time source (syncable to GPS, via NTP if desired) that stamps the
packet in the card's internal buffer (and presumably builds the ethernet
interface out of discrete chips rather than one of the single-chip solutions).
They include an onboard CPU and a large packet cache (most chips have at most
a 64K cache; I believe a DAG has around 4 megs) and thus can capture correctly
in the face of bus contention as well. If you are doing disk I/O on the
capturing machine, you will likely lose packets on a conventional gig enet
card due to PCI bus contention, even at around 100 megabits per second (at
least that has been my experience with argus). The disk I/O ties up the bus
too long, and the card's buffer overruns and loses packets.
	They are the preferred choice for measurement of all kinds but are a
little pricey: around $8K US for a gig card and around $30K for OC192 when
last I asked (which was a while ago).
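(For what it's worth, the SO_TIMESTAMP mechanism Simon mentions does already exist on commodity hardware; a minimal sketch follows, assuming Linux and a 64-bit struct timeval layout. Note the kernel takes the stamp in its receive path, so any coalescing delay in the NIC is still included -- which is exactly the limitation discussed above.)

```python
# Minimal Linux sketch of SO_TIMESTAMP: the kernel attaches a struct
# timeval to each received datagram, delivered as ancillary data.
# Illustration only; assumes Linux and 64-bit longs in struct timeval.
import socket
import struct

# SO_TIMESTAMP/SCM_TIMESTAMP are 29 on Linux; fall back if the Python
# build does not expose the constants.
SO_TIMESTAMP = getattr(socket, "SO_TIMESTAMP", 29)
SCM_TIMESTAMP = getattr(socket, "SCM_TIMESTAMP", SO_TIMESTAMP)

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))
recv_sock.setsockopt(socket.SOL_SOCKET, SO_TIMESTAMP, 1)
recv_sock.settimeout(2.0)

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"ping", recv_sock.getsockname())

data, ancdata, flags, addr = recv_sock.recvmsg(64, socket.CMSG_SPACE(16))
for level, ctype, payload in ancdata:
    if level == socket.SOL_SOCKET and ctype == SCM_TIMESTAMP:
        # struct timeval: seconds and microseconds as native longs.
        sec, usec = struct.unpack("ll", payload[:16])
        print("kernel receive timestamp: %d.%06d" % (sec, usec))
```

This gets you the kernel's idea of arrival time without pcap, but it cannot recover the spacing the packets had on the wire once the NIC has batched them.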

>
> Still I have no idea on how to provide such timestamps to an
> application that only uses TCP...
> --
> Simon.

As noted, it uses pcap, although at higher speeds I've been told the
callback mechanism eventually becomes a problem (around 600 megabits, I think)
and you need to change to something more exotic and thus less portable. I
believe Endace has a custom API for the truly speed challenged.

Peter Van Epp / Operations and Technical Support
Simon Fraser University, Burnaby, B.C. Canada


