perfsonar-user - Re: [perfsonar-user] how to identify the currently-running test?

Subject: perfSONAR User Q&A and Other Discussion

  • From: Matthew J Zekauskas <>
  • To: Pete Siemsen <>
  • Subject: Re: [perfsonar-user] how to identify the currently-running test?
  • Date: Wed, 12 Oct 2016 15:02:13 -0400

I think another possibility is a throughput test where the source is 10G and this host is 1G, and the interface is simply being overrun.  The same applies to any case where the source has a faster interface than the destination and none of the intermediate points restrict the traffic in any way.
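
To put rough numbers on the rate-mismatch case, here is a minimal sketch of how quickly an output queue fills when traffic arrives faster than the egress interface can drain it. The 1 MB buffer size is purely an assumption for illustration (real per-interface buffers vary widely by platform), and the model assumes a sustained line-rate burst with no TCP backoff, so treat it as a worst-case estimate rather than a prediction.

```python
def seconds_until_drops(ingress_bps, egress_bps, buffer_bytes):
    """Time for an initially empty output queue to fill, in seconds.

    Returns None when the egress link can keep up, i.e. the queue never fills.
    """
    if ingress_bps <= egress_bps:
        return None
    # The queue grows at the difference between arrival and drain rates.
    fill_rate_bps = ingress_bps - egress_bps
    return buffer_bytes * 8 / fill_rate_bps


# 10G source, 1G host-facing interface, hypothetical 1 MB of output buffer.
fill_time = seconds_until_drops(10e9, 1e9, 1_000_000)
print(f"queue fills in about {fill_time * 1000:.2f} ms")
```

With these assumed numbers the queue fills in under a millisecond, so even a short burst at the source's line rate would be enough to produce output drops.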

Or perhaps this is a fiber connection and one end has some dirt or another imperfection, so you only notice drops when load is high.  Although in that case I'm not sure they would show up as drops; they may just be counted as errors.

I'm not sure how to answer the specific question, though (precisely which test is running at a given point).

--Matt

On 10/12/16 2:44 PM, Eli Dart wrote:
Hi Pete,

Is flow control enabled between the host and the router?  One possible cause for this circumstance is the host being unable to sink traffic, and sending pause frames to the router (which it should), and the router then filling its queue and dropping packets.

Another possibility would be if you were running owamp and throughput tests on the same host. If the throughput test and the owamp test enter the router via different interfaces (or a common faster interface), then the traffic for both tests might not fit into the interface connected to the host - that could also cause output drops.

Do you feel that either of these circumstances is a good fit for the current config?

Thanks,

Eli

On Wednesday, October 12, 2016, Pete Siemsen <> wrote:
We have a perfSONAR server that is directly connected to a router. The router reports that the "Output drops" counter is going up on the interface that connects to the perfSONAR server. We graph output drops on all interfaces, so I have a nice graph that shows that the drops occur in regular spurts. So, some test that runs regularly causes the drops. Like, every ~20 minutes, there's a spurt of 2000 output drops.

There are 18 tests defined on the server. I want to identify the one that is causing the drops.

I can monitor the interface on the router, and see the "output drops" counter bump up once in a while. I could wait for a spurt, then identify the test that is running "right now". To identify the test running right now, I could do a "top" command and see the "iperf3" or "java" process that is consuming the most CPU. I also see several bwctl processes, each of which shows the remote site name/IP.

This is almost enough, but I feel like I don't have my mind right :-) What's the clueful way to approach this problem?
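
One way to make the "wait for a spurt, then see what is running" approach less manual is to record when each test runs and then check which test overlaps every spurt. The sketch below assumes you already have two lists: spurt timestamps (for example read off the drops graph) and test runs as (name, start, end) tuples, however you collect them (for example by periodically sampling the process list). The function names, the test names, and the 30-second slack are all hypothetical choices for illustration, not part of perfSONAR or bwctl.

```python
from datetime import datetime, timedelta


def spurt_counts(spurts, runs, slack=timedelta(seconds=30)):
    """Count, per test name, how many spurts fell inside one of its runs."""
    counts = {name: 0 for name, _start, _end in runs}
    for spurt in spurts:
        # A set, so a test is counted at most once per spurt.
        hit = {
            name
            for name, start, end in runs
            if start - slack <= spurt <= end + slack
        }
        for name in hit:
            counts[name] += 1
    return counts


def likely_culprits(spurts, runs, slack=timedelta(seconds=30)):
    """Names of tests that were running during every observed spurt."""
    if not spurts:
        return []
    counts = spurt_counts(spurts, runs, slack)
    return sorted(name for name, n in counts.items() if n == len(spurts))


# Made-up data: spurts every 20 minutes, two hypothetical tests.
t0 = datetime(2016, 10, 12, 14, 0, 0)
minute, second = timedelta(minutes=1), timedelta(seconds=1)
spurts = [t0, t0 + 20 * minute, t0 + 40 * minute]
runs = [
    ("throughput-siteA", t0 - 10 * second, t0 + 20 * second),
    ("throughput-siteA", t0 + 20 * minute - 10 * second, t0 + 20 * minute + 20 * second),
    ("throughput-siteA", t0 + 40 * minute - 10 * second, t0 + 40 * minute + 20 * second),
    ("throughput-siteB", t0 + 5 * minute, t0 + 5 * minute + 20 * second),
    ("throughput-siteB", t0 + 40 * minute - 5 * second, t0 + 40 * minute + 15 * second),
]
print(likely_culprits(spurts, runs))
```

A test that overlaps only some spurts (like the second one in the made-up data) is filtered out, which helps when several tests happen to be running at the moment you look. The slack allows for clock skew between the router's counters and the host's timestamps.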

-- Pete



--
Eli Dart, Network Engineer                          NOC: (510) 486-7600
ESnet Science Engagement Group                           (800) 333-7638
Lawrence Berkeley National Laboratory 




