Hi Jason,
I have copy-and-pasted parts of the mpstat output below. It was captured while the bandwidth measurement was degraded to around 600Mbps, i.e., with the several stateful iptables rules in place. You can see that CPU #0 hits 0% idle and stays
there for several seconds (around 7-8 seconds).
When there are no iptables rules, CPU #0 never hits 0% idle; it gets close to 0%, but it does not stay there nearly as long.
-Joon
03:00:50 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:00:51 PM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:00:51 PM 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:00:51 PM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:00:51 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:00:52 PM all 0.50 0.00 14.43 0.00 0.00 6.47 0.00 0.00 0.00 78.61
03:00:52 PM 0 0.00 0.00 29.00 0.00 0.00 13.00 0.00 0.00 0.00 58.00
03:00:52 PM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:00:52 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:00:53 PM all 0.50 0.00 31.16 0.00 0.00 19.10 0.00 0.00 0.00 49.25
03:00:53 PM 0 1.01 0.00 61.62 0.00 0.00 37.37 0.00 0.00 0.00 0.00
03:00:53 PM 1 1.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 98.99
03:00:53 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:00:54 PM all 0.00 0.00 31.50 0.00 0.00 18.50 0.00 0.00 0.00 50.00
03:00:54 PM 0 0.00 0.00 62.38 0.00 0.00 37.62 0.00 0.00 0.00 0.00
03:00:54 PM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:00:54 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:00:55 PM all 0.00 0.00 35.18 0.00 0.00 15.08 0.00 0.00 0.00 49.75
03:00:55 PM 0 0.00 0.00 70.00 0.00 0.00 30.00 0.00 0.00 0.00 0.00
03:00:55 PM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:00:55 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:00:56 PM all 0.50 0.00 34.83 0.50 0.00 16.42 0.00 0.00 0.00 47.76
03:00:56 PM 0 0.00 0.00 67.68 0.00 0.00 32.32 0.00 0.00 0.00 0.00
03:00:56 PM 1 0.00 0.00 4.00 1.00 0.00 0.00 0.00 0.00 0.00 95.00
…
…
03:01:00 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:01:01 PM all 0.00 0.00 35.00 0.00 0.00 15.00 0.00 0.00 0.00 50.00
03:01:01 PM 0 0.00 0.00 70.00 0.00 0.00 30.00 0.00 0.00 0.00 0.00
03:01:01 PM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:01:01 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:01:02 PM all 0.50 0.00 20.00 0.50 0.00 11.50 0.00 0.00 0.00 67.50
03:01:02 PM 0 0.00 0.00 36.36 0.00 0.00 23.23 0.00 0.00 0.00 40.40
03:01:02 PM 1 0.00 0.00 3.96 0.99 0.00 0.00 0.00 0.00 0.00 95.05
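A pattern like this, where all of the %sys/%soft load lands on CPU #0, is what you would expect from a single-queue NIC whose interrupt is serviced by one core. A rough way to check, assuming the interface is named eth0 (adjust the name for your box):

```shell
# Show which IRQs exist and in which CPU column the counts climb.
head -20 /proc/interrupts

# For each IRQ belonging to the NIC (eth0 assumed), show the CPU mask
# allowed to service it; a mask of "1" means CPU 0 only.
for irq in $(awk -F: '/eth0/ {gsub(/ /,"",$1); print $1}' /proc/interrupts); do
  echo "IRQ $irq affinity mask: $(cat /proc/irq/$irq/smp_affinity)"
done

# Per-CPU NET_RX softirq counters; counts climbing only in CPU0's
# column match the %soft numbers in the mpstat output above.
grep NET_RX /proc/softirqs
```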
On Jan 4, 2016, at 12:56 PM, Jason Zurawski <> wrote:
Hey Joon;
Another suggestion would be to look at mpstat (e.g. mpstat -P ALL 1) in the various test scenarios to see if things are getting pegged. If Mark's synopsis is correct, the security stuff is probably the bottleneck.
Thanks;
-jason
Mark Feit wrote:
I can’t confirm it for the LIVA because I don’t have one, but iptables is well-known to be a drag in high-traffic situations. You won’t notice this as much on bigger machines because they tend to have enough horsepower that the extra processing
time doesn’t matter. Your change effectively disables all of it, so the increase in throughput makes perfect sense. The tradeoff is that the machine is no longer protected from the outside.
Iptables processes the rules in each chain in the order they appear, and the perfSONAR chain isn't arranged to get time- and rate-sensitive traffic processed as quickly as possible.
—Mark
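One way to see where packets spend their time is to list the chain with per-rule counters and line numbers, and to check how many flows conntrack is tracking. This needs root, and the chain names on a perfSONAR box will differ from this sketch:

```shell
# Rule order with per-rule packet/byte counters; rules near the top
# with large counts are the hot path.
iptables -L INPUT -v -n --line-numbers

# How many flows the stateful rules are currently tracking, vs. the cap
# (these files exist only while nf_conntrack is loaded).
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
```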
Hyojoon Kim wrote:
Hi all,
After some digging, I am pretty convinced that the LIVA X box's failure to reach full bandwidth is due to:
- the "perfsonar-toolkit-security" package installation
"lsmod" shows that nf_conntrack-related kernel modules get loaded by the perfsonar security package installation, along with several iptables rules. After this happens, the bwctl measurement never goes over 700Mbps when the test is
initiated from the LIVA X box. Flushing the iptables rules and disabling the nf_conntrack-related modules fixes the issue, restoring the bwctl measurement to 940Mbps.
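For a quick A/B test without uninstalling anything, the flush can be done by hand as root. The exact conntrack module names vary by kernel, so check "lsmod | grep conntrack" first; the names below are examples:

```shell
# Flush all rules, delete user-defined chains, and open the default
# policies so traffic is not silently dropped afterwards.
iptables -F
iptables -X
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT

# Unload conntrack-related modules, dependents first. Example names;
# check "lsmod | grep conntrack" for what your kernel actually loaded.
modprobe -r iptable_nat nf_conntrack_ipv4 nf_conntrack
```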
Can someone confirm this?
Of course, you should be able to remove the security package after testing. You can do:
# apt-get remove perfsonar-toolkit-security
# apt-get autoremove
# cd /etc/iptables/
# rm rules.v4 rules.v6
# reboot
If this is confirmed, maybe it makes sense to put a note at this link, in the "Installation Instructions" section.
Or at this link, with a note saying something like "Don't install the security package if you have a low-cost node like the LIVA X".
Happy holidays, everyone.
Thanks,
Joon
On Dec 30, 2015, at 10:20 AM, Hyojoon Kim <> wrote:
Hi Jason,
Sorry for the late reply; was a bit out for the holidays, and I wanted to actually replicate the problem.
- tcpdumps from when the LIVA box only achieves around 513Mbps are here. Note that if the test is initiated *from* the Dell server, it does achieve 935Mbps.
- TCP settings are probably not exactly uniform. Let me know which settings you want to see, and I will provide them from "sysctl -a".
- The Dell server has CentOS.
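If it helps, these are probably the settings worth diffing first. These are the standard Linux keys "sysctl -a" would show; reading them via /proc avoids needing the sysctl binary on both boxes:

```shell
# Run on both hosts and diff the output.
for k in net.ipv4.tcp_congestion_control net.core.rmem_max \
         net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem \
         net.ipv4.tcp_window_scaling net.ipv4.tcp_timestamps; do
  printf '%s = %s\n' "$k" "$(cat /proc/sys/$(echo "$k" | tr . /))"
done
```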
Thanks,
Joon
On Dec 24, 2015, at 9:35 AM, Jason Zurawski <> wrote:
Hey Joon;
Interesting stuff - it's a little late to weigh in, but some thoughts as I read this:
- You noted that when the Liva was the receiver (and the Dell was sending), things were 'ok'. In the opposite direction (Dell receiving, Liva sending), things degraded. Did you happen to capture any packet traces during this? I would be curious to see if there was something
funky going on that would force a slowdown (pause frames, window manipulation, etc.) from either one of the OSs or the NICs.
- Were the TCP settings uniform on both ends? Same congestion control, same socket sizes, etc?
- Was the Dell a CentOS system, or also an Ubuntu?
Glad it's 'working', not glad that things are this touchy :)
-jason
Daniel Doyle wrote:
Joon,
Excellent! And thanks for passing along the information. I'll have to dig a bit on this and figure out what the offending bit was, but in the meantime I will update the page to reflect these findings in case others are poking around and run into
the same issue.
-Dan
On Dec 22, 2015, at 6:24 PM, Hyojoon Kim <> wrote:
Just to give an update on this issue:
Now I get 940Mbps with bwctl :-)
* Side note:
I don’t know what makes the other LIVA box with Ubuntu-14.04-desktop-amd64 unable to achieve 940Mbps. I might dig into it when I have the time. Just FYI, things I did differently on the under-achieving box are:
- It’s a different OS (Ubuntu 14.04)
- I installed two more packages ("perfsonar-toolkit-security" and "perfsonar-toolkit-sysctl") in addition to "perfsonar-testpoint"
- I did some Linux tuning after it was not able to achieve 940 Mbps.
Thanks,
Joon
On Dec 22, 2015, at 2:00 PM, Hyojoon Kim <> wrote:
Thanks for all the suggestions and comments!
To answer some:
* I did remove the "ondemand" option, as I saw that note somewhere too before I ran the test. I'm sure it did something, but I still get around 550Mbps with bwctl.
* I did the Linux Tuning *after* I got 550Mbps at my first trial, hoping it would fix it. No luck.
* Interestingly, I get around 940Mbps when I initiate the bwctl test *from* a Dell node *to* the LIVA X box. But the other direction still gives me around 550Mbps.
* One thing I noticed is that:
- When I initiate the test from the Dell node, it opens an ephemeral port (e.g., 45250) on the Dell node:
local LIVAXBOX port 5593 connected with DELL port 45240
- From the LIVA X box to the Dell server, it is:
local DELL port 5220 connected with LIVAXBOX port 5220
Thanks,
Joon
On Dec 22, 2015, at 1:43 PM, Uhl, George D. (GSFC-423.0)[SGT INC] <> wrote:
When I was researching the Liva X capabilities I ran across an email from Larry Blunk of Merit that he posted on the list this past October.
One thing to note is that Ubuntu enables the "ondemand" init script
by default which puts the CPU in "powersave" mode. In testing the
LIVA, I found that this seems to limit throughput a bit in performance
tests. I get around 900Mbps TCP throughput vs. 940Mbps in
"performance" mode. Also saw some packet loss when doing UDP tests
with 1500 byte datagrams at 1 Gbps in powersave mode. You can disable
the script with the following, which will leave the CPU in performance
mode:
update-rc.d -f ondemand remove
-Larry Blunk
Merit
Disabling the "ondemand" script might work for you. I did it and I was able to achieve 940 Mbps.
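To confirm the change took effect, the active governor can be read from sysfs. The path is standard on Linux, though the files are absent on some virtual machines, and the available governors depend on the cpufreq driver:

```shell
# Should print "performance" for each CPU after disabling ondemand;
# "ondemand" or "powersave" before.
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Governors this kernel/driver supports.
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
```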
-George
From: <> on behalf of Brian Tierney <>
Reply-To: "" <>
Date: Tuesday, December 22, 2015 at 1:12 PM
To: Daniel Doyle <>
Cc: Hyojoon Kim <>, "" <>
Subject: Re: [perfsonar-user] Bandwidth measurement with LIVA X 2GB/32GB eMMC
On Tue, Dec 22, 2015 at 12:00 PM, Daniel Doyle <> wrote:
Hi Joon,
Sorry to hear about that. A couple of questions / debugging ideas:
- Are you seeing this in both directions?
- Have you tried using the same port with a different device?
- Whether or not you're using jumbo frames, has the device been configured accordingly?
fwiw, I did not apply any tunings out of the box on a LIVA X and got >900Mbps on a local network. It's possible some of those tunings might not be appropriate for a machine like that, but I haven't dug into it.
I agree that might be the issue. The tunings on fasterdata might not be right for a small node.
See if the original default debian settings are better.
-Dan
On Dec 22, 2015, at 11:58 AM, Hyojoon Kim <> wrote:
Hi all,
We decided to play with a LIVA X 2GB/32GB eMMC node with perfSONAR 3.5. However, I am getting around 535Mbps instead of 940+Mbps when I initiate a bwctl test to another local perfSONAR node we have. Between two Dell R610 servers with perfSONAR, we do get over
950Mbps, so it’s likely not a network problem.
I have generally followed the instructions here (http://docs.perfsonar.net/install_debian.html), and installed three packages:
apt-get install perfsonar-testpoint
apt-get install perfsonar-toolkit-security
apt-get install perfsonar-toolkit-sysctl
I also did the Linux Tuning, which is indicated here:
http://fasterdata.es.net/host-tuning/linux/
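For context, the fasterdata host tuning boils down to a sysctl fragment along these lines. Treat the numbers as illustrative only, not what the page actually prescribes; they are aimed at well-provisioned hosts, and oversized buffers may themselves be a poor fit for a 2GB box:

```
# /etc/sysctl.conf fragment, illustrative values only
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 33554432
net.ipv4.tcp_wmem = 4096 65536 33554432
net.ipv4.tcp_congestion_control = htcp
net.core.netdev_max_backlog = 30000
```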
Does anyone know what I am missing? I am running the Ubuntu 14.04.3 Desktop OS on this box. Would switching to the well-tested Ubuntu 12.04 be better?
===
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty
$ uname -a
Linux perfbox-livaxtest-01 3.19.0-39-generic #44~14.04.1-Ubuntu SMP Wed Dec 2 10:00:35 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
===
Thanks,
Joon