perfsonar-user - Re: [perfsonar-user] a shellshocked experience

Subject: perfSONAR User Q&A and Other Discussion

From: John-Paul Robinson <>
To: <>
Subject: Re: [perfsonar-user] a shellshocked experience
Date: Thu, 2 Oct 2014 10:16:31 -0500

Jason and others,

Thanks for the very helpful replies and links. I had missed the
-announce list but am on it now. I'm also happy that the yum cron job
will soon be on by default. The suggestion about multiple updates per day was
more about when the cron job runs rather than some manual checking. We
use puppet for much of the admin here and may incorporate that for
perfsonar in the future.
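
To make the timing point concrete, here is a minimal sketch (Python, standard library only) that computes how long a box stays unpatched between a fix being published and the next scheduled update run. The 12:00 release time stands in for "mid-day" and the six-hourly schedule is only an example; neither is a statement about how the toolkit's updater is or will be configured.

```python
from datetime import datetime, timedelta


def next_run(after, run_hours):
    """Return the first scheduled run strictly after `after`.

    `run_hours` lists the hours of the day (0-23) at which the update
    job fires, on the hour.
    """
    day = after.replace(hour=0, minute=0, second=0, microsecond=0)
    # Consider today's and tomorrow's slots; at least one of them is later.
    candidates = [
        day + timedelta(days=offset, hours=h)
        for offset in (0, 1)
        for h in sorted(run_hours)
    ]
    return min(t for t in candidates if t > after)


def exposure(released, run_hours):
    """Time between a fix being published and the next update run."""
    return next_run(released, run_hours) - released


# Assumption: "mid-day on Sept 26" is taken as 12:00 for illustration.
released = datetime(2014, 9, 26, 12, 0)
daily = exposure(released, [4])                   # one run a day at 04:00
six_hourly = exposure(released, [4, 10, 16, 22])  # hypothetical four runs a day
print("daily:", daily, "six-hourly:", six_hourly)
```

Under these assumptions, the single 04:00 run leaves the box exposed overnight, while the four-a-day schedule picks up the fix the same afternoon.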

Let me clarify that perfsonar is an invaluable resource to our effort to
assess networking and build a science dmz. I couldn't do without it.

Regarding instances in the orphanage, I too keep only a light overview
on my system. It works. I'm happy. I trust the platform and the
community building it. I don't expect perfection or that nothing will
fail. I accept that from time to time things will need fixing or maybe
even reinstalling. I'm ok with that.

I'd rather take a more cloud-like or scale-up perspective on the
platform itself. I want cattle not pets. In other words, I'd rather
flush it down the toilet and get another one when it fails than worry
about maintaining the one precious instance I have now. I don't have
much time to dedicate to care and feeding. One of the functions of
perfsonar is as a sensor. I'd like it if I could just throw a failed sensor
out and install a new one.

What I don't want is to lose my telemetry data. I have been running a
regular batch of tests for several months to gather a performance
profile from various points of interest. I look at those throughput
reports regularly. They are building a narrative for what we have and
what we need.

I would like it if there were a way to secure my history of the data
collected so that I can either move it forward when I reinstall or view
it elsewhere. I don't want a failed sensor to threaten the life of my
data. The rebuild recommendations I've seen so far don't appear to
protect my data. That's why it was worth the hours I spent verifying my
existing platform's integrity.

It would be very helpful to have a way to preserve test history data off
the platform so that I can look at it with the same interface elsewhere
(another box or a future box).
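
A minimal sketch of the idea, assuming the test history lives in a directory that can be copied while the measurement services are stopped. The directory arguments are placeholders supplied by the caller, not the toolkit's actual data locations; if the results are held in a database, a dump from that database would take the place of the directory copy.

```python
import tarfile
import time
from pathlib import Path


def archive_history(data_dir, dest_dir):
    """Write data_dir into a timestamped .tar.gz under dest_dir.

    Both paths are placeholders; this sketch does not know where the
    toolkit actually stores its measurement data.
    """
    data_dir = Path(data_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest_dir / f"{data_dir.name}-{stamp}.tar.gz"
    with tarfile.open(target, "w:gz") as tar:
        # arcname stores paths relative to the data directory's own name
        tar.add(data_dir, arcname=data_dir.name)
    return target


def restore_history(archive, restore_root):
    """Unpack an archive made by archive_history under restore_root."""
    restore_root = Path(restore_root)
    restore_root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(restore_root)
    return restore_root
```

Run on a schedule with dest_dir on another machine or mounted storage, this would let a failed sensor be thrown away without taking the history with it. Viewing the restored data through the same interface still depends on the new box running a compatible toolkit version.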

Again, I'm very happy with perfsonar. I've found it very reliable. Then
again, I may just be benefiting from others' pain, so thanks to those who
have suffered.

Keep up the good work and thanks for making network performance data
gathering so much easier than it was in the past.

~jpr

On 10/01/2014 06:17 PM, Jason Zurawski wrote:
> Hey John-Paul;
>
> To echo Brian, thanks for your thoughtful note. One quick point, that some
> may not be aware of if they have been on this list for 7 years, is that
> there is a low volume (lower than the user list at least…) announcement
> list:
>
> https://lists.internet2.edu/sympa/subscribe/perfsonar-announce
>
> I can also use this opportunity to summarize some of the actions we will take
> based on this, and some other internal discussions occurring over the prior
> days:
>
> - As Brian noted, a sensible default method for automatic updates is
> coming. This will not be a panacea for security or maintenance of course -
> and some of our development team have grave concerns about lulling anyone
> into a false sense of security and making things far, far worse the next
> time a piece of systems software outside of our control barfs. The bottom
> line is still that each site is responsible for care, feeding, and sensible
> sysadmin practices, and we feel that's a statement everyone agrees with.
> We do (and will continue to) assist where we can with automated software to
> observe and upgrade (IDSs, yum, etc.), and we will also augment any
> technical solution with a 'sysadmin 101' guide for those that need it.
> This is in progress, and will also be something we would encourage
> community input on over time.
>
> - As a P.S. to the previous bullet - perfSONAR is not an appliance, and
> 2014 was the year that really made everyone (community members,
> stakeholders, developers) reconsider what it should be. Recently someone
> noted that the word 'appliance' strikes images of a box with no seams, that
> may not have serviceable parts inside. Linux, unless it has been
> *drastically* altered, doesn't fit this definition. As Heartbleed,
> Shellshock, and other CVEs have shown, no matter how much we work to make
> something 'easy', it is not invincible, and admitting a failure is part of
> the road to recovery. We want to change the perception of the orphaned pS
> box, as it was the orphaned box that was pegged and owned as an easy target
> last Friday morning all around the world.
>
> - We intend to document what a 'normal' list of running toolkit processes
> looks like, to prevent people from having that feeling of panic when they
> see a strange perl script running (correctly, or maliciously). This is a
> very obvious thing that many of us wouldn't have thought about.
>
> - Some products (e.g. Cacti, JOWAMP) will be removed to reduce (not
> eliminate) the risk footprint, and those that choose to use them will have
> to make a conscious choice to download and install the tools.
>
> We do appreciate the community's understanding, feedback, and patience this
> year. Please continue to offer feedback as you all see fit, either to the
> user list or to the developers directly.
>
>
> Thanks;
>
> -jason
>
> On Oct 1, 2014, at 5:14 PM, John-Paul Robinson
> <>
> wrote:
>
>> To other shellshocked perfsonar users:
>>
>> Our perfsonar node did not have automatic yum updates enabled and was
>> impacted by a shellshock-related exploit on Sept 26. This was after
>> both bash updates had been announced on the perfsonar-user lists, so we
>> may have survived had automatic updates been enabled by default.
>>
>> • Lesson learned: run automatic updates.
>> • Recommendation: It might benefit users to have it default to on in
>> the perfsonar distribution. Also it would be good if updates were checked
>> for more than once a day. In our case we would have missed the update
>> mid-day on Sept 26 and may still have gotten exploited before the next run
>> at 4:00 am on Sept 27. Additionally a perfsonar-announce list might be
>> useful for hearing stuff even when you have -user discussions turned off.
>>
>> After receiving a local exploit report I went to check on the machine and
>> immediately noticed Apache had restarted. Alarmingly, a root-owned
>> process called fakewww also started at the same start time and oddly so
>> did one named web100srv. Both of these processes had open ports and logs
>> open. Yikes, they got root! Killed them. But then they came back after
>> I restarted httpd, even after `rpm -V httpd` showed no corruption. Oh no!
>> They've really gotten a hold of the system.
>> • Lesson learned: not all unfamiliar processes are bad. I later
>> figured out that these are part of the ndt-server rpm and normal parts of
>> perfsonar.
>> • Recommendation: rename fakewww to something meaningful and less
>> scary to the uninitiated. ndtwwwhelper might be just as good.
>>
>> Because of the potential for root exploits I ran rpm verifies of core
>> commands (e.g. rpm -V procps); some were good, some reported prelink
>> inconsistencies. This caused some concern at first, but as I narrowed
>> down the exploit it became clear the problems were only due to a prelink
>> bug.
>>
>> https://access.redhat.com/solutions/25215
>> https://bugzilla.redhat.com/show_bug.cgi?id=204448
>>
>> • Lesson learned: other bugs can make things seem worse than they are.
>> • Recommendation: look up unfamiliar errors before you panic.
>>
>> Looking further into the state of the system I noticed a '/usr/sbin/sshd
>> -i' process running as apache and a time-wise unrelated httpd process.
>> lsof showed these were both perl scripts running out of /var/tmp/ with
>> established tcp connections off site. Very suspicious, so I killed them.
>>
>> • Lesson learned: some processes are really bad. The abrtd logged
>> the event of the first entry into the system via apache and showed the
>> command vector was bash. This is a very helpful log to determine
>> important time lines.
>> • Recommendation: keep your system up to date.
>>
>> In the end, I traced the exploit down to the two suspicious perl processes
>> (/var/tmp/x). They were executing an ircbot as apache. There was no root
>> access to the system and simply clearing out the installed bots from
>> /var/tmp was a sufficient remedy. There was an attempted install of code
>> to exploit CVE-2013-2094 but thankfully that's a 3.8 kernel bug and
>> perfsonar is still on 2.6.
>>
>> I hope this experience can be useful to others and that the
>> recommendations can be incorporated into future releases as warranted.
>>
>> ~jpr
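
The lsof check described in the original report (processes running out of /var/tmp/) can be approximated on Linux by reading the symlinks under /proc. The sketch below is an illustration, not a toolkit feature: the list of suspect directories is an assumption, and it needs root to see other users' processes. Plenty of legitimate programs keep temporary files open in these directories, so hits are leads to look up, not verdicts.

```python
import os

# Directories that normally hold no long-running programs. This list is an
# assumption for illustration; adjust it to local policy.
SUSPECT_DIRS = ("/tmp", "/var/tmp", "/dev/shm")


def in_suspect_dir(path, suspect_dirs=SUSPECT_DIRS):
    """True if path is one of the suspect directories or lies inside one."""
    path = os.path.normpath(path)
    return any(path == d or path.startswith(d + "/") for d in suspect_dirs)


def scan_processes(proc_root="/proc"):
    """Return (pid, target) pairs for processes whose working directory,
    executable, or open files point into a suspect directory."""
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        base = os.path.join(proc_root, entry)
        links = [os.path.join(base, "cwd"), os.path.join(base, "exe")]
        fd_dir = os.path.join(base, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited or permission denied
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # link vanished or is not readable
            if in_suspect_dir(target):
                hits.append((int(entry), target))
    return hits


if __name__ == "__main__":
    for pid, target in scan_processes():
        print(pid, target)
```

Because the check looks at where a process actually lives rather than the name it reports, a script that calls itself '/usr/sbin/sshd -i' still shows up if its working directory or script file is under /var/tmp.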
