
perfsonar-user - Re: [perfsonar-user] a shellshocked experience

Subject: perfSONAR User Q&A and Other Discussion


  • From: Stefan Piperov <>
  • To: "" <>
  • Subject: Re: [perfsonar-user] a shellshocked experience
  • Date: Thu, 2 Oct 2014 09:49:06 -0400


Jason, indeed Brown is running central perfSONAR servers. We installed an additional, dedicated one, on our network segment, because we wanted the ability to test the throughput down to the very last point before it reaches our Tier3 machines.

I guess the question that I am asking - and probably many others are too - is why perfSONAR is so much more vulnerable in its default installation than our other servers, given that they are all based on basically the same operating system. Shouldn't the defaults - like the httpd ones you currently describe on the project homepage - be tightened up to the max, with local admins relaxing them as needed?
Because most of us here - the large majority of these 700 users that you quote - approach perfSONAR as users, and not as developers of any kind.
We know how to administer systems, and we can tell a problematic one when we see one. I'll put it this way: in its current state and configuration, my perfSONAR server requires a disproportionate amount of effort to keep secure.

Perhaps the project should aim at releasing a (probably scaled-down?) production version, which can be run in the typical low-maintenance mode we are used to seeing from other SL/OSG-based packages, and separate the new and fancy features into a development version?

Regards,
Stefan.


On Thu, 2 Oct 2014, Jason Zurawski wrote:

Hi Stefan;

The argument cannot be boiled down to 'will it cause people to flee if it is
hard'; it comes down to putting out the best quality product that we can.
'Product' here is a combination of the software and the BCP that goes with
it, and when we are wrong we need to own up to it - the appliance mentality
is wrong. We cannot guarantee the 'appliance' use case - it does a
disservice to our product and our customers to pretend we can. We are not a
24/7/365 effort. We have about 20 core people, devoting anywhere from 1/64th
to all of their time, and we are split between many timezones.
Appliances require almost centralized control to deliver on their promises;
it's just not something a pokey open source project can do without significant
people and funding.

We can continue to be responsive to user suggestions, complaints, questions,
and praise - we are a long way from the 1st release of this tool and there
has been significant upward progress. As you note, it is a tool like a
hammer: you use it to solve a task. There are many forms of hammers - let's
go back to imagining the manual version, not the hydraulically operated kind.

One of my previous notes made reference to 'sysadmin 101'. What follows
is not meant to be an elitist statement: not everyone can be a sysadmin. The
world needs scientists, sysadmins, and people to set up pins in a bowling
alley. If maintenance of the machine is too hard for one person to manage -
finding someone else who can assist is the best course of action. Someone
who knows how to be a sysadmin can set up automated monitoring and
maintenance routines - this is their job after all. They don't log on to the
machine 2 times a day to apply patches, that would be ludicrous. They do
watch security lists, react when needed, and trust that automated methods are
doing what they need to be doing. There needs to be a better range of care
between 'never patch since it's an appliance' and 'patch hourly', and it is
possible to strike a balance.

The proper answer, which I am reading into your mail below, is that
you need to sit down and talk with your network security and systems
administration people about reaching a better way to handle the care and
feeding of this machine.

- Perhaps the answer is 'you shouldn't use it', and maybe that is best. If
we look at other servers on a campus (DNS, Mail, OSG Compute nodes) they all
have an administrator who needs to grease the gears from time to time. If
the perfSONAR node is treated in a different manner because the word
'appliance' was associated with it, that needs to change toward a model where
it's a server like the others. The number of users of perfSONAR is not as
important to us as the value people see in using it along with the ability
they possess to maintain it.

- Perhaps the answer will be 'this machine increases the value of CMS science
- so let us help you maintain it'. Cyberinfrastructure matters to everyone,
not just one overworked physicist. If there is not campus/laboratory buy-in,
things are going to fail. It may be the case your local support staff (who
really are helpful people - science and networking do not need to have an
adversarial relationship) can assist. Maybe it means integrating the machine
into a CFEngine/Puppet/etc. system. Maybe it means moving it out of your
facility to the border of campus. Maybe it means reducing the number of
distributed machines from many to 1 - there are lots of answers. I don't
have these, your local support staff will.

The networking staff at Brown U are very reasonable, and I have worked with
them (and you) on several issues over the years. If you would like someone
from the perfSONAR project to help you make the case for this tool, we would
be happy to do so (and that goes for anyone else that is reading this - all
700 of you). It is true that we make a crappy 'appliance'; we do, however,
want to try to make a useful tool that doesn't ruin someone's Friday.

I hope this helps you and others in a similar situation;

Thanks;

-jason

On Oct 2, 2014, at 8:19 AM, Stefan Piperov
<>
wrote:

On Wed, 1 Oct 2014, Jason Zurawski wrote:

- As a P.S. to the previous bullet - perfSONAR is not an appliance...

Jason, are you not afraid that a statement like this will push away many
users? In our collaboration (CMS) perfSONAR was recommended as a _tool_ for
diagnosing network problems. A (good) tool normally does not require
maintenance or babysitting.

If I am to spend any significant amount of time (like patching the system
twice per day, as suggested in one of the responses), I'd rather not use
perfSONAR at all.

My server has already been hacked twice, which puts me in a really bad
situation in relation to the network security people.
Plus, the machine was found 'frozen' a couple more times during this last
year that I have used perfSONAR.
Quite disappointing.

Regards,
Stefan.



