shibboleth-dev - RE: shibd memory usage
  • From: "Scott Cantor" <>
  • To: <>
  • Subject: RE: shibd memory usage
  • Date: Mon, 8 Aug 2005 20:16:02 -0400
  • Organization: The Ohio State University

A fairly detailed response...

> My slow end-to-end login response times were due to SP factors.
>
> First I was running a CGI, which took a surprising amount
> of time to get started on a loaded system.

I've done POST->SP->Attribute Query->Resource testing replaying a single
assertion (with checkReplay off), and I can sustain 20-30 threads running
concurrently with around 1-2 second response times, which isn't much worse
than any testing I've done before. But there are some things I added that
probably need some tuning.

> Second though, the shibd daemon grows very large under a load (30-40
> simultaneous logins). Possibly due to memory caching of sessions?

I would think that's most of it. Any memory leaks left now should be pretty
minor.

> At start it is about 7MB.
> After 900 session creations it is 65MB
> After 1800 it is 110MB
> After 2700 it is 140MB

That works out to something like 40-50k a session, which sounds large, but
is about what I'd expect at this point. The general approach is definitely
to trade memory for speed. Caching the XML is faster and saves some
per-request cycles, but it's not space-efficient.
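
For example, taking the numbers above at face value: (140 MB - 7 MB) / 2700
sessions is roughly 49 KB per session, and the step from 900 to 1800 sessions
alone works out to (110 MB - 65 MB) / 900, or roughly 51 KB per session.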

Each session maps to a CCacheEntry object, which has:

string m_id;
string m_application_id;
string m_provider_id;
string m_clientAddress;
time_t m_sessionCreated;
time_t m_responseCreated;
mutable time_t m_lastAccess;
time_t m_lastRetry;

ShibProfile m_profile;
SAMLAuthenticationStatement* m_auth_statement;
SAMLResponse* m_response_pre;   // original attribute response, kept intact
SAMLResponse* m_response_post;  // filtered clone handed to the application
InternalCCache *m_cache;

log4cpp::Category* log;
Mutex* m_lock;

That looks like a lot, but the bulk of it is in those SAML objects near the
end. The part that's worse now (and that I think needs tuning) is that the
attribute Response is cloned so that it can be filtered while the original is
left intact in case any of the assertions were signed. That doubles the
attribute caching cost in this version.

Unsigned responses can easily run to several kilobytes or more depending on
the attributes, so it adds up quickly.
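
To make the trade-off concrete, here's a minimal sketch of that
clone-then-filter pattern; the types and names are illustrative stand-ins,
not the actual SP classes:

#include <string>
#include <vector>

// Hypothetical stand-in for a parsed attribute response.
struct Response {
    std::vector<std::string> attributes;  // stand-in for the attribute assertions
    bool signatureValid;                  // only meaningful while the XML is untouched
};

// Both copies live for the lifetime of the cached session.
struct CachedResponses {
    Response original;   // kept verbatim so signed assertions still verify
    Response filtered;   // policy-filtered copy handed to the application
};

CachedResponses cacheResponse(const Response& resp,
                              bool (*policyAccepts)(const std::string&)) {
    CachedResponses c;
    c.original = resp;                     // first copy, never modified
    c.filtered = resp;                     // second copy, filtered in place
    std::vector<std::string> kept;
    for (size_t i = 0; i < c.filtered.attributes.size(); ++i)
        if (policyAccepts(c.filtered.attributes[i]))
            kept.push_back(c.filtered.attributes[i]);
    c.filtered.attributes = kept;
    c.filtered.signatureValid = false;     // filtering breaks any signature on the copy
    return c;                              // roughly 2x the memory of a single response
}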

Long story short, your numbers don't sound out of hand to me. The previous
version was somewhat better, but not dramatically so. However, I'm not saying
the current behavior is great or even acceptable; it needs some work. This
version was a feature/redesign release, and I need to tune it some more.

> I tried this first with the default memory session lifetimes,
> from the 1.3 install. Also tried with short sessions (15 sec)
> and a 10sec cleanup interval. After 480 logins it
> was 67MB.

Of course, you wouldn't expect it to return to the original size; heap
allocators don't work like that, and Xerces does a lot of block reuse, so it
doesn't always return blocks. However, I'd be concerned (of course) if it
kept growing without bound even while it's routinely purging old data.

When I test, I usually run until the process takes up 800MB or more of swap
and then let it purge. Then I do another test run and make sure it reuses a
lot of the allocated RAM that it hasn't returned to the OS. In the parser, I
try to free as much as I can, even things that would technically get reused,
just to return memory, but I don't know that it helps much.

I'm sure using a custom allocator in the parser would help, but that's a lot
of work, and it would be slower if the goal was to return more memory to the
OS.
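
For what it's worth, Xerces-C does expose a pluggable MemoryManager that can
be handed to the parser at initialization, so a pooling allocator is at least
possible. A minimal sketch follows; the exact virtual signatures and
initialization arguments vary across Xerces versions, and the "pool" here is
just a counting pass-through to malloc:

#include <cstdlib>
#include <xercesc/framework/MemoryManager.hpp>

// Pass-through manager that only counts allocations; a real pool would carve
// requests out of larger arenas and recycle them instead of calling malloc.
class CountingMemoryManager : public xercesc::MemoryManager {
public:
    CountingMemoryManager() : m_allocs(0), m_frees(0) {}
    void* allocate(size_t size) {          // XMLSize_t in newer Xerces releases
        ++m_allocs;
        return std::malloc(size);          // the stock manager throws on failure
    }
    void deallocate(void* p) {
        if (p) { ++m_frees; std::free(p); }
    }
    unsigned long m_allocs, m_frees;
};

// Usage (roughly): create one instance and pass it as the memory manager
// argument to xercesc::XMLPlatformUtils::Initialize() at startup, so the
// parser's internal allocations route through it.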

> Does a loaded SP system have to use the mysql cache to avoid
> this memory usage? Or is there a more efficient setting
> for the memory cache?

It can only get more efficient by removing things it's storing (or by
avoiding copies where possible, or when told to). I can work on that a bit; I
suspect it will require creating multiple caches, which increases the
baseline footprint but optimizes out some of what's stored in each session.

The MySQL cache lets you stage things into and out of RAM based on how it's
tuned, so you can reduce the memory footprint a lot if you flush the memory
cache frequently and rely on reloading things from disk. The cache can now
store attributes, which avoids a re-query when something gets read back in
from disk.

However, I can't sit here and say I think it's a great piece of work. MySQL
apparently does not handle session caches very well. The tables corrupt
themselves for no reason (do a Google search for "MySQL session cache
corrupt" if you want to laugh). I actually have code in there now that tries
to detect that and does a real-time REPAIR call, which in my testing seems to
work, but I don't know how well it would behave on a heavily loaded server.
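
As a rough illustration of that detect-and-repair idea using the MySQL C API
(the function name, error codes, and table name here are illustrative, not
the SP's actual code):

#include <string>
#include <mysql.h>

// Run a statement; if it fails with a "table is marked as crashed" error
// (1194/1195 for MyISAM), issue REPAIR TABLE and retry the statement once.
bool execWithRepair(MYSQL* conn, const std::string& sql, const std::string& table)
{
    if (mysql_query(conn, sql.c_str()) == 0)
        return true;

    unsigned int err = mysql_errno(conn);
    if (err == 1194 || err == 1195) {
        std::string repair = "REPAIR TABLE " + table;
        if (mysql_query(conn, repair.c_str()) == 0)
            return mysql_query(conn, sql.c_str()) == 0;   // one retry after repair
    }
    return false;
}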

-- Scott



