shibboleth-dev - [Shib-Dev] [IdPv3] Virtualization

Subject: Shibboleth Developers

From: Chad La Joie <>
To:
Subject: [Shib-Dev] [IdPv3] Virtualization
Date: Wed, 18 Aug 2010 08:55:54 -0400
Organization: Itumi, LLC

Ha, fooled you into thinking last week's email was it. Actually, it was. This isn't so much a design email as one describing how people will accomplish something, "virtualizing the IdP", in 3.x.

So, first, what do I mean by "virtualizing the IdP"? To me, this basically boils down to a few things:
- How do you get one process/container/war to answer as if it were more than one IdP?
- How do you reduce the overhead of upgrading IdPs in a virtualized environment (i.e. how do you do fewer than N upgrades when you have N IdPs)?
- How do you share resource intensive bits (e.g. metadata is memory intensive) between the IdPs?

So, here's the plan for doing this.

First, instead of trying to get one IdP instance to act as a bunch of different ones, you'll simply deploy multiple IdP instances in one container. Deployments will be done via the context deployment descriptor documented on the site today.

In order to reduce the number of upgrades, you can share the actual WAR between instances. In most cases it will be one WAR for all instances, but if you need to generate a custom WAR file for some instances you can certainly do that. So upgrading N IdPs will require M upgrades where M is the number of WAR files. The assumption is that M < N, with M usually being 1.
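To make the M-versus-N point concrete, here is a minimal Java sketch. The instance names and WAR paths are made up for illustration; the only idea taken from the text above is that the number of upgrades equals the number of distinct WAR files in use.

```java
import java.util.HashSet;
import java.util.Map;

final class UpgradeCount {
    // Takes a map from a (hypothetical) IdP instance name to the WAR file
    // it is deployed from, and returns M, the number of upgrades needed.
    static int upgradesNeeded(Map<String, String> warByInstance) {
        // Each distinct WAR is upgraded once, however many instances share it.
        return new HashSet<>(warByInstance.values()).size();
    }

    public static void main(String[] args) {
        // Illustrative layout: two instances share one WAR, one uses a custom WAR.
        Map<String, String> warByInstance = Map.of(
            "idp1", "/opt/shared/idp.war",
            "idp2", "/opt/shared/idp.war",
            "idp3", "/opt/custom/idp-custom.war");
        System.out.println("N = " + warByInstance.size()
            + ", M = " + upgradesNeeded(warByInstance));
    }
}
```

With the layout in main, N is 3 and M is 2. In the common case where every instance shares one WAR, M is 1.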

Each deployment descriptor will allow you to bind your IdP to a unique context on a given ip:port. You may set up your container to listen on more than one ip:port giving more "slots" to which things can be bound. So, if you have one ip:port you might have IdPs bound to example.org/idp1, example.org/idp2, example.org/idp3. If you have more than one ip:port you might end up with example.org/idp1, foo.edu/idp1, foo.edu/idp2. The unique entity ID will still be set in the relying party configuration file, the deployment descriptor just ensures you have unique endpoints for all your IdPs.

Each deployment may share as little or as much of its configuration with other IdP instances as you like, by using the deployment descriptor to point to the services.xml file you wish to use. That file in turn points to all the other configuration files. So if you want to share the attribute-resolver.xml file but not the relying-party.xml and attribute-filter.xml you can do that. Note that the IdP's simplified config mechanism[1] means that deployers will likely be able to share all the config files and only change the properties file per IdP instance.
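The "shared config files, per-instance properties" idea can be sketched with plain java.util.Properties. This is not the IdP's actual configuration loader. The property names (idp.entityID, idp.scope) and their values are hypothetical and chosen only to show how a per-instance properties file could override shared defaults.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class InstanceConfig {
    // Builds the effective properties for one instance: the shared values
    // act as defaults, and the per-instance values override them.
    static Properties effective(String sharedText, String instanceText) {
        try {
            Properties shared = new Properties();
            shared.load(new StringReader(sharedText));
            Properties instance = new Properties(shared); // shared is the fallback
            instance.load(new StringReader(instanceText));
            return instance;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical property names; strings stand in for files on disk.
        String shared = "idp.scope=example.org\nidp.entityID=https://example.org/idp\n";
        String idp2 = "idp.entityID=https://example.org/idp2\n";
        Properties p = effective(shared, idp2);
        System.out.println(p.getProperty("idp.entityID"));
        System.out.println(p.getProperty("idp.scope"));
    }
}
```

Here the second instance overrides only its entity ID and inherits everything else. A real deployment would read both sets of values from files rather than from strings.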

Finally, the only resource intensive parts (that can be shared) within the IdP are the metadata providers and DB/LDAP connection pools. These will be fixed up to allow them to be bound to the container's JNDI tree and the IdP can look them up there. Most, if not all, of the configuration options available via the normal IdP XML config files will be available when configuring the component for use with JNDI. The configuration syntax will be different since it's container specific. And, if you want, you can bind multiple, differently configured, instances of any given component to the JNDI tree and have some IdPs use one instance and other IdPs use another instance.
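For readers who haven't used JNDI, the lookup side looks roughly like the sketch below. It uses only the standard javax.naming API. The JNDI name is hypothetical, and how the IdP will actually perform the lookup, and under which names components get bound, is not decided by anything in this email.

```java
import java.util.Optional;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

final class SharedResourceLookup {
    // Looks up a container-bound object by JNDI name. Returns empty when
    // nothing is bound under that name or when no JNDI provider is
    // available, which is the case when run outside a container.
    static Optional<Object> lookup(String jndiName) {
        try {
            Context ctx = new InitialContext();
            try {
                return Optional.ofNullable(ctx.lookup(jndiName));
            } finally {
                ctx.close();
            }
        } catch (NamingException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical name for a shared metadata provider.
        String name = "java:comp/env/shibboleth/sharedMetadata";
        System.out.println(lookup(name).isPresent() ? "found" : "not bound");
    }
}
```

Two IdP instances pointing at the same name would get the same shared component. Pointing them at different names is how some IdPs could use one configured instance while other IdPs use another.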

The result is that each newly deployed IdP will require about 5-10MB of memory plus whatever memory is required for non-shared resources (i.e. if a virtual IdP has its own special metadata provider and doesn't use the shared metadata provider then that adds to its memory requirements).
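As a back-of-the-envelope illustration of that estimate, here is a small Java sketch. Only the 5-10MB per-instance figure comes from the text above. The size of the shared resources and the number of instances are assumed inputs, and counting the shared resources once for the whole container is my reading of the model rather than a measured result.

```java
final class MemoryEstimate {
    // Returns {low, high} in MB for a container hosting `instances` IdPs.
    // Each instance costs 5-10 MB; shared resources are counted once;
    // nonSharedMbTotal is the sum of all per-instance private resources.
    static long[] rangeMb(int instances, long sharedMb, long nonSharedMbTotal) {
        long low = sharedMb + 5L * instances + nonSharedMbTotal;
        long high = sharedMb + 10L * instances + nonSharedMbTotal;
        return new long[] {low, high};
    }

    public static void main(String[] args) {
        // Assumed inputs: 20 IdPs, 200 MB of shared metadata, nothing private.
        long[] r = rangeMb(20, 200, 0);
        System.out.println(r[0] + " to " + r[1] + " MB");
    }
}
```

With those assumed inputs the estimate comes out at 300 to 400 MB, against 20 separate copies of the shared resources if nothing were shared.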

Obviously, doing this will require individuals to have a greater understanding of their containers. If deployers do not wish to gain that understanding they may trade research time against dollars, buy more hardware, and run each IdP on its own, just as is done today.

In addition, nothing above helps with the most resource intensive part of the IdP, the crypto operations. I am working on a way to decrease the load these operations put on the system, hopefully quite dramatically, but it's non-trivial. Therefore, this model is really only meant for a case where some group, a 3rd party hosting company for example, wants to run multiple low-load IdPs on a single piece of hardware.

[1] https://spaces.internet2.edu/display/SHIB2/IdPSimplifyConfig
--
Chad La Joie
http://itumi.biz
trusted identities, delivered
