perfsonar-dev - Re: [pS-dev] Lookup service & co discussion.

Subject: perfsonar development work


From: Jason Zurawski <>
To: Michael Bischoff <>
Cc: "" <>
Subject: Re: [pS-dev] Lookup service & co discussion.
Date: Tue, 03 Jun 2008 10:32:13 -0400
Openpgp: id=B94D59A6; url=http://people.internet2.edu/~zurawski/key.txt
Organization: Internet2

Michael;

The cutting and pasting has lost much of the original context for these issues, so I am rather lost; I apologize if I do not address certain things properly.

I guess event types might be the answer - but this needs to be bootstrapped somehow. Can a lookup service metadata query be used for any type of service and return information in a more or less similar way? Currently, the result of a LS query is returned as *AnyElement*

Xpath results have a predictable 'shape' (structure?), given you know the 'shape' of the data it was performed upon. And we do know this shape, since LSStore is (almost completely) formalised. The only part that isn't is the data element, but services can formalise that in their own documentation, and since the 'client' (or plugin) is based on (or tied to) that service, doing a LookupQuery shouldn't be an issue.

---

So this works because there is a contract between the service and the client of the service, however there is no contract between the LS and the client nor between the LS and the service.

I am not sure what you mean by 'contract' here. In general you are correct that a client is explicitly designed with a service and data type in mind. The purpose of the LS API is to eliminate the need to know specifics of the data format, because over time there will be *many* more types and updating the API each time a new service comes online is infeasible. Instead we rely on an implicit specification based on eventType knowledge, as well as the general design pattern that all metadata should share:

1) Subjects with topology elements
2) EventType(s) to describe the data
3) Parameter(s) to describe anything else

Every form of measurement data for the current array of services was designed to keep this structure in place for the sole reason of making discovery possible. These (and only these) assumptions should be made about the 'blob' of XML data that is registered by a given service. Given the eventTypes we can glean which specific topology elements may reside in the subject and summarize over these. Thus far we have not made (and really should not make) assumptions about the backend storage (e.g. LSStore), because there is no guarantee that future Information services will be organized in this way.
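
To make the three-part pattern concrete, here is a minimal Python sketch. It is not taken from any perfSONAR code base: the sample metadata, the namespace prefixes, the parameter name and the summarize() helper are all assumptions made for illustration. It matches elements by local name only, so it relies on nothing beyond "subject with topology elements, eventType(s), parameter(s)".

```python
import xml.etree.ElementTree as ET

# Illustrative metadata blob; the namespaces, ids and parameter are assumptions.
SAMPLE = """
<nmwg:metadata xmlns:nmwg="http://ggf.org/ns/nmwg/base/2.0/"
               xmlns:netutil="http://ggf.org/ns/nmwg/characteristic/utilization/2.0/"
               xmlns:nmwgt="http://ggf.org/ns/nmwg/topology/2.0/" id="m1">
  <netutil:subject id="s1">
    <nmwgt:interface>
      <nmwgt:ifName>eth0</nmwgt:ifName>
    </nmwgt:interface>
  </netutil:subject>
  <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
  <nmwg:parameters id="p1">
    <nmwg:parameter name="consolidationFunction">AVERAGE</nmwg:parameter>
  </nmwg:parameters>
</nmwg:metadata>
"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize(metadata_xml):
    """Collect only the three things the pattern promises about a metadata blob."""
    root = ET.fromstring(metadata_xml)
    summary = {"topology": [], "eventTypes": [], "parameters": {}}
    for child in root:
        name = local(child.tag)
        if name == "subject":
            # Record which kinds of topology element sit directly in the subject.
            summary["topology"] = [local(e.tag) for e in child]
        elif name == "eventType":
            summary["eventTypes"].append((child.text or "").strip())
        elif name == "parameters":
            for p in child:
                summary["parameters"][p.get("name")] = (p.text or "").strip()
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Nothing in the helper depends on how the registering service or the LS stores the data, which is the point of keeping the assumptions this small.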

These are excellent observations, and point out an obvious flaw in allowing deployments to define their own keyword structure. I think it is much safer to rely on data structure (eventTypes, etc.) to dictate the structure of that data rather than a set of self-configured strings.

no, we are advocating a fixed documented additional tag in the <psservice:service> container that a correctly configured service registers in a predictable way. The keyword proposal that flew by here on the mailing-list is unrelated.

You have misunderstood; this comment was not related to the 'keyword' proposal, but in general it brings up the same underlying issue of allowing users to specify their own searchable tags. The psservice:service element may have a 'predictable' format, but it will have very *un*-predictable content. Currently there is nothing that dictates what a user should enter into the service element (what forces me to enter 'MA' or 'MP' into a psservice:serviceType element? will searching on 'MA' give us everything that is really a Measurement Archive?). A 'suggested value' only gets us so far and leaves the entire discovery mechanism open to gaps when things are not configured in an expected way.

By identifying absolutes (eventTypes, topology elements) we are guaranteed to have equal footing when searching all registered data; relying on service information for discovery is therefore bad (this is why we don't summarize over the service metadata descriptions). I am not proposing we get rid of this element, it still has uses, but it should not be relied upon other than for 'best effort' type of searches.
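
A small Python sketch of the gap described above. The three registrations, their names and their serviceType strings are invented for illustration; only the utilization eventType URI comes from this thread.

```python
UTILIZATION = "http://ggf.org/ns/nmwg/characteristic/utilization/2.0"

# Hypothetical registrations: all three are measurement archives, but each
# operator typed a different free-text serviceType.
SERVICES = [
    {"name": "A", "serviceType": "MA", "eventTypes": [UTILIZATION]},
    {"name": "B", "serviceType": "ma", "eventTypes": [UTILIZATION]},
    {"name": "C", "serviceType": "Measurement Archive", "eventTypes": [UTILIZATION]},
]


def by_service_type(service_type):
    """Search on the self-configured string: exact matches only."""
    return [s["name"] for s in SERVICES if s["serviceType"] == service_type]


def by_event_type(event_type):
    """Search on the eventType, which the data schema fixes."""
    return [s["name"] for s in SERVICES if event_type in s["eventTypes"]]


if __name__ == "__main__":
    print(by_service_type("MA"))       # only the registration that typed 'MA'
    print(by_event_type(UTILIZATION))  # every registration holding this data
```

Case folding would recover 'ma' but not 'Measurement Archive', which is why the service description is treated as 'best effort' only.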

That makes sense, but can we facilitate the whole process currently? Don't we also need to ask (a) service(s) to identify the network topology elements? If not the current structure of the topology itself.

The topology elements are tied to the various data schemata through eventTypes.

not sure what you mean by this.

For example the event type:

http://ggf.org/ns/nmwg/tools/iperf/2.0/

Is discussed in the iperf.rnc schema. This schema explicitly limits the topology elements of this data type to endPointPairs (v2 or v3). The same idea is true for all other forms of data (e.g. utilization deals with interfaces, etc.). When we know an eventType, we can make a guess as to what specific topology element we are expecting, and summarize over this.
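
As a sketch of that guess in Python: the table below holds only the two pairings mentioned in this thread (iperf with endPointPairs, utilization with interfaces). The names EXPECTED_TOPOLOGY and expected_element are hypothetical, and a real table would have to be filled in from each schema.

```python
# eventType URI (without trailing slash) -> topology element expected in the subject.
EXPECTED_TOPOLOGY = {
    "http://ggf.org/ns/nmwg/tools/iperf/2.0": "endPointPair",
    "http://ggf.org/ns/nmwg/characteristic/utilization/2.0": "interface",
}


def expected_element(event_type):
    """Return the topology element to summarize over, or None if unknown.

    The trailing slash is stripped because the same eventType shows up both
    with and without it.
    """
    return EXPECTED_TOPOLOGY.get(event_type.strip().rstrip("/"))


if __name__ == "__main__":
    for uri in ("http://ggf.org/ns/nmwg/tools/iperf/2.0/",
                "http://ggf.org/ns/nmwg/characteristic/utilization/2.0",
                "http://example.org/unknown/1.0"):
        print(uri, "->", expected_element(uri))
```

An unknown eventType returns None rather than raising, so a summarizer could skip data it does not recognise instead of failing the whole registration.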

Use Case:
A client would use a topology service to look up the identifier for a network element and then would query a lookup service using the identifier to find the measurements associated with that element.

A topology service must also be retrieved from an LS, right?

It could be done that way, but the Topology service and Lookup service are both well known information services (and as such will probably be combined in the near future).

combining them is a bad idea no matter how you slice it;

- it increases the complexity of both the topology service and the LS unnecessarily
- you enforce an implementation detail or have an incomplete spec (if someone else does an implementation, the communication between the topology and LS service is not specified in a perfSONAR protocol)
- it's more difficult to test and debug (you can't test them separately, for starters)
- it doesn't scale
- the data storage is crippled because you have to choose to optimise it for LS queries or the Topology service
- etc

Since this is unrelated to the work at hand, I would suggest you spawn a new thread to discuss this issue if you are interested. One of the focus areas of the new NMC-WG in the OGF is to define an IS protocol that will unify discovery of topology, services and other resources.

From Maciej's point of view 'anyText' is really all he needs to care about. There is some well formed blob that comes from a service that the LS can manage. It is up to the individual service designer to elaborate this as much as possible.

-jason

and thus it's only a contract between the service and its clients, and the LS shouldn't depend on it (especially since you don't expect anything to be enforced here by the LS). Since it has been established that for summarisation to work they need certain values, and thus the LS depends on these values, it should be in that part of the schema that is formally defined; the service in data tag.

You have lost me here. I believe my response above covers this, but I am still unsure what you mean by 'contract'. The 'certain' values that we are expecting are present in all current data schemata, and this is the extent of formalism that I would expect or require. Enumerating every single possible combination into an RNC is not really a constructive exercise; even so, if you require a schema, it would be rather simple:

nmwg:data +
  nmwg:metadata +
    *:subject
      AnyTopologyElement
    nmwt:eventType +
    *:parameters ?

If it helps us move forward at this point then I can live with a de-facto standard that has been established by one service (RRD) and needs just as much (or maybe even more) communication effort between us all and relatively more work for service developers (the solution proposed here needs only a change in the code of the perfSONAR base, and services should specify a few parameters; the solution that's on the table now needs the same plus additional code in the services).

so if not for 3.1 for 3.2/4.0 this should be rectified.

<nmwg:store type="LSStore">
  <nmwg:metadata id=" ..service_id...">
    <perfsonar:subject>
      <psservice:service id="localhost.4c922942:11a2c3a5abf:-85">
        <psservice:serviceName>perfSONAR PIONIER RRD MA</psservice:serviceName>
        <psservice:accessPoint> ...url...</psservice:accessPoint>
        <psservice:serviceType>ma</psservice:serviceType>
        <psservice:serviceDescription>perfSONAR PIONIER RRD MA</psservice:serviceDescription>
        <psservice:serviceVersion>3.0</psservice:serviceVersion>
        <psservice:organization>PIONIER</psservice:organization>
        <psservice:contactEmail></psservice:contactEmail>
        <psservice:supportedEventTypes>
          <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
        </psservice:supportedEventTypes>
        <psservice:relatedTopologyElements>
          <?:topologyElement urn="urn:ogf:network:domain=Internet2.edu" />
          <?:topologyElement urn="urn:ogf:network:domain=internet2.edu:node=packrat:port=eth0" />
        </psservice:relatedTopologyElements>
      </psservice:service>
    </perfsonar:subject>
  </nmwg:metadata>
  <nmwg:data id=" ...id..." metadataIdRef="...id...">
    <nmwg:metadata id="meta36">
      ...anything by contract of the service...
    </nmwg:metadata>
  </nmwg:data>
</nmwg:store>

if the service needs to know more about the topologyElement/interface it should query the topology service with the urn.
if the client needs a service it should query the LS with a urn.
if the client has a set of IPs it should query the topology service to obtain information and urns; it can then use the urns to find services that 'know' about the IPs/topologyElements by querying the LS.
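
The two-step lookup described above can be sketched in Python as follows. Both services are stood in for by in-memory dictionaries; the IP address, the access point URL and all function names are hypothetical, and no real perfSONAR message exchange is shown. Only the port URN is taken from the example above.

```python
PORT_URN = "urn:ogf:network:domain=internet2.edu:node=packrat:port=eth0"

# Stand-in for the topology service: IP address -> topology element URN.
TOPOLOGY_SERVICE = {
    "192.0.2.10": PORT_URN,  # documentation address, assumed for the example
}

# Stand-in for the LS: URN -> access points of services registered for it.
LOOKUP_SERVICE = {
    PORT_URN: ["http://ma.example.org/services/rrd"],  # hypothetical access point
}


def urns_for_ips(ips):
    """Step 1: ask the topology service which elements own these addresses."""
    return [TOPOLOGY_SERVICE[ip] for ip in ips if ip in TOPOLOGY_SERVICE]


def services_for_urns(urns):
    """Step 2: ask the LS which services 'know' about these elements."""
    found = []
    for urn in urns:
        for access_point in LOOKUP_SERVICE.get(urn, []):
            if access_point not in found:  # keep order, drop duplicates
                found.append(access_point)
    return found


def find_services(ips):
    """Chain both steps: IPs -> URNs -> service access points."""
    return services_for_urns(urns_for_ips(ips))


if __name__ == "__main__":
    print(find_services(["192.0.2.10"]))
```

In this arrangement the LS side stores only the URN, so an address change would touch the topology data alone.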

Using the above structure we can _enforce_ supportedEventTypes and relatedTopologyElements (might need a rename) being there when the service registers. This is easier for all (LS and service developers).

Added advantage is that because the structure is simpler the (x)queries are simpler and faster.

imo metadata in the data tag should contain the least amount of data to get by. I'm also against storing too much information in the LS about network elements (unless it is a workaround for the topology service not being available): the data will get outdated and might conflict with other data being registered by other services. Storing only an id will promote the single point of definition principle and avoid collisions. An added advantage is that somewhere in the future we should be able to configure services by just specifying the topology element urn, and the service will retrieve the rest of the data from the topology service. Domain renames, node renames and changing IP addresses should end up requiring very little effort, which ensures better availability of the perfSONAR tools.

So am I correct to state that you are proposing the entire method for services registering to the LS change? I am not sure what prompted this or why you feel that it is necessary but the current method gives us everything we need to perform discovery and query.
Changing the entire process at this stage (on all services) would be a disaster, especially when I do not see any clear benefits over the current system. The API we have offered was non-destructive in that it would not require any service modification to be effective.

-jason

Lookup service & co discussion., Michael Bischoff, 06/03/2008
- Re: [pS-dev] Lookup service & co discussion., Jason Zurawski, 06/03/2008
