perfsonar-dev - Re: [pS-dev] Lookup service & co discussion.

Subject: perfsonar development work

From: "Michael Bischoff" <>
To:
Cc: "" <>
Subject: Re: [pS-dev] Lookup service & co discussion.
Date: Tue, 3 Jun 2008 18:54:00 +0200 (CEST)
Importance: Normal

Jason;

> I am not sure what you mean by 'contract' here.
In the broadest sense: a defined set of rules agreed upon and adhered to by
both parties.

> I am not sure what you mean by 'contract' here. In general you are
> correct that a client is explicitly designed with a service and data
> type in mind. The purpose of the LS API is to eliminate the need to know
> specifics of the data format because over time there will be *many* more
> types and updating the API each time a new service comes online is
> infeasible. Instead we rely on an implicit specification relying on the
> eventType knowledge as well as the general design pattern that all
> metadata should share:
>
> ...

Am I the only one seeing the contradiction here? First you explain why it
can't be specified (and I very much agree with that), and then it is
implicitly specified.

Also, this implicit specification isn't (explicitly) _documented_ anywhere.
Well, I suppose now it is: here on the mailing list, which is probably one
of the worst places.

> So am I correct to state that you are proposing the entire method for
> services registering to the LS change?

No, not at all. I want the LS to depend on the service tag in the metadata,
and on that tag alone, to do its work, as you state and for the reasons you
state above. If summarisation depends on the data tag block, and a correctly
working LS depends on summarisation, then the LS depends on the data tag,
which by the reasons above it should not depend upon.

ls -> summarisation/discovery -> data tag
ls -> metadata -> service tag

An obvious solution is to remove the dependency of summarisation upon the
data tag, which means moving the minimal amount of data needed from the data
block to the service tag. That is what the proposal entails. The added
benefit is that you can now change the specification from implicit to
explicit and reap the benefits that brings.

ls -> summarisation/discovery \/
ls -> metadata -> service tag

The data tag can then be used for what it was intended for, and it also frees
the service/client developer from the bloated structure that is imposed by
the flawed approach we are using now. Because we all love bloated XML, right?

> ...for the sole reason of making discovery possible....
vs
> ..because there is no guarantee that future Information services will be
> organized in this
> way...

I fail to see how I should interpret this any other way than that the
discovery/summarisation will fail somewhere in the future, but I doubt that
was the intention.

> You have misunderstood,
Yes, I have.

> The psservice:service element may have a 'predictable' format,
> but it will have very *un*-predictable content. Currently there is nothing
> that dictates
> what
> a user should enter into the service element (what forces me to enter 'MA'
> or 'MP' into a
> psservice:serviceType element?

How is this any different from any other element (for example eventTypes)?
Poorly configured services won't be found when a client tries to do a lookup.
Isn't it in the best interest of the person who deploys the service to get
this correct? In the currently deployed services that use the webadmin that
I have seen, there is only one choice to pick. We can help by not allowing
values that we know are wrong and by clearly and formally documenting what
is expected.
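
A minimal sketch of what "not allowing values that we know are wrong" could
look like at registration time. The namespace URIs, the set of allowed
serviceType values and the function name are assumptions made for
illustration; they are not taken from an agreed specification, and the real
list of values would have to come from the formal documentation.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; check them against the schemas actually in use.
PSSERVICE = "http://ggf.org/ns/nmwg/tools/org/perfsonar/service/1.0/"
NMWG = "http://ggf.org/ns/nmwg/base/2.0/"

# Illustrative only: the thread mentions 'MA' and 'MP'; a real list would
# have to come from the formal documentation.
ALLOWED_SERVICE_TYPES = {"ma", "mp"}


def check_registration(service_xml):
    """Return a list of human-readable problems (empty if none were found)."""
    service = ET.fromstring(service_xml)
    problems = []

    # Reject serviceType values that are not on the documented list.
    service_type = service.findtext(f"{{{PSSERVICE}}}serviceType", default="")
    if service_type.strip().lower() not in ALLOWED_SERVICE_TYPES:
        problems.append(f"unknown serviceType: {service_type!r}")

    # Require at least one non-empty supported eventType.
    event_types = service.findall(
        f"{{{PSSERVICE}}}supportedEventTypes/{{{NMWG}}}eventType"
    )
    if not any((e.text or "").strip() for e in event_types):
        problems.append("no supportedEventTypes registered")

    return problems


EXAMPLE = f"""
<psservice:service xmlns:psservice="{PSSERVICE}" xmlns:nmwg="{NMWG}">
  <psservice:serviceType>ma</psservice:serviceType>
  <psservice:supportedEventTypes>
    <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
  </psservice:supportedEventTypes>
</psservice:service>
"""

print(check_registration(EXAMPLE))
print(check_registration(EXAMPLE.replace(">ma<", ">archive<")))
```

A check like this could run in the LS when a service registers, or in the
webadmin before the configuration is saved; either way a misconfigured
service is told immediately instead of silently not being found.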

> I am not proposing we get rid of this element, it still has uses,
> but it should not be relied upon other than for 'best effort' type of
> searches.

As client/plugin developers we don't need it for search purposes but for the
step after we have found a service that we want to use. We don't need this
data to be summarised. We need enough information to be able to query the
service. Event types help, and probably fulfil the requirement in a perfect
world. If we need to deploy a workaround for a bug in a widely deployed
service... but that is a different discussion.

> So am I correct to state that you are proposing the entire method for
> services registering to the LS change?

The method is still the same; the process is mostly left unchanged, except
for where the data is stored and retrieved from. The queries within the LS
and in the LS client will be much simpler, which saves us time on debugging.
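
To give an idea of how simple such a query could become, here is a rough
sketch of a client-side lookup against the proposed structure. It assumes
the namespace URIs shown, a hypothetical psservice:topologyElement element
(the proposal leaves that prefix open), an example.org access point and a
made-up function name; none of these are part of an agreed schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the proposal itself does not spell them out.
NS = {
    "nmwg": "http://ggf.org/ns/nmwg/base/2.0/",
    "perfsonar": "http://ggf.org/ns/nmwg/tools/org/perfsonar/1.0/",
    "psservice": "http://ggf.org/ns/nmwg/tools/org/perfsonar/service/1.0/",
}

UTILIZATION = "http://ggf.org/ns/nmwg/characteristic/utilization/2.0"
PORT_URN = "urn:ogf:network:domain=internet2.edu:node=packrat:port=eth0"

# Trimmed-down store in the proposed layout; the access point is a placeholder.
STORE = f"""
<nmwg:store xmlns:nmwg="{NS['nmwg']}" xmlns:perfsonar="{NS['perfsonar']}"
            xmlns:psservice="{NS['psservice']}" type="LSStore">
  <nmwg:metadata id="service-1">
    <perfsonar:subject>
      <psservice:service>
        <psservice:accessPoint>http://example.org/ma</psservice:accessPoint>
        <psservice:serviceType>ma</psservice:serviceType>
        <psservice:supportedEventTypes>
          <nmwg:eventType>{UTILIZATION}</nmwg:eventType>
        </psservice:supportedEventTypes>
        <psservice:relatedTopologyElements>
          <psservice:topologyElement urn="{PORT_URN}"/>
        </psservice:relatedTopologyElements>
      </psservice:service>
    </perfsonar:subject>
  </nmwg:metadata>
</nmwg:store>
"""


def find_access_points(store_xml, event_type, urn):
    """Return access points of services that list both the eventType and the URN."""
    root = ET.fromstring(store_xml)
    hits = []
    for service in root.iterfind(".//psservice:service", NS):
        event_types = {
            (e.text or "").strip()
            for e in service.iterfind(
                "psservice:supportedEventTypes/nmwg:eventType", NS
            )
        }
        urns = {
            t.get("urn")
            for t in service.iterfind(
                "psservice:relatedTopologyElements/psservice:topologyElement", NS
            )
        }
        if event_type in event_types and urn in urns:
            access_point = service.findtext(
                "psservice:accessPoint", default="", namespaces=NS
            )
            hits.append(access_point.strip())
    return hits


print(find_access_points(STORE, UTILIZATION, PORT_URN))
```

Everything the lookup needs sits in the service element, so nothing in the
query has to know what a particular service puts in its data block.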

> Changing the entire process at this stage (on all services) would be a
> disaster, especially when I do not see any clear benefits over the current
> system. The API we
> have offered was non-destructive in that it would not require any service
> modification to be
> effective.

I am not sure what Verona and everyone else are doing these last weeks,
because we are all putting in effort to ensure that services adhere to this
implicit specification, which was discovered very late in the development
process by everyone else. So as far as I can see, effort is already
required.

Michael.

-----------

> Michael;
>
>
> The cutting/pasting has lost much of the original context to these
> issues, so I am rather lost and I apologize if I do not address certain
> things properly.
>
>
>>> I guess event types might be the answer - but this needs to be
>>> bootstrapped somehow. Can lookup service metadata query be used for any
>>> type of service
>>> and return information in a more or less similar way? Currently, the
>>> result of a LS query
>>> is returned as *AnyElement*
>>>
>>
>> XPath results have a predictable 'shape' (structure?), given you know the
>> 'shape' of the data it was performed upon. And we do know this shape,
>> since LSStore is (almost completely) formalised. The only part that isn't
>> is the data element, but services can formalise that in their own
>> documentation, and since the 'client' (or plugin) is based on (or tied to)
>> that service, doing a LookupQuery shouldn't be an issue.
>>
>> ---
>>
>>
>> So this works because there is a contract between the service and the
>> client of the
>> service, however there is no contract between the LS and the client nor
>> between the LS and
>> the service.
>
>
> I am not sure what you mean by 'contract' here. In general you are
> correct that a client is explicitly designed with a service and data
> type in mind. The purpose of the LS API is to eliminate the need to know
> specifics of the data format because over time there will be *many* more
> types and updating the API each time a new service comes online is
> infeasible. Instead we rely on an implicit specification relying on the
> eventType knowledge as well as the general design pattern that all
> metadata should share:
>
> 1) Subjects with topology elements
> 2) EventType(s) to describe the data
> 3) Parameter(s) to describe anything else
>
>
> Every form of measurement data for the current array of services was
> designed to keep this structure in place for the sole reason of making
> discovery possible.
> These (and only these) assumptions should be made
> about 'blob' of XML data that is registered by a given service. Given the
> eventTypes we can
> glean which specific topology elements may reside in the subject and
> summarize over these.
> Thus far we have not (and
> really should not) made assumptions about the backend storage (e.g.
> LSStore) because there is
> no guarantee that future Information services will be organized in this way.
>
>
>>> These are excellent observations, and points out an obvious flaw in
>>> allowing deployments to define their own keyword structure. I think it
>>> is much safer to
>>> rely on data structure (eventTypes, etc.) to dictate the structure of
>>> that data rather
>>> than a set of self configured strings.
>>>
>>
>> no, we are advocating a fixed documented additional tag in the
>> <psservice:service>
>> container that a correctly configured service registers in a predictable
>> way. The keyword
>> proposal that flew by here on the mailing-list is unrelated.
>>
>
>
> You have misunderstood, this comment was not related to the 'keyword'
> proposal, but in general it brings up the same underlying issue of allowing
> users to specify
> their own searchable tags. The psservice:service element may have a
> 'predictable' format,
> but it will have very *un*-predictable content. Currently there is nothing
> that dictates what
> a user should enter into the service element (what forces me to enter 'MA'
> or 'MP' into a
> psservice:serviceType element? will
> searching on 'MA' give us everything that is really a Measurement
> Archive?). A 'suggested
> value' only gets us so far and leaves the entire discovery mechanism open
> to gaps when things
> are not configured in an expected way.
>
> By identifying absolutes (eventTypes, topology elements) we are
> guaranteed to have equal footing when searching all registered data;
> relying on service
> information for discovery is therefore bad (this is why we don't summarize
> over the service
> metadata descriptions). I am not proposing we get rid of this element, it
> still has uses,
> but it should not be relied upon other than for 'best effort' type of
> searches.
>
>
>>>> That makes sense, but can we facilitate the whole process currently?
>>>> Don't we also need to ask (a) service(s) to identify the network
>>>> topology elements? If not the current structure of the topology itself.
>>>>
>>> The topology elements are tied to the various data schemata through
>>> eventTypes.
>>>
>>
>> not sure what you mean by this.
>>
>
>
> For example the event type:
>
>
> http://ggf.org/ns/nmwg/tools/iperf/2.0/
>
>
> Is discussed in the iperf.rnc schema. This schema explicitly limits the
> topology elements of this data type to endPointPairs (v2 or v3). The same
> idea is true for
> all other forms of data (e.g. utilization deals with interfaces, etc.).
> When we know an
> eventType, we can make a guess as to what specific topology element we are
> expecting, and
> summarize over this.
>
>
>>>> Use Case:
>>>> A client would use a topology service to look
>>>> up the identifier for a network element and then would query a lookup
>>>> service using the
>>>> identifier to find the measurements associated with that element.
>>>>
>>>> Obtaining a topology service must also be retrieved from a LS? right?
>>>>
>>>>
>>> It could be done that way, but the Topology service and Lookup service
>>> are both well known information services (and as such will probably be
>>> combined in the near future).
>>>
>>
>> combining them is a bad idea no matter how you slice it:
>>
>> - it unnecessarily increases the complexity of both the topology service
>>   and the LS
>> - you enforce an implementation detail or have an incomplete spec (if
>>   someone else does an implementation, the communication between the
>>   topology and LS service is not specified in a perfSONAR protocol)
>> - it's more difficult to test and debug (you can't test them separately,
>>   for starters)
>> - it doesn't scale
>> - the data storage is crippled because you have to choose to optimise it
>>   for LS queries or for the Topology service
>> - etc.
>>
>>
>
>
> Since this is unrelated to the work at hand, I would suggest you spawn a
> new thread to discuss this issue if you are interested. One of the focus
> areas of the new
> NMC-WG in the OGF is to define an IS protocol
> that will unify discovery of topology, services and other resources.
>
>
>>> From Maciej's point of view 'anyText' is really all he needs to care
>>> about. There is some well formed blob that comes from a service that the
>>> LS can manage.
>>> It is up to the individual service designer to
>>> elaborate this as much as possible.
>>>
>>> -jason
>>>
>>>
>>
>> and thus it's only a contract between the service and its clients, and
>> the LS shouldn't depend on it (especially since you don't expect anything
>> to be enforced here by the LS). Since it has been established that for
>> summarisation to work certain values are needed, and thus the LS depends
>> on these values, they should be in the part of the schema that is
>> formally defined: the service in data tag.
>>
>
>
> You have lost me here. I believe my response above covers this, but I
> am still unsure what you mean by 'contract'. The 'certain' values that we
> are expecting are
> present in all current data schemata and this is the extent of formalism
> that I would expect
> or require. Enumerating every single possible combination into an RNC is
> not really a
> constructive exercise, even so if you require a schema it would be rather
> simple:
>
> nmwg:data +
>   nmwg:metadata +
>     *:subject
>       AnyTopologyElement
>     nmwt:eventType +
>     *:parameters ?
>
>
>
>> If it helps us move forward at this point then I can live with a
>> de-facto standard that has been established by one service (RRD) and that
>> needs just as much (or maybe even more) communication effort between us
>> all and relatively more work for service developers. (The solution
>> proposed here needs only a change in the code of the perfSONAR base, and
>> services should specify a few parameters. The solution that's on the
>> table now needs the same plus additional code in the services.)
>>
>> So if not for 3.1, this should be rectified for 3.2/4.0.
>>
>> <nmwg:store type="LSStore">
>>   <nmwg:metadata id=" ..service_id...">
>>     <perfsonar:subject>
>>       <psservice:service id="localhost.4c922942:11a2c3a5abf:-85">
>>         <psservice:serviceName>perfSONAR PIONIER RRD MA</psservice:serviceName>
>>         <psservice:accessPoint> ...url...</psservice:accessPoint>
>>         <psservice:serviceType>ma</psservice:serviceType>
>>         <psservice:serviceDescription>perfSONAR PIONIER RRD MA</psservice:serviceDescription>
>>         <psservice:serviceVersion>3.0</psservice:serviceVersion>
>>         <psservice:organization>PIONIER</psservice:organization>
>>         <psservice:contactEmail></psservice:contactEmail>
>>         <psservice:supportedEventTypes>
>>           <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
>>         </psservice:supportedEventTypes>
>>         <psservice:relatedTopologyElements>
>>           <?:topologyElement urn="urn:ogf:network:domain=Internet2.edu" />
>>           <?:topologyElement urn="urn:ogf:network:domain=internet2.edu:node=packrat:port=eth0" />
>>         </psservice:relatedTopologyElements>
>>       </psservice:service>
>>     </perfsonar:subject>
>>   </nmwg:metadata>
>>   <nmwg:data id=" ...id..." metadataIdRef="...id...">
>>     <nmwg:metadata id="meta36">
>>       ...anything by contract of the service...
>>     </nmwg:metadata>
>>   </nmwg:data>
>> </nmwg:store>
>>
>>
>>
>> If the service needs to know more about the topologyElement/interface, it
>> should query the topology service with the URN. If the client needs a
>> service, it should query the LS with a URN. If the client has a set of
>> IPs, it should query the topology service to obtain information and URNs;
>> it can then use the URNs to find services that 'know' about the
>> IPs/topologyElements by querying the LS.
>>
>> Using the above structure we can _enforce_ supportedEventTypes and
>> relatedTopologyElements (might need a rename) being there when the
>> service registers. This is easier for all (LS and service developers).
>>
>> An added advantage is that because the structure is simpler, the
>> (X)queries are simpler and faster.
>>
>> IMO the metadata in the data tag should contain the least amount of data
>> to get by. I'm also against storing too much information in the LS about
>> network elements (unless it is a workaround for the topology service not
>> being available): the data will get outdated and might conflict with
>> other data being registered by other services. Storing only an id will
>> promote the single point of definition principle and avoid collisions.
>> An added advantage is that somewhere in the future we should be able to
>> configure services by just specifying the topology element URN, and they
>> will retrieve the rest of the data from the topology service. Domain
>> renames, node renames and changing IP addresses should end up requiring
>> very little effort, which ensures better availability of the perfSONAR
>> tools.
>>
>
>
> So am I correct to state that you are proposing the entire method for
> services registering to the LS change? I am not sure what prompted this or
> why you feel that
> it is necessary but the current method gives us everything we need to
> perform discovery and
> query.
>
> Changing the entire process at this stage (on all services) would be a
> disaster, especially when I do not see any clear benefits over the current
> system. The API we
> have offered was non-destructive in that it would not require any service
> modification to be
> effective.
>
> -jason
>
