perfsonar-dev - Re: [pS-dev] Re: data/metadata relationships

Subject: perfsonar development work

From: "Vedrin Jeliazkov"
To: "Jeff W. Boote", "Maciej Glowiak"
Subject: Re: [pS-dev] Re: data/metadata relationships
Date: Mon, 07 Aug 2006 19:12:13 +0300

Hello Jeff, Maciej, All,

"Jeff W. Boote" wrote:

<snip>

> The point is that I am concerned that this situation can cause confusion for
> clients. I asked everyone - most especially client developers - if having
> multiple references to the same information was confusing.

So far we haven't encountered problems with multiple references to the same
information. However, we do have a problem with missing explicit relations
between related data, so I'm using the opportunity to discuss it. Let me use
interface utilization as an example. For a given interface we receive two
unrelated XML constructs, one for the ingress and one for the egress direction.
There is no explicit indication that they correspond to the same interface.
Currently we deduce this grouping by comparing interface attributes (hostName,
ifName, ifDescription, etc.), but this is not fail-safe, in particular when
those attributes are missing or their combination is not unique for a given
interface. Inferring the relation from file names doesn't work either, because
in some cases the related info is stored in different files (or database
fields). We've also considered sending aggregate requests for different
metrics of a given endpoint, but this solution would impose some undesirable
constraints on the messaging and is not generic enough. Of course, in the case
of utilization RRD MAs the problem could be avoided by making sure that the
configured combinations of interface attributes are unique, but we feel that
this might not be the best approach.

Here is an example SetupDataResponse, illustrating the problem:

2006-08-07 17:43:07,390 [Thread-2] DEBUG org.perfsonar.client.ma.MARequest2 -
<?xml version="1.0" encoding="UTF-8"?>
<nmwg:message xmlns:nmwg="http://ggf.org/ns/nmwg/base/2.0/"
id="localhost.-5e1feaef:10ce6c59f5e:-412d">
<nmwg:metadata id="meta1">
<netutil:subject
xmlns:netutil="http://ggf.org/ns/nmwg/characteristic/utilization/2.0/"
id="subj1">
<nmwgt:interface xmlns:nmwgt="http://ggf.org/ns/nmwg/topology/2.0/">
<nmwgt:hostName>PoP-SOF</nmwgt:hostName>
<nmwgt:ifName>Fa0/0</nmwgt:ifName>
<nmwgt:ifDescription>SEEREN-SOF==ISTF-SOF(1)</nmwgt:ifDescription>
<nmwgt:ifAddress type="ipv4">194.141.252.2</nmwgt:ifAddress>
<nmwgt:direction>out</nmwgt:direction>
<nmwgt:capacity>100000000</nmwgt:capacity>
</nmwgt:interface>
</netutil:subject>
<nmwg:eventType>utilization</nmwg:eventType>
</nmwg:metadata>
<nmwg:metadata id="meta2">
<netutil:subject
xmlns:netutil="http://ggf.org/ns/nmwg/characteristic/utilization/2.0/"
id="subj2">
<nmwgt:interface xmlns:nmwgt="http://ggf.org/ns/nmwg/topology/2.0/">
<nmwgt:hostName>PoP-SOF</nmwgt:hostName>
<nmwgt:ifName>Fa0/0</nmwgt:ifName>
<nmwgt:ifDescription>SEEREN-SOF==ISTF-SOF(1)</nmwgt:ifDescription>
<nmwgt:ifAddress type="ipv4">194.141.252.2</nmwgt:ifAddress>
<nmwgt:direction>in</nmwgt:direction>
<nmwgt:capacity>100000000</nmwgt:capacity>
</nmwgt:interface>
</netutil:subject>
<nmwg:eventType>utilization</nmwg:eventType>
</nmwg:metadata>
<nmwg:data id="data2" metadataIdRef="meta2">
<nmwg:key id="localhost.-5e1feaef:10ce6c59f5e:-4130">
<nmwg:parameters id="param2">
<nmwg:parameter name="dataSource">traffic_in</nmwg:parameter>
<nmwg:parameter
name="file">/var/db/rra/backbone_traffic_in_9.rrd</nmwg:parameter>
</nmwg:parameters>
</nmwg:key>
</nmwg:data>
<nmwg:data id="data1" metadataIdRef="meta1">
<nmwg:key id="localhost.-5e1feaef:10ce6c59f5e:-413c">
<nmwg:parameters id="param1">
<nmwg:parameter name="dataSource">traffic_out</nmwg:parameter>
<nmwg:parameter
name="file">/var/db/rra/backbone_traffic_in_9.rrd</nmwg:parameter>
</nmwg:parameters>
</nmwg:key>
</nmwg:data>
</nmwg:message>

Now imagine that you have another interface configured with the same
attributes: the client would have no way to distinguish between the two
interfaces and their respective directions, and would either behave
unpredictably or return a mismatch error message that end users cannot act
upon. The same holds when there are more metrics for a given interface (or
other endpoint). Our feeling is that the messaging protocol should provide
explicit information about the relationship between a group of metrics and a
given endpoint, rather than expecting clients to deduce it. In summary, we
would prefer to know explicitly that some traffic_in, traffic_out, errors_in,
errors_out, drops_in, drops_out, etc., are related to a particular interface.
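
To make the failure mode concrete, here is a minimal sketch of the
attribute-comparison heuristic described above. It is not the actual client
code: the function names, the choice of compared fields and the simplified
sample message (no netutil:subject wrapper, and a hypothetical second
interface with identical attributes) are all assumptions made for
illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NMWG = "http://ggf.org/ns/nmwg/base/2.0/"
NMWGT = "http://ggf.org/ns/nmwg/topology/2.0/"

# Attributes compared by the heuristic; this selection is an assumption.
KEY_FIELDS = ("hostName", "ifName", "ifDescription", "ifAddress")


def group_by_interface(message_xml):
    """Map each attribute tuple to the (metadata id, direction) pairs sharing it."""
    root = ET.fromstring(message_xml)
    groups = defaultdict(list)
    for meta in root.iter(f"{{{NMWG}}}metadata"):
        iface = meta.find(f".//{{{NMWGT}}}interface")
        if iface is None:
            continue
        # Missing attributes become None, which makes collisions more likely.
        key = tuple(iface.findtext(f"{{{NMWGT}}}{name}") for name in KEY_FIELDS)
        direction = iface.findtext(f"{{{NMWGT}}}direction")
        groups[key].append((meta.get("id"), direction))
    return dict(groups)


def ambiguous_groups(groups):
    """Return the keys under which the same direction appears more than once."""
    bad = []
    for key, members in groups.items():
        directions = [direction for _, direction in members]
        if len(directions) != len(set(directions)):
            bad.append(key)
    return bad


def _metadata(meta_id, direction):
    # Hypothetical sample data: every interface carries identical attributes.
    return f"""
  <nmwg:metadata id="{meta_id}">
    <nmwgt:interface>
      <nmwgt:hostName>PoP-SOF</nmwgt:hostName>
      <nmwgt:ifName>Fa0/0</nmwgt:ifName>
      <nmwgt:direction>{direction}</nmwgt:direction>
    </nmwgt:interface>
  </nmwg:metadata>"""


SAMPLE = (
    f'<nmwg:message xmlns:nmwg="{NMWG}" xmlns:nmwgt="{NMWGT}">'
    + _metadata("meta1", "out") + _metadata("meta2", "in")
    + _metadata("meta3", "out") + _metadata("meta4", "in")
    + "</nmwg:message>"
)

if __name__ == "__main__":
    groups = group_by_interface(SAMPLE)
    for key in ambiguous_groups(groups):
        print("cannot pair directions for", key, "->", groups[key])
```

All four metadata elements end up under one key, so the client can detect
that something is wrong but cannot tell which "in" belongs to which "out".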

<snip>

> My point was asking what happens if the response from the SE is:
>
> a)
> <message>
> <metadata id="X"/>
> <metadata id="Y" metadataIdRef="X"/>
> <data id="1" metadataIdRef="Y"/>
> </message>
>
> b)
> <message>
> <metadata id="X"/>
> <metadata id="Z" metadataIdRef="X"/>
> <data id="2" metadataIdRef="Z"/>
> </message>
>
> Specifically - imagine that the only thing in metadata Y,Z is the time
> selection. Don't you think the client would want to know that the two sets of
> data are in fact about the same 'interface'? Without having to look at each
> and every parameter in the metadata?

Yes, our feeling is that in some cases this would be beneficial and in others
a must, especially if we want to avoid inconsistencies in data presentation.

<snip>

> Alternatively, for the case I show above, if SE's are required to maintain
> consistent and unique id references, you could return:
>
> <message>
> <metadata id="X"/>
> <metadata id="Y" metadataIdRef="X"/>
> <metadata id="Z" metadataIdRef="X"/>
> <data id="1" metadataIdRef="Y"/>
> <data id="2" metadataIdRef="Z"/>
> </message>
>
> The message handler would be able to determine that the metadata "X" was
> returned by both calls to the service engine because it is keeping track of
> the metadata it will return in a hash table. The duplicate could be ignored
> and the message output at the end. (Coincidentally, the metadata are already
> held in a HashMap in the Message class - so this is pretty much done. I have
> not tested, but it should "just work".)

Well, it looks like the alternative suggested above would solve our problem.
Please note that our remarks are relevant not only to the messaging protocol,
but to the contents of the config file as well.
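
For illustration, here is a rough sketch of how a client could use such
consistent ids. It merges the two responses a) and b) quoted earlier, keeps
one copy of each metadata keyed by id (mirroring the hash table idea), and
then follows metadataIdRef chains to group the data by their common root.
The function names are hypothetical, and the sketch assumes that ids really
are unique and consistent across responses, which is exactly the requirement
under discussion.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RESPONSE_A = """<message>
  <metadata id="X"/>
  <metadata id="Y" metadataIdRef="X"/>
  <data id="1" metadataIdRef="Y"/>
</message>"""

RESPONSE_B = """<message>
  <metadata id="X"/>
  <metadata id="Z" metadataIdRef="X"/>
  <data id="2" metadataIdRef="Z"/>
</message>"""


def merge_responses(*responses):
    """Collect metadata by id (duplicates ignored) and all data references."""
    parents = {}    # metadata id -> metadataIdRef (None for a root metadata)
    data_refs = {}  # data id -> metadataIdRef
    for xml_text in responses:
        root = ET.fromstring(xml_text)
        for meta in root.iter("metadata"):
            # setdefault keeps the first copy, so a repeated "X" is ignored.
            parents.setdefault(meta.get("id"), meta.get("metadataIdRef"))
        for data in root.iter("data"):
            data_refs[data.get("id")] = data.get("metadataIdRef")
    return parents, data_refs


def root_of(meta_id, parents):
    """Follow metadataIdRef links until a metadata without a reference."""
    seen = set()
    while parents.get(meta_id) is not None:
        if meta_id in seen:
            raise ValueError(f"metadataIdRef cycle at {meta_id!r}")
        seen.add(meta_id)
        meta_id = parents[meta_id]
    return meta_id


def group_data_by_root(parents, data_refs):
    """Group data ids under the root metadata they ultimately refer to."""
    groups = defaultdict(list)
    for data_id, ref in data_refs.items():
        groups[root_of(ref, parents)].append(data_id)
    return dict(groups)


if __name__ == "__main__":
    parents, data_refs = merge_responses(RESPONSE_A, RESPONSE_B)
    print(group_data_by_root(parents, data_refs))  # {'X': ['1', '2']}
```

With this structure the client learns that data "1" and "2" describe the
same subject without comparing any interface attributes.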

<snip>

Kind regards,
Vedrin

