
perfsonar-dev - Re: data/metadata relationships

Subject: perfsonar development work

  • From: "Jeff W. Boote" <>
  • To: Maciej Glowiak <>
  • Cc:
  • Subject: Re: data/metadata relationships
  • Date: Wed, 02 Aug 2006 10:33:10 -0600

Maciej Glowiak wrote:
Jeff,

The solution with _number has been used for more than 2 months.

I can see that it has been used by you. That is not the point.

The point is that I am concerned that this situation can cause confusion for clients. I asked everyone - most especially client developers - if having multiple references to the same information was confusing.

The value of a community software project is community review. I did not say what you had done was bad. I specifically said I was concerned with one of the implications. If others are not concerned, I will drop the issue. But I want to make sure others (especially client developers, who are under-represented in our development group) understand the implications. Don't you want the services to be as usable by clients as possible?

1. As you remember, we didn't have a generic Message Handler, so everybody wrote his own MH. I had to write one for the Lookup Service and tried to make it quite generic. Then Roman used it for his MAs, but of course everybody may use his own MH for his own service.

I just asked if anyone else was concerned that the data/metadata relationships were not being preserved in this methodology. If no one else is concerned, I'll drop the issue. The fact that others are copying your work is all the more reason to discuss the implications, is it not?

2. Your case with one common metadata:

--------------------------------------------------
<message>

<metadata id="a"/>
<metadata id="b" metadataIdRef="a"/>
<metadata id="c" metadataIdRef="a"/>

<data metadataIdRef="b"/>
<data metadataIdRef="c"/>

</message>
--------------------------------------------------

must be split into two sub-requests if we want to run the Service Engine
separately for each data trigger. This works now for both MAs and for
the LS (except LSRegister, which has its own simple Message Handler).
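
The split described above can be sketched as follows. This is a minimal illustration only, not the actual Message Handler code: the function name split_by_data_trigger is hypothetical, namespaces are omitted, and the input assumes that metadata "c" refers to the common metadata "a".

```python
import xml.etree.ElementTree as ET

# Hypothetical request: two data triggers sharing the common metadata "a".
MESSAGE = (
    '<message>'
    '<metadata id="a"/>'
    '<metadata id="b" metadataIdRef="a"/>'
    '<metadata id="c" metadataIdRef="a"/>'
    '<data metadataIdRef="b"/>'
    '<data metadataIdRef="c"/>'
    '</message>'
)

def split_by_data_trigger(message_xml):
    """Return one sub-request <message> element per <data> trigger."""
    root = ET.fromstring(message_xml)
    metadata = {m.get("id"): m for m in root.findall("metadata")}
    sub_requests = []
    for data in root.findall("data"):
        chain = []
        seen = set()
        ref = data.get("metadataIdRef")
        # Follow the metadataIdRef chain; 'seen' guards against reference loops.
        while ref in metadata and ref not in seen:
            seen.add(ref)
            chain.append(metadata[ref])
            ref = metadata[ref].get("metadataIdRef")
        sub = ET.Element("message")
        # Emit referenced metadata before the metadata that refer to them.
        sub.extend(chain[::-1])
        sub.append(data)
        sub_requests.append(sub)
    return sub_requests

for sub in split_by_data_trigger(MESSAGE):
    print(ET.tostring(sub, encoding="unicode"))
```

Each sub-request carries the whole metadata chain for its data trigger, so the common metadata is copied into every sub-request.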

3. The service output is not divided into pieces, so the request message
from pt. 2 will cause the Service Engine to run twice:

a) request to SE
--------------------------------------------------
<message>
<metadata id="a"/>
<metadata id="b" metadataIdRef="a"/>
<data metadataIdRef="b"/>
</message>
--------------------------------------------------

b) request to SE
--------------------------------------------------
<message>
<metadata id="a"/>
<metadata id="c" metadataIdRef="a"/>
<data metadataIdRef="c"/>
</message>
--------------------------------------------------

and the Service Engine will return similar sets of metadata and data
elements with the same identifiers, for instance:

a) response from SE
--------------------------------------------------
<message>
<metadata id="X"/>
<data metadataIdRef="X"/>
</message>
--------------------------------------------------

b) response from SE
--------------------------------------------------
<message>
<metadata id="X"/>
<data metadataIdRef="X"/>
</message>
--------------------------------------------------

My point was asking what happens if the response from the SE is:

a)
<message>
<metadata id="X"/>
<metadata id="Y" metadataIdRef="X"/>
<data id="1" metadataIdRef="Y"/>
</message>

b)
<message>
<metadata id="X"/>
<metadata id="Z" metadataIdRef="X"/>
<data id="2" metadataIdRef="Z"/>
</message>

Specifically - imagine that the only thing in metadata Y,Z is the time selection. Don't you think the client would want to know that the two sets of data are in fact about the same 'interface'? Without having to look at each and every parameter in the metadata?
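
To illustrate what a client gains from preserved references, here is a small sketch of how it could find the top-level metadata behind each data element by following metadataIdRef, instead of comparing every parameter. The helper names (root_metadata_id, data_roots) are hypothetical, namespaces are omitted, and the comparison across responses is only meaningful under the assumption that the SE uses the same id for the same metadata in both responses.

```python
import xml.etree.ElementTree as ET

# The two SE responses from the example above, written compactly.
RESPONSE_A = (
    '<message><metadata id="X"/><metadata id="Y" metadataIdRef="X"/>'
    '<data id="1" metadataIdRef="Y"/></message>'
)
RESPONSE_B = (
    '<message><metadata id="X"/><metadata id="Z" metadataIdRef="X"/>'
    '<data id="2" metadataIdRef="Z"/></message>'
)

def root_metadata_id(metadata, ref):
    """Follow metadataIdRef links from 'ref' up to the top of the chain."""
    seen = {ref}
    while True:
        parent = metadata[ref].get("metadataIdRef") if ref in metadata else None
        # Stop at the top of the chain, at a dangling reference, or at a loop.
        if parent is None or parent not in metadata or parent in seen:
            return ref
        seen.add(parent)
        ref = parent

def data_roots(message_xml):
    """Map each data id to the id of the top-level metadata it hangs off."""
    root = ET.fromstring(message_xml)
    metadata = {m.get("id"): m for m in root.findall("metadata")}
    return {
        d.get("id"): root_metadata_id(metadata, d.get("metadataIdRef"))
        for d in root.findall("data")
    }

roots = {**data_roots(RESPONSE_A), **data_roots(RESPONSE_B)}
print(roots)
```

Both data elements resolve to the same top-level id, which is the relationship that gets lost if the identifiers are rewritten per sub-request.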

Now, how do you combine these two into one response message? By
changing identifiers...

So the response message will look like:

--------------------------------------------------
<message>
<metadata id="X_1"/>
<data metadataIdRef="X_1"/>

<metadata id="X_2"/>
<data metadataIdRef="X_2"/>
</message>
--------------------------------------------------
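
The renaming shown above can be sketched roughly as follows. This is an illustration of the idea only, not the Message Handler's actual code: the function name merge_with_suffix is hypothetical and namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# Two SE responses that happen to use the same identifier "X".
RESPONSES = [
    '<message><metadata id="X"/><data metadataIdRef="X"/></message>',
    '<message><metadata id="X"/><data metadataIdRef="X"/></message>',
]

def merge_with_suffix(responses):
    """Combine SE responses, appending _<n> to every id and reference."""
    merged = ET.Element("message")
    for n, xml_text in enumerate(responses, start=1):
        for elem in ET.fromstring(xml_text):
            for attr in ("id", "metadataIdRef"):
                if elem.get(attr) is not None:
                    elem.set(attr, f"{elem.get(attr)}_{n}")
            merged.append(elem)
    return merged

print(ET.tostring(merge_with_suffix(RESPONSES), encoding="unicode"))
```

The suffix guarantees that identifiers do not collide, but two metadata elements that were the same "X" in the SE now appear as unrelated "X_1" and "X_2".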

Alternatively, for the case I show above, if SE's are required to maintain consistent and unique id references, you could return:

<message>
<metadata id="X"/>
<metadata id="Y" metadataIdRef="X"/>
<metadata id="Z" metadataIdRef="X"/>
<data id="1" metadataIdRef="Y"/>
<data id="2" metadataIdRef="Z"/>
</message>

The message handler would be able to determine that the metadata "X" was returned by both calls to the service engine because it is keeping track of the metadata it will return in a hash table. The duplicate could be ignored and the message output at the end. (Coincidentally, the metadata are already held in a HashMap in the Message class - so this is pretty much done. I have not tested, but it should "just work".)

And as I said, it works fine. Of course the Message Handler could be more
intelligent, but implementing that would take a big effort, and the final
result might still not be satisfactory.

The intelligence I suggest in the message handler is only a hash of metadata id -> metadata element. This would ensure no duplicates are put in the outgoing message. This is trivial (and done in the Message class already).
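
A minimal sketch of that hash-based merge, written in Python rather than against the real Message class: the function name merge_dedup is hypothetical, namespaces are omitted, and the sketch assumes that two metadata elements with the same id really are the same metadata (which is exactly the consistency requirement placed on the SE).

```python
import xml.etree.ElementTree as ET

# The two SE responses from the earlier example, written compactly.
RESPONSES = [
    '<message><metadata id="X"/><metadata id="Y" metadataIdRef="X"/>'
    '<data id="1" metadataIdRef="Y"/></message>',
    '<message><metadata id="X"/><metadata id="Z" metadataIdRef="X"/>'
    '<data id="2" metadataIdRef="Z"/></message>',
]

def merge_dedup(responses):
    """Combine SE responses, keeping one metadata element per id."""
    metadata = {}  # metadata id -> element; dicts preserve insertion order
    data = []
    for xml_text in responses:
        root = ET.fromstring(xml_text)
        for m in root.findall("metadata"):
            # Keep the first element seen for an id; later duplicates are ignored.
            metadata.setdefault(m.get("id"), m)
        data.extend(root.findall("data"))
    merged = ET.Element("message")
    merged.extend(list(metadata.values()))
    merged.extend(data)
    return merged

print(ET.tostring(merge_dedup(RESPONSES), encoding="unicode"))
```

No identifiers are rewritten, so both data elements still lead back to the single shared metadata "X".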

The question I asked was how difficult it would be to maintain consistency and uniqueness of metadata id's in the service engine. That is where any additional complexity would be. But, for most SE's I would imagine this would not be too difficult. For the LS I'm sure XML DB's have auto generated ID's for individual elements in the database that could be used for the id. For the RRD MA this would just be an id assigned when the config file is loaded (can be automatic for the XML DB method I believe).

We have often discussed the id's and references and questioned how unique they needed to be. I'm just suggesting reasons to make them very unique (and stable).

jeff


