
perfsonar-dev - nmwg: r285 - trunk/nmwg/doc/dLS

Subject: perfsonar development work

  • Subject: nmwg: r285 - trunk/nmwg/doc/dLS
  • Date: Mon, 8 Oct 2007 15:32:38 -0400

Author: zurawski
Date: 2007-10-08 15:32:37 -0400 (Mon, 08 Oct 2007)
New Revision: 285

Modified:
trunk/nmwg/doc/dLS/dLS.html
trunk/nmwg/doc/dLS/dLS.pdf
Log:
Updating output documents after recent updates.

-jason



Modified: trunk/nmwg/doc/dLS/dLS.html
===================================================================
--- trunk/nmwg/doc/dLS/dLS.html 2007-10-08 14:12:12 UTC (rev 284)
+++ trunk/nmwg/doc/dLS/dLS.html 2007-10-08 19:32:37 UTC (rev 285)
@@ -297,7 +297,7 @@
<link rel="Author" href="#rfc.authors">
<link rel="Copyright" href="#rfc.copyright">
<link rel="Chapter" title="1 Introduction" href="#rfc.section.1">
-<link rel="Chapter" title="2 system Specific Operation"
href="#rfc.section.2">
+<link rel="Chapter" title="2 System Specific Operation"
href="#rfc.section.2">
<link rel="Chapter" title="3 Bootstrapping" href="#rfc.section.3">
<link rel="Chapter" title="4 structures and Messages" href="#rfc.section.4">
<link rel="Chapter" title="5 Appendices" href="#rfc.section.5">
@@ -369,7 +369,8 @@
<h1 id="rfc.section.1" class="np">
<a href="#rfc.section.1">1.</a>&nbsp;<a id="intro"
href="#intro">Introduction</a>
</h1>
-<p id="rfc.section.1.p.1">This document describes the Distributed Lookup
Service in the perfSONAR (pS) system. This functionality extends the basic
Lookup Service (LS) functionality that has been present in the system for
some time. The basic LS supports the storing and querying of perfSONAR
Service information as well as metadata about data stored or gathered by a
particular pS service instance.</p>
+<p id="rfc.section.1.p.1">This document describes the Distributed Lookup
Service (dLS) in the perfSONAR (pS) system. This functionality extends the
basic Lookup Service (LS) functionality that has been present in the system
for some time. The basic LS supports the storing and querying of perfSONAR
Service information as well as metadata about data stored or gathered by a
particular pS service instance.</p>
+<p id="rfc.section.1.p.2">From the client's perspective, Lookup Service
operation involves registration, deregistration, querying and obtaining query
results. Clients want to discover the services that are running in the
network. The LS enables this by gathering information from the services and
then using it to fulfill client queries. The figure below presents basic LS
interactions.</p>
<div id="ls-op">
</div>
<div id="rfc.figure.1">
@@ -377,7 +378,7 @@
<p>LS Operation</p>
<pre>
_____ __________
-| | Register/De-register | Service |
+| | Register/De-register | |
| LS | &lt;------------------------&gt; | Service |
|_____| &lt;--------- |__________|
| _________________
@@ -388,29 +389,29 @@
</pre>
<p>Services interacting with an LS</p>
<p class="figure">Figure 1</p>
-<p id="rfc.section.1.p.3">This document describes the support necessary to
extend this service to a distributed mode of operation. There are a few key
facets of this mode of operation:</p>
+<p id="rfc.section.1.p.4">This document describes the support necessary to
extend basic Lookup Service to a distributed mode of operation. dLS reduces
overhead in a single service and maintains consistent information. There are
a few key facets of this mode of operation:</p>
<ul>
<li>Summarization - to reduce the amount of information sent over the
network or to anonymize sensitive data, some form of data reduction must take
place.</li>
<li>Scope - to enable a hierarchy of systems, some form of scoping must
exist that forms logical communication and data exchange channels.</li>
<li>Search - information location is key and the way in which distributed
location and search is handled is the crux of this service.</li>
</ul>
-<p id="rfc.section.1.p.4">Additionally we present solutions to issues
related to allow seamless operation of this service including bootstrapping
and domain specific concerns.</p>
+<p id="rfc.section.1.p.5">Additionally, we present solutions to issues
that affect seamless operation of this service, including bootstrapping
(i.e. how services find other parts of the system) and domain-specific
concerns.</p>
<hr class="noprint">
<h1 id="rfc.section.2" class="np">
-<a href="#rfc.section.2">2.</a>&nbsp;<a id="system" href="#system">system
Specific Operation</a>
+<a href="#rfc.section.2">2.</a>&nbsp;<a id="system" href="#system">System
Specific Operation</a>
</h1>
<h2 id="rfc.section.2.1">
<a href="#rfc.section.2.1">2.1</a>&nbsp;<a id="summary"
href="#summary">Summarization</a>
</h2>
-<p id="rfc.section.2.1.p.1">The first step of information flow is when a
service registers with an LS. The service may know the name of an LS via
static configuration (the most common case for legacy deployments), or other
forms of bootstrapping such as multicast may occur. A service registers a
"service metadata" record about itself and full metadata (i.e. containing all
information such as subject, eventType(s), and any parameters, see <a
href="#service-metadata" title="service metadata
example">Section&nbsp;4.1</a>) about stored data it has knowledge of. Such a
record is called Lookup Information (see <a href="#lookup-info" title="Lookup
Information">Section&nbsp;4.2</a>).</p>
+<p id="rfc.section.2.1.p.1">The first step of information flow is when a pS
service registers with an LS. The service may know the name of an LS via
static configuration (the most common case for legacy deployments), or other
forms of bootstrapping such as multicast may occur. A service registers a
"service metadata" record about itself and full metadata (i.e. containing all
information such as subject, eventType(s), and any parameters, see <a
href="#service-metadata" title="service metadata
example">Section&nbsp;4.1</a>) about stored data it has knowledge of. Such a
record is called Lookup Information (see <a href="#lookup-info" title="Lookup
Information">Section&nbsp;4.2</a>).</p>
<p id="rfc.section.2.1.p.2">The idea is to move the metadata from a local
XML data store to a specialized LS with additional searching capabilities.
While a service instance may support limited searching, this is not necessary
as they should be focused on storing or gathering data and leave the lookup
functionality to the LS. Possible exceptions are rapidly changing Metadata
like the most recent timestamp and full details of data stored in long-term
archival MAs.</p>
<p id="rfc.section.2.1.p.3">The LS that a service contacts to register
becomes the "Home LS" (HLS, see <a href="#glossary"
title="Glossary">Section&nbsp;5.1</a>) of that particular service. It is the
responsibility of the HLS to make summary data about the all of the pS
services it knows of available to the larger enterprise and to draw relevant
queries to itself.</p>
-<p id="rfc.section.2.1.p.4">The construction of such a summary is important
to the overall success of this service; summaries must be general enough to
allow for easy creation and exchange but also must retain enough information
to provide a rich query interface able to locate the distributed information.
We start by making an observation that summarization is best based on scope,
simply put this means that we should attempt to summarize the "most" the
"further" away from the source that we get. This creates a smaller data set
that travels the furthest away while keeping the larger and more informative
data sets closer to the source. We present the strategies as such: </p>
+<p id="rfc.section.2.1.p.4">The construction of such a summary is important
to the overall success of this service; summaries prevent other LS instances
from being overloaded by information, and they must be general enough to allow
for easy creation and exchange while retaining enough information to provide
a rich query interface able to locate the distributed information. That means
service metadata information must be filtered (summarized) as it propagates
through the LS cloud. We start by making an observation that summarization is
best based on scope (see also <a href="#scope"
title="Scope">Section&nbsp;2.2</a> for forming scope): simply put, the
further we get from the source, the more we should summarize. This creates a
smaller data set that travels the furthest away while keeping the larger and
more informative data sets closer to the source. We present the strategies as
such: </p>
<ul>
<li>Summarization for the "lower scope" (formerly known as "local
scope")</li>
<li>Summarization for the "upper scope" (formerly known as "global
scope")</li>
</ul>
-<p> We limit the discussion in this case to these two scopes, although
extension to "n" scopes is possible. As the number of of scopes increases
additional "aggregation" will be necessary to combine information thus
reducing the size of the data sets further.</p>
+<p> We limit the discussion in this case to these two scopes, although
extension to "n" scopes is possible. As the number of scopes increases
additional "aggregation" will be necessary to combine information thus
reducing the size of the data sets further.</p>
<h3 id="rfc.section.2.1.1">
<a href="#rfc.section.2.1.1">2.1.1</a>&nbsp;<a
id="lower_scope_summarization" href="#lower_scope_summarization">Lower Scope
Summarization</a>
</h3>
@@ -457,7 +458,7 @@
<h3 id="rfc.section.2.1.2">
<a href="#rfc.section.2.1.2">2.1.2</a>&nbsp;<a
id="upper_scope_summarization" href="#upper_scope_summarization">Upper Scope
Summarization</a>
</h3>
-<p id="rfc.section.2.1.2.p.1">A designated member of each lower ring will be
required to interact with the upper level. The mechanics of how we learn who
is the designated leader is discussed in <a href="#tokens" title="Token
Passing">Section&nbsp;2.2.2</a>. The leader of each group (and the designated
backup) will be responsible for examining each member's summary information
and building a summarization/aggregation that best describes the contents of
the ring.</p>
+<p id="rfc.section.2.1.2.p.1">A designated member of each lower ring will be
required to interact with the upper scope. The mechanics of how we learn who
is the designated leader is discussed in <a href="#tokens" title="Token
Passing">Section&nbsp;2.2.2</a>. The leader of each lower scope (and the
designated backup) will be responsible for examining each member's summary
information and building a summarization/aggregation that best describes the
contents of the ring.</p>
<p id="rfc.section.2.1.2.p.2">The most natural summarization is based on the
topology of the network (like in network routing). Thus, topology-based
summarization will include this information as well as eventType information
from the other LSs. Summarization will be performed using specialized summary
algorithm. Topology information such as IP addresses will be summarized using
algorithms basing on radix tree (see <a href="#IP-summary" title="IP
addresses summarization algorithm">Section&nbsp;2.1.2.1</a>).</p>
<p id="rfc.section.2.1.2.p.3">Other information can be summarized in a less
programmatic fashion through the use of either Extensible Stylesheet Language
Transformation (XSLT) documents or the XQuery language as discussed in the
previous section. These mechanisms will take into account the XML elements
that represent the network topology currently used in metadata subjects as
well as additional items such as eventTypes.</p>
<p id="rfc.section.2.1.2.p.4">The output of this process becomes a "service
summary" that represents a breadth of the original input. See <a
href="#LSControl-Summary-lower" title="LS Summary Message
(Lower)">Section&nbsp;4.6</a> or <a href="#LSControl-Summary-upper" title="LS
Summary Message (Upper)">Section&nbsp;4.7</a> for a mock-up of the summary
output. Additional transformations, while aggressive, will strive to preserve
as much information as possible to remain useful during the search
procedures.</p>
@@ -473,11 +474,12 @@
</ul>
<p id="rfc.section.2.1.2.1.p.3">Once constructed, it is possible to consult
the structure in creating IP summaries as well as constructing information
regarding netmasks.</p>
<h2 id="rfc.section.2.2">
-<a href="#rfc.section.2.2">2.2</a>&nbsp;<a id="scope" href="#scope">scope</a>
+<a href="#rfc.section.2.2">2.2</a>&nbsp;<a id="scope" href="#scope">Scope</a>
</h2>
-<p id="rfc.section.2.2.p.1">The next question is how to form the lower and
upper scopes. The simplest answer is that the lower scope be formed based on
the domain name of the participating systems. That would allow e.g.
internet2.edu, geant2.net, and pionier.gov.pl to potentially operate more
than one LS instance inside their own domains (for performance and
scalability.) As LS instances come online they will invoke bootstrapping
procedures to find and join a lower scoped group first.</p>
-<p id="rfc.section.2.2.p.2">The scopes should be named based on URIs. This
will allow a domain-level scope to take the form <a
href="http://internet2.edu";>http://internet2.edu</a>, with subdomain scopes
named <a href="http://internet2.edu/foo";>http://internet2.edu/foo</a>, etc.
The top-level scope can be called <a
href="http://perfsonar.net";>http://perfsonar.net</a> with potential for
geographic divisions later if necessary for performance (such as <a
href="http://eu.perfsonar.net";>http://eu.perfsonar.net</a>).</p>
-<p id="rfc.section.2.2.p.3">The major algorithms used to form and maintain
the ring structure of the dLS, no matter which scope we are talking about,
are as follows: </p>
+<p id="rfc.section.2.2.p.1">The architecture of the dLS protocol assumes the
existence of logical rings of LS instances. The current proposal involves two
levels of rings: lower scope and upper scope.</p>
+<p id="rfc.section.2.2.p.2">The next question is how to form the lower and
upper scopes. The simplest answer is that the lower scope be formed based on
the domain name of the participating systems. That would allow e.g.
internet2.edu, geant2.net, and pionier.gov.pl to potentially operate more
than one LS instance inside their own domains (for performance and
scalability.) As LS instances come online they will invoke bootstrapping
procedures to find and join a lower scoped group first.</p>
+<p id="rfc.section.2.2.p.3">The scopes should be named based on URIs. This
will allow a domain-level scope to take the form <a
href="http://internet2.edu";>http://internet2.edu</a>, with subdomain scopes
named <a href="http://internet2.edu/foo";>http://internet2.edu/foo</a>, etc.
The top-level scope can be called <a
href="http://perfsonar.net";>http://perfsonar.net</a> with potential for
geographic divisions later if necessary for performance (such as <a
href="http://eu.perfsonar.net";>http://eu.perfsonar.net</a>).</p>
+<p id="rfc.section.2.2.p.4">The major algorithms used to form and maintain
the ring structure of the dLS, no matter which scope we are talking about,
are as follows: </p>
<ul>
<li>Join Procedure</li>
<li>Token Passing</li>
@@ -508,7 +510,7 @@
| |
__V__ _____ _V___
| | | | (1) | |
-| LS3 | &lt;----------------- | LS2 | &lt;----------------&gt; | LS1 |
+| LS3 | &lt;----------------- | LS2 | &lt;________________&gt; | LS1 |
|_____| |_____| (2,3) |_____|
| ^ ^ ^ ^
| | | | |
@@ -521,7 +523,7 @@

</pre>
<p class="figure">Figure 3</p>
-<p> </p>
+<p id="rfc.section.2.2.1.1.p.2">Let's assume LS2, LS3 and LS4 are in
the ring. LS1 wants to join the dLS cloud.</p>
<dl class="empty">
<dd>1. LS1 - "candidate" sends LSControlRequest message with the
http://perfsonar.net/services/LS/join eventType to selected LS2 which is
already a member of the ring</dd>
<dd>2. LS2 receives join message from LS1 and decides whether to accept it or
not. A security policy may be applied here</dd>
@@ -538,7 +540,7 @@
<a href="#rfc.section.2.2.2">2.2.2</a>&nbsp;<a id="tokens"
href="#tokens">Token Passing</a>
</h3>
<p id="rfc.section.2.2.2.p.1">The "token" is a message (see <a
href="#LSControl-Token" title="LS Token Message">Section&nbsp;4.5</a>) meant
to be passed around an LSRing to the various members in some order. There are
various criteria that can be used in deciding how to order the ring so that
everyone can predict where the token is, when they might expect to get it,
and whom they should get it from and pass it to next. It is important that we
choose a sound method that is simple to calculate, and should use as much
"knowledge" of the ring as possible without burdening the LS instances too
much with complex calculations.</p>
-<p id="rfc.section.2.2.2.p.2">A common method used in P2P (see <a
href="#glossary" title="Glossary">Section&nbsp;5.1</a>) systems such as
Gnutella (see <a href="#glossary" title="Glossary">Section&nbsp;5.1</a>) when
forming "ultrapeers" is to consider the size of the data that a node is
serving. The principle, as described <a
href="http://rakjar.de/gnufu/index.php/GnuFU_en#Network_model:_Change_who_calls_whom:_Ultrapeers_and_Leafs";>here</a>,
alludes to the fact that nodes with less content to look after (i.e. less
services, or services with a smaller amount of data) can spend more time and
effort helping the enterprise as a whole by taking on additional roles (such
as serving as leaders). As such we will record the number of metadata
elements that register with each LS and share this with our friends in the
form of the "contentSize" parameter.</p>
+<p id="rfc.section.2.2.2.p.2">A common method used in P2P (see <a
href="#glossary" title="Glossary">Section&nbsp;5.1</a>) systems such as
Gnutella when forming "ultrapeers" is to consider the size of the data that a
node is serving. The principle, as described <a
href="http://rakjar.de/gnufu/index.php/GnuFU_en#Network_model:_Change_who_calls_whom:_Ultrapeers_and_Leafs";>here</a>,
alludes to the fact that nodes with less content to look after (i.e. less
services, or services with a smaller amount of data) can spend more time and
effort helping the enterprise as a whole by taking on additional roles (such
as serving as leaders). As such we will record the number of metadata
elements that register with each LS and share this with our friends in the
form of the "contentSize" parameter.</p>
<p id="rfc.section.2.2.2.p.3">Token passing is directly related to the
concept of leader election (see <a href="#Leader_Election" title="Leader
election">Section&nbsp;2.2.4</a>), so more explanation of this approach will
follow. For now we are justified in saying that the "contentSize" forms a
good criterion for "ordering" the members of the ring. With all members of
the ring aware of everyone's data size, we can easily know who we should pass
the token to, and receive it from at any point in time.</p>
<p id="rfc.section.2.2.2.p.4">The token can be viewed as "permission to
talk" and permits the holder to send its summary information to all
available LS instances (see <a href="#LSControl-Summary-lower" title="LS
Summary Message (Lower)">Section&nbsp;4.6</a> and <a
href="#LSControl-Summary-upper" title="LS Summary Message
(Upper)">Section&nbsp;4.7</a>). The responses will be parsed to get any
useful updated information.</p>
<p id="rfc.section.2.2.2.p.5">The holder of the token, after completing
summarization, will wait some pre-determined amount of time before sending
the token to the next LS instance. In general the LS instances should not be
overly sensitive to the progression of the token. If each LS instance is
monitoring the progress, and for some reason we have lost the token it may
start a flurry of retransmits and drops that will take cycles to calm down
again.</p>
@@ -546,36 +548,13 @@
<h4 id="rfc.section.2.2.2.1">
<a href="#rfc.section.2.2.2.1">2.2.2.1</a>&nbsp;<a
id="token_passing_algorithm" href="#token_passing_algorithm">Token passing
algorithm</a>
</h4>
-<p id="rfc.section.2.2.2.1.p.1">
-</p>
-<div id="token-example">
-</div>
-<div id="rfc.figure.4">
-</div>
-<p>Illustration of LS Token Passing</p>
-<pre>
-
- _____ _____
-| | | |
-| LS1 | &lt;----------------- | LS2 |
-|_____| |_____|
- | ^
- | |
- | _____ |
- | | | |
- |-------&gt; | LS3 | ---------|
- |_____|
-
-
- </pre>
-<p class="figure">Figure 4</p>
-<p> </p>
+<p id="rfc.section.2.2.2.1.p.1">The algorithm for token passing works as
follows. </p>
<dl class="empty">
<dd>0. When any LS receives the token (LSControlRequest message with the
http://perfsonar.net/services/LS/token/... eventType), it will do the
following:</dd>
-<dd>1.Update local peer list (from token)</dd>
-<dd>2.Send summary to all peers in the lease (excluding itself)</dd>
-<dd>3.Wait for some amount of time</dd>
-<dd>4.Send token to next peer from the list (if it fails, try next one)</dd>
+<dd>1. Update local peer list (from token)</dd>
+<dd>2. Send summary to all peers in the lease (excluding itself)</dd>
+<dd>3. Wait for some amount of time</dd>
+<dd>4. Send token to next peer from the list (if it fails, try next one)</dd>
</dl>
<h3 id="rfc.section.2.2.3">
<a href="#rfc.section.2.2.3">2.2.3</a>&nbsp;<a id="summary-blast"
href="#summary-blast">summarization Notification</a>
@@ -1353,19 +1332,23 @@
<a href="#rfc.section.5.1">5.1</a>&nbsp;<a id="glossary"
href="#glossary">Glossary</a>
</h2>
<ul>
-<li>Service - A Service is an application that communicates with other
perfSONAR Services via standardized protocol set (SOAP+XML+NMWGv2)</li>
+<li>AuthoritativeLS -</li>
+<li>Berkeley DB XML - Oracle Berkeley DB XML is an open source, embeddable
XML database with XQuery-based access to documents stored in containers and
indexed based on their content.</li>
+<li>Bootstrapping -</li>
+<li>eXist XML DB - eXist is an Open Source native XML database featuring
efficient, index-based XQuery processing, automatic indexing, extensions for
full-text search, XUpdate support, XQuery update extensions and tight
integration with existing XML development tools.</li>
+<li>Home LS (HLS) - The Home LS of a Service is the LS where the Service
registers its Lookup Information</li>
<li>Lookup Service (LS) - The Lookup Service is a key element of the
perfSONAR framework because it allows every independent service to be a
visible part of the system. New services may identify themselves to the
community and provide their detailed capabilities description. Other services
are able to communicate to the LS in order to get this data called Lookup
Information. Basic Lookup Service supports registration, query, keepalives
and de-registration actions (additionally updates?).</li>
<li>Lookup Information - information registered by a Service in the Lookup
Service</li>
-<li>Summary Information - aggregated information from Lookup Information
that is sent by one LS to another</li>
+<li>Lower Scope - The scoping paradigm meant to indicate intra-domain
relationships.</li>
+<li>LSRing -</li>
<li>Multidomain / Distributed Lookup Information (mLS) - Lookup Service
which supports summarization and communication with other Lookup Services
(which might be in the same domain...)</li>
-<li>Home LS (HLS) - The Home LS of a Service is the LS where the Service
registers its Lookup Information</li>
+<li>P2P -</li>
+<li>Service - A Service is an application that communicates with other
perfSONAR Services via standardized protocol set (SOAP+XML+NMWGv2)</li>
+<li>Summary Information - aggregated information from Lookup Information
that is sent by one LS to another</li>
+<li>Token Ring - A ring network in which the network topology features nodes
connected to exactly two other nodes, forming a circular pathway for signals:
a ring. Data travels from node to node, with each node handling every packet.
We use a logical ring in which a "token" message is used to synchronize the
communication among the nodes.</li>
<li>Upper (Global) Scope - The scoping paradigm meant to indicate
inter-domain relationships.</li>
-<li>Lower (Local) Scope - The scoping paradigm meant to indicate
inter-domain relationships.</li>
<li>XSLT - Extensible Stylesheet Language Transformations is an XML-based
language used for the transformation of XML documents into other XML or
"human-readable" documents. The original document is not changed; rather, a
new document is created based on the content of an existing one.</li>
<li>XQuery - A query language (with some programming language features) that
is designed to query collections of XML data. It is semantically similar to
SQL.</li>
-<li>Token Ring - A ring network in which the network topology features nodes
connected to exactly two other nodes, forming a circular pathway for signals:
a ring. Data travels from node to node, with each node handling every packet.
We use a logical ring in which a "token" message is used to synchronize the
communication among the nodes.</li>
-<li>Berkeley DB XML - Oracle Berkeley DB XML is an open source, embeddable
XML database with XQuery-based access to documents stored in containers and
indexed based on their content.</li>
-<li>eXist XML DB - eXist is an Open Source native XML database featuring
efficient, index-based XQuery processing, automatic indexing, extensions for
full-text search, XUpdate support, XQuery update extensions and tight
integration with existing XML development tools.</li>
</ul>
<h1 class="np" id="rfc.references">
<a href="#rfc.section.6">6.</a> References</h1>

Modified: trunk/nmwg/doc/dLS/dLS.pdf
===================================================================
(Binary files differ)
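
For readers of this diff, the token passing steps in Section 2.2.2.1 of dLS.html (update the peer list, send the summary to the other peers, wait, pass the token on and fall through to the next peer on failure) can be sketched roughly as follows. This is only an illustration and not code from the perfSONAR tree: the Peer class, the callback names and the choice to order the ring by ascending contentSize with the name as a tie-breaker are all assumptions, since the document only says that contentSize is used for ordering.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Peer:
    """Hypothetical view of one LS instance as carried in the token."""
    name: str
    content_size: int  # number of registered metadata elements ("contentSize")


def ring_order(peers: List[Peer]) -> List[Peer]:
    # Assumption: ascending contentSize, name as a deterministic tie-breaker.
    return sorted(peers, key=lambda p: (p.content_size, p.name))


def successors(order: List[Peer], me: str) -> List[str]:
    # Peers that follow `me` in ring order, wrapping around, excluding `me`.
    names = [p.name for p in order]
    i = names.index(me)
    return names[i + 1:] + names[:i]


def handle_token(
    me: str,
    token_peers: List[Peer],
    send_summary: Callable[[str], object],
    send_token: Callable[[str], bool],
    wait: Callable[[], None] = lambda: None,
) -> Optional[str]:
    # Step 1: update the local peer list from the token.
    order = ring_order(token_peers)
    # Step 2: send the summary to every peer except ourselves.
    for peer in order:
        if peer.name != me:
            send_summary(peer.name)
    # Step 3: wait for some (caller-defined) amount of time.
    wait()
    # Step 4: pass the token to the next peer; if that fails, try the next one.
    for candidate in successors(order, me):
        if send_token(candidate):
            return candidate
    return None  # nobody accepted the token
```

With three peers of contentSize 30 (LS1), 10 (LS2) and 20 (LS3), the assumed ring order is LS2, LS3, LS1, so LS3 would first try LS1 and then fall back to LS2. The send_summary and send_token callbacks stand in for the real LSControlRequest messages, which are not modelled here.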


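Section 2.1.2 of dLS.html says that IP addresses will be summarized with an algorithm based on a radix tree. The snippet below is not that algorithm. It is a simpler stand-in that uses Python's standard ipaddress module to find the smallest single network covering a set of addresses, which is the kind of summary the longest common prefix in such a tree would yield. The function name is hypothetical, and the input is assumed to be a non-empty list of addresses from one address family.

```python
import ipaddress


def summarize_addresses(addresses):
    """Return the smallest network containing every address.

    Assumes a non-empty list of addresses from a single family
    (all IPv4 or all IPv6).
    """
    ips = [ipaddress.ip_address(a) for a in addresses]
    first = ips[0]
    # Start from the host network of the first address (/32 or /128).
    net = ipaddress.ip_network(f"{first}/{first.max_prefixlen}")
    # Widen the prefix one bit at a time until all addresses fit.
    while not all(ip in net for ip in ips):
        net = net.supernet()
    return net
```

For example, 192.168.1.10 and 192.168.1.200 differ in the first bit of the last octet, so the covering network is 192.168.1.0/24. An upper scope summary could then carry that one prefix instead of the individual addresses.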
