perfsonar-dev - nmwg: r309 - trunk/nmwg/doc/dLS
Subject: perfsonar development work
- Subject: nmwg: r309 - trunk/nmwg/doc/dLS
- Date: Fri, 14 Dec 2007 07:22:25 -0500
Author: zurawski
Date: 2007-12-14 07:22:22 -0500 (Fri, 14 Dec 2007)
New Revision: 309
Removed:
trunk/nmwg/doc/dLS/LSRing.xml
Modified:
trunk/nmwg/doc/dLS/LSControl-LeaderRequest.xml
trunk/nmwg/doc/dLS/LSControl-SummaryRequest.xml
trunk/nmwg/doc/dLS/LSControl-TokenRequest.xml
trunk/nmwg/doc/dLS/LSControl-TokenResponse.xml
trunk/nmwg/doc/dLS/LSRing-lower.xml
trunk/nmwg/doc/dLS/LSRing-upper.xml
trunk/nmwg/doc/dLS/dLS.html
trunk/nmwg/doc/dLS/dLS.pdf
trunk/nmwg/doc/dLS/dLS.xml
Log:
Fixing some 'in progress comments' in the document (eliminating many due
to the answers being present now), adjusting the language in some of the
algorithms to better reflect reality.
Language cleanups and flow adjustments.
Fixing several examples.
-jason
Modified: trunk/nmwg/doc/dLS/LSControl-LeaderRequest.xml
===================================================================
--- trunk/nmwg/doc/dLS/LSControl-LeaderRequest.xml 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/LSControl-LeaderRequest.xml 2007-12-14 12:22:22 UTC (rev 309)
@@ -34,28 +34,14 @@
<nmwg:metadata>
<summary:subject xmlns:summary="http://ggf.org/ns/nmwg/summary/2.0/">
<nmtl3:network>
- <nmtl3:subnet>128.4.10.0</nmtl3:subnet>
- <nmtl3:netmask>255.255.255.0</nmtl3:netmask>
- <nmtl3:asn>666</nmtl3:asn>
+ <nmtl3:ipAddress>128.4.10.0/16</nmtl3:ipAddress>
</nmtl3:network>
</summary:subject>
<nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
</nmwg:metadata>
-
- <nmwg:metadata>
- <nmtopo:subject>
- <nmtopo:node>
- <nmtopo:location>
- <nmtopo:country>USA</nmtopo:country>
- </nmtopo:location>
- </nmtopo:node>
- </nmtopo:subject>
- <nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
- <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
- <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
- </nmwg:metadata>
+
</nmwg:data>
<!-- we could go on and on like that... -->
Modified: trunk/nmwg/doc/dLS/LSControl-SummaryRequest.xml
===================================================================
--- trunk/nmwg/doc/dLS/LSControl-SummaryRequest.xml 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/LSControl-SummaryRequest.xml 2007-12-14 12:22:22 UTC (rev 309)
@@ -17,8 +17,6 @@
<summary:subject xmlns:summary="http://ggf.org/ns/nmwg/summary/2.0/">
<nmtl3:network>
<nmtl3:ipAddress>128.4.10.0/16</nmtl3:ipAddress>
- <!-- Optional ASN -->
- <nmtl3:asn>666</nmtl3:asn>
</nmtl3:network>
</summary:subject>
<nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
@@ -26,18 +24,6 @@
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
</nmwg:metadata>
- <nmwg:metadata>
- <nmtopo:subject>
- <nmtopo:node>
- <nmtopo:location>
- <nmtopo:country>USA</nmtopo:country>
- </nmtopo:location>
- </nmtopo:node>
- </nmtopo:subject>
- <nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
- <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
- <nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
- </nmwg:metadata>
</nmwg:data>
</nmwg:message>
Modified: trunk/nmwg/doc/dLS/LSControl-TokenRequest.xml
===================================================================
--- trunk/nmwg/doc/dLS/LSControl-TokenRequest.xml 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/LSControl-TokenRequest.xml 2007-12-14 12:22:22 UTC (rev 309)
@@ -25,6 +25,7 @@
</perfsonar:subject>
<nmwg:parameters>
<nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
Modified: trunk/nmwg/doc/dLS/LSControl-TokenResponse.xml
===================================================================
--- trunk/nmwg/doc/dLS/LSControl-TokenResponse.xml 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/LSControl-TokenResponse.xml 2007-12-14 12:22:22 UTC (rev 309)
@@ -12,6 +12,7 @@
<nmwg:eventType>http://perfsonar.net/services/LS/token/lower/success</nmwg:eventType>
<nmwg:parameters>
<nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
Modified: trunk/nmwg/doc/dLS/LSRing-lower.xml
===================================================================
--- trunk/nmwg/doc/dLS/LSRing-lower.xml 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/LSRing-lower.xml 2007-12-14 12:22:22 UTC (rev 309)
@@ -13,6 +13,7 @@
</perfsonar:subject>
<nmwg:parameters>
<nmwg:parameter name="active">0</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -26,7 +27,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -40,7 +42,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">1</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -54,7 +57,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
Modified: trunk/nmwg/doc/dLS/LSRing-upper.xml
===================================================================
--- trunk/nmwg/doc/dLS/LSRing-upper.xml 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/LSRing-upper.xml 2007-12-14 12:22:22 UTC (rev 309)
@@ -12,7 +12,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -27,6 +28,7 @@
</perfsonar:subject>
<nmwg:parameters>
<nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">1</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
Deleted: trunk/nmwg/doc/dLS/LSRing.xml
Modified: trunk/nmwg/doc/dLS/dLS.html
===================================================================
--- trunk/nmwg/doc/dLS/dLS.html 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/dLS.html 2007-12-14 12:22:22 UTC (rev 309)
@@ -298,11 +298,9 @@
<link rel="Copyright" href="#rfc.copyright">
<link rel="Chapter" title="1 Introduction" href="#rfc.section.1">
<link rel="Chapter" title="2 System Specific Operation"
href="#rfc.section.2">
-<link rel="Chapter" title="3 Bootstrapping" href="#rfc.section.3">
-<link rel="Chapter" title="4 Structures and Messages" href="#rfc.section.4">
-<link rel="Chapter" title="5 Result codes" href="#rfc.section.5">
-<link rel="Chapter" title="6 Appendices" href="#rfc.section.6">
-<link rel="Chapter" href="#rfc.section.7" title="7 References">
+<link rel="Chapter" title="3 Structures and Messages" href="#rfc.section.3">
+<link rel="Chapter" title="4 Appendices" href="#rfc.section.4">
+<link rel="Chapter" href="#rfc.section.5" title="5 References">
<meta name="generator"
content="http://greenbytes.de/tech/webdav/rfc2629.xslt, Revision 1.291,
2006/10/29 09:03:19, XSLT vendor: SAXON 8.8 from Saxonica
http://www.saxonica.com/">
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<meta name="DC.Creator" content="Boote, J">
@@ -394,11 +392,10 @@
<p id="rfc.section.1.p.5">
</p>
<ul>
+<li>Scope - to enable a hierarchy of systems, some form of scoping must
exist that defines local and remote communication groups. Issues regarding
initial discovery, "boostrapping", are discussed in the context of
domain-specific issues.</li>
<li>Summarization - to reduce the amount of information sent over the
network or to anonymize sensitive data, some form of data reduction must take
place.</li>
-<li>Scope - to enable a hierarchy of systems, some form of scoping must
exist that defines local and remote communication groups.</li>
<li>Search - information location is key and the way in which distributed
location and search is handled is the crux of this service.</li>
</ul>
-<p id="rfc.section.1.p.6">Additionally we present solutions to issues
necessary to allow effective operation of this service including
bootstrapping (i.e. how service finds other parts of the system) and
domain-specific concerns.</p>
<hr class="noprint">
<h1 id="rfc.section.2" class="np">
<a href="#rfc.section.2">2.</a> <a id="system" href="#system">System
Specific Operation</a>
@@ -406,34 +403,36 @@
<h2 id="rfc.section.2.1">
<a href="#rfc.section.2.1">2.1</a> <a id="overview"
href="#overview">Overview</a>
</h2>
-<p id="rfc.section.2.1.p.1">The first step of information flow is when a pS
service registers with an LS. The service may know the name of an LS via
static configuration (the most common case for legacy deployments), or other
forms of bootstrapping such as multicast may occur. A service registers a
"service metadata" record about itself and full metadata (i.e. containing all
information such as subject, eventType(s), and any parameters, see <a
href="#service-metadata" title="Service metadata
example">Section 4.1</a>) about stored data it has knowledge of. Such a
record is called Lookup Information (see <a href="#lookup-info" title="Lookup
Information">Section 4.2</a>).</p>
+<p id="rfc.section.2.1.p.1">The first step of information flow is when a pS
service registers with an LS. The service may know the name of an LS via
static configuration (the most common case for legacy deployments) A service
registers a "service metadata" record about itself and full metadata (i.e.
containing all information such as subject, eventType(s), and any parameters,
see <a href="#service-metadata" title="Service metadata
example">Section 3.1</a>) about stored data it has knowledge of. Such a
record is called Lookup Information (see <a href="#lookup-info" title="Lookup
Information">Section 3.2</a>).</p>
<p id="rfc.section.2.1.p.2">The idea is to move the metadata from a
service-local XML data store to a specialized LS with additional searching
capabilities. While a service instance may support limited searching, this is
not necessary as they should be focused on storing or gathering data and
leave the lookup functionality to the LS. Possible exceptions when a client
may need to contact a service directly are when metadata rapidly changes,
like the most recent data's timestamp and full details of data stored in
long-term archival MAs.</p>
-<p id="rfc.section.2.1.p.3">The architecture of the dLS protocol assumes the
existence of logical groups of LS instances. The architecture should allow
for multiple levels of these rings representing multiple splits in a
hierarchy, although the basic example that will be an ongoing theme in this
document will revolve around only 2 levels. The authors realize it is
impossible to predict how the hierarchy of this service may split over time,
therefore we avoid using language that directly categorizes a ring into a
specific role. In general the two rings that define scope are 'lower' and
'upper'.</p>
-<p id="rfc.section.2.1.p.4">To better define this classification consider an
example at a high level: inter-domain communication. It is natural to assume
that single domain will deploy an LS instance to manage deployed services.
The true goal of perfSONAR is to ease the detection of end to end performance
issues particularly across domain boundaries, therefore communication between
domain LS instances is paramount. We assume for this example that the 'top'
most level is that of the domain; further fragmentation by other factors such
as the 'top-level domain' or geographical considerations are probable, just
not of interest in this work. A single domain may have multiple LS
deployments; a representative 'leader' from this set will represent the
'upper' (intra-domain) scope and communicate with similar LS instances of
other domains in this case. The actual registered services of the LS
represent the 'lower' (local, or in many cases inter-domain) scope.</p>
-<p id="rfc.section.2.1.p.5">The scoping designations are important to the
next stage: data reduction. We observe that the abundance of information
available via the original metadata description is rather obtuse when it
comes to answering a simple (and common) query such as 'give me bandwidth
data for host x'. Although information such as capacity or interface name is
valuable internal to a domain, it does not serve much purpose to NOC staff
simply asking to see utilization of a link. We propose a 'summarization'
strategy based on 'distance' from the source that will distill the complete
metadata into smaller and smaller sets as the information is passed through
the scope hierarchy.</p>
-<p id="rfc.section.2.1.p.6">Finally, using the scoping and summarizing steps
we come to final, and arguably most important phase: search. Search must rely
on two phases to work efficiently in the dLS framework, namely discovery and
query. The first step is locating 'where' information can be found. This
involves asking semi direct questions regarding the well defined network
topology in order to locate the 'vicinity' of data. The query phase will then
ask more esoteric questions once it locates the proper LS instances to ask.
The discovery phase is made possible through the process of summarization,
while the query phase remains similar to the current LS functionality.</p>
+<p id="rfc.section.2.1.p.3">The architecture of the dLS protocol assumes the
existence of logical groups of LS instances. The architecture should allow
for multiple levels representing splits in a hierarchy, although the basic
example in this document will revolve around only 2 levels. The authors
realize it is impossible to predict how the hierarchy of this service may
split over time, therefore we avoid using language that directly categorizes
a ring into a specific role. In general the two levels that define scope are
"lower" and "upper".</p>
+<p id="rfc.section.2.1.p.4">To better define this classification consider an
example at a high level: inter-domain communication. It is natural to assume
that single domain will deploy an LS instance to manage deployed services.
The true goal of perfSONAR is to ease the detection of end to end performance
issues particularly across domain boundaries, therefore communication between
domain LS instances is paramount. We assume for this example that the "top"
most level is that of the domain; further fragmentation by other factors such
as the "top-level domain" or geographical considerations are probable, just
not of interest in this work. A single domain may have multiple LS
deployments; a representative "leader" from this set will represent the
"upper" (intra-domain) scope and communicate with similar LS instances of
other domains in this case. The actual registered services of the LS
represent the "lower" (local, or in many cases inter-domain) scope.</p>
+<p id="rfc.section.2.1.p.5">The scoping designations are important to the
next stage: data reduction. We observe that the abundance of information
available via the original metadata description is rather obtuse when it
comes to answering a simple (and common) query such as "give me bandwidth
data for host x". Although information such as capacity or interface name is
valuable internal to a domain, it does not serve much purpose to NOC staff
simply asking to see utilization of a link. We propose a "summarization"
strategy based on "distance" from the source that will distill the complete
metadata into smaller and smaller sets as the information is passed through
the scope hierarchy.</p>
+<p id="rfc.section.2.1.p.6">Finally, using the scoping and summarizing steps
we come to final, and arguably most important phase: search. Search must rely
on two phases to work efficiently in the dLS framework, namely discovery and
query. The first step is locating "where" information can be found. This
involves asking semi direct questions regarding the well defined network
topology in order to locate the "vicinity" of data. The query phase will then
ask more esoteric questions once it locates the proper LS instances to ask.
The discovery phase is made possible through the process of summarization,
while the query phase remains similar to the current LS functionality.</p>
<h2 id="rfc.section.2.2">
<a href="#rfc.section.2.2">2.2</a> <a id="scope" href="#scope">Scope
Formation</a>
</h2>
-<p id="rfc.section.2.2.p.1">The next question is how to form the hierarchy
of LS instances and subsequently organize the 'scopes'. The simplest answer
is that the highest scope be formed based on the domain name of the
participating systems as mentioned in the previous examples. That would allow
e.g. internet2.edu, geant2.net, and pionier.gov.pl to potentially operate
more than one LS instance inside their own domains (for performance and
scalability.) As LS instances come online they will invoke bootstrapping
procedures to find and join a lower scoped group first.</p>
+<p id="rfc.section.2.2.p.1">The first aspect of this system is how to form
the hierarchy of LS instances and subsequently organize the scopes. The
simplest answer is that the highest scope be formed based on the domain name
of the participating systems as mentioned in the previous examples. That
would allow e.g. internet2.edu, geant2.net, and pionier.gov.pl to potentially
operate more than one LS instance inside their own domains (for performance
and scalability.) As LS instances come online they will invoke bootstrapping
procedures to find and join a lower scoped group first. The LS that a service
contacts to register becomes the "Home LS" (HLS, see <a href="#glossary"
title="Glossary">Section 4.1</a>) of that particular service.</p>
<p id="rfc.section.2.2.p.2">The scopes should be named based on URIs. This
will allow a domain-level scope to take the form <a
href="http://internet2.edu">http://internet2.edu</a>, with subdomain scopes
named <a href="http://internet2.edu/foo">http://internet2.edu/foo</a>, etc.
The top-level scope can be called <a
href="http://perfsonar.net">http://perfsonar.net</a> with potential for
geographic divisions later if necessary for performance (such as <a
href="http://eu.perfsonar.net">http://eu.perfsonar.net</a>).</p>
<p id="rfc.section.2.2.p.3">The major algorithms used to form and maintain
the ring structure of the dLS, no matter which scope we are talking about,
are as follows:</p>
<p id="rfc.section.2.2.p.4">
</p>
<ul>
-<li>Join Procedure</li>
-<li>Token Passing</li>
-<li>Summarization Notification</li>
+<li>
+<a href="#join" title="Join Procedure">Section 2.2.1</a> - Join
Procedure</li>
+<li>
+<a href="#tokens" title="Token Messages for Control and
Election">Section 2.2.2</a> - Token Passing</li>
+<li>
+<a href="#summary_messages" title="Summarization
Messages">Section 2.2.3</a> - Summarization Notification</li>
</ul>
<p id="rfc.section.2.2.p.5">Each of these procedures is important to keeping
members of the distributed "service" functioning correctly. The algorithms
will be presented in the frame of HLS instances communicating in a lower
scope, but will be used in the same manner for inter-domain communication as
an upper scope as well.</p>
<h3 id="rfc.section.2.2.1">
<a href="#rfc.section.2.2.1">2.2.1</a> <a id="join" href="#join">Join
Procedure</a>
</h3>
-<p id="rfc.section.2.2.1.p.1">When an LS instance comes online it will have
some bootstrapping knowledge of potential peers (both inter and intra
domain). This information is contained in LSRing file (see <a href="#LSRing"
title="LS Ring File Structure">Section 4.3</a>). The inter-domain
knowledge is used first to establish a connection to an already in progress
ring, or perhaps to start a ring that may not exist yet.</p>
-<p id="rfc.section.2.2.1.p.2">A candidate LS will continuously search its
LSRing information and send an LSControl message to its known LS instances
with a "join" eventType (see <a href="#LSControl-Join" title="LS Joining
Message for Joining a Ring">Section 4.4</a>) until a successful response
is seen. The LS candidate will then search through the successful
LSControlResponse to this message and update its LSRing with the returned
information. This can mean updating the "active" parameter as well as adding
new LS instances. This parameter is indicative of the "live-ness" (i.e. were
we successful in contacting it recently). The contacted LS will also update
the local copy of LSRing to add the new member to its "available" list.</p>
-<p id="rfc.section.2.2.1.p.3">For security purposes, it is necessary for the
members of the LSRing to know that a new member has joined without that
member authenticating pairwise with each other member of the ring. To
accomplish this, the initially contacted LS will request that the current
ring leader initiate a token rotation to allow all members to update their
LSRing list.</p>
-<p id="rfc.section.2.2.1.p.4">After updating, the newly joined LS will
broadcast another LSControl message with a "summary" eventType (see <a
href="#LSControl-Summary-lower" title="LS Summary Message
(Lower)">Section 4.6</a>, or if we are dealing with the upper level see
<a href="#LSControl-Summary-upper" title="LS Summary Message
(Upper)">Section 4.7</a>) to all of the "active" LS instances from its
LSRing. Again the responses will be parsed to get any useful updated
information. At the end of this process the joining LS will possess an LSRing
file reflecting the state of the dLS cloud. Each of the recipient LS
instances which hasn't heard anything from this joining LS previously will do
the same, including adding this new member to their own lists (as they didn't
know of it's existence yet).</p>
-<p id="rfc.section.2.2.1.p.5">After this initial warm-up the LS will observe
the rules of token etiquette and remain silent until it is contacted with a
token, or it has not seen one in a very long time (see <a href="#tokens"
title="Token Messages for Control and Election">Section 2.2.2</a>).</p>
+<p id="rfc.section.2.2.1.p.1">When an LS instance comes online it must have
some knowledge of potential peers (both inter and intra domain). This
information is contained in LSRing files (see <a href="#LSRing" title="LS
Ring File Structure">Section 3.3</a>). The inter-domain knowledge (i.e.
"lower") is used first to establish a connection to an already in progress
ring, or perhaps to start a ring that may not exist yet.</p>
+<p id="rfc.section.2.2.1.p.2">A joining LS will continuously search the
LSRing information and send an LSControl message to known LS instances with a
"join" eventType (see <a href="#LSControl-Join" title="LS Joining Message for
Joining a Ring">Section 3.4</a>) until a successful response is seen.
The LS candidate will then search through the successful LSControlResponse to
this message and update its LSRing information with the returned information.
This can mean updating the "active" and "leader" parameters as well as adding
new LS instances. The first parameter is indicative of the "live-ness" (i.e.
were we successful in contacting it recently), the second is used to indicate
who the current "leader" of this group is. These dynamic variables will be
constantly changing.</p>
+<p id="rfc.section.2.2.1.p.3">For security purposes, it is necessary for the
members of the LSRing to know that a new member has joined without that
member authenticating pairwise with each other member of the ring. To
accomplish this, the initially contacted LS will request that the current
ring leader initiate a token rotation to allow all members to update their
LSRing files thus allowing instant recognition of the new member.</p>
+<p id="rfc.section.2.2.1.p.4">After this initial warm-up the LS will observe
the rules of token etiquette and remain silent until it is contacted with a
token, or it has not seen one in a very long time (see <a href="#tokens"
title="Token Messages for Control and Election">Section 2.2.2</a>).</p>
<h4 id="rfc.section.2.2.1.1">
<a href="#rfc.section.2.2.1.1">2.2.1.1</a> <a id="join_algorithm"
href="#join_algorithm">Join Algorithm</a>
</h4>
@@ -469,7 +468,7 @@
</div>
<div id="rfc.figure.3">
</div>
-<p>Illustration of LS Join Algorithm (rejected)</p>
+<p>Illustration of LS Join Algorithm (accepted)</p>
<pre>
|==========LS Ring=========|
@@ -508,29 +507,26 @@
</p>
<dl class="empty">
<dd>1. LS1 (candidate to the ring) sends join
(http://perfsonar.net/services/LS/join eventType) request to LS2 (member of
the ring). LS2 receives join message from LS1 and decides whether to accept
it or not. Application of security policy may occur here.</dd>
-<dd>2. LS2 accepts join request from LS1 and responses with success code and
LSRing content. LS2 will be waiting for send-summary request
(http://perfsonar.net/services/LS/send-summary eventType)</dd>
+<dd>2. LS2 accepts join request from LS1 and responses with success code and
LSRing content.</dd>
<dd>3. LS2 sends send-update-token
(http://perfsonar.net/services/LS/send-update-token eventType) to LS3 (the
leader of the ring). Send-update-token contain the URL of LS1. LS3 updates
its LSRing with URL of LS1.</dd>
-<dd>4. LS3 immediately sends update-token
(http://perfsonar.net/services/LS/update-token) to next peer from LSRing.
Update-token contains updated LSRing.</dd>
-<dd>5. LS2 receives update-token, updates its LSRing and immediately sends
update-token to the next peer</dd>
-<dd>6. After full cycle of update-token LS3 receives own update-token. Now
all ring members have knowledge about newly joined LS1 and can accept summary
from LS1.</dd>
-<dd>7. LS3 responses for request mentioned in step 3. LS2 receives an
acknowledgement (result code) of update-token operation.</dd>
-<dd>8. If update-token was accomplished succesfuly, LS2 sends send-summary
request (http://perfsonar.net/services/LS/send-summary eventType) to LS1</dd>
-<dd>9. LS1 sends summary to all peers in the LSRing. Now all members of the
LSRing have the summary information from LS1.</dd>
+<dd>4. LS3 immediately sends update-token
(http://perfsonar.net/services/LS/update-token) to next peer from LSRing.
Update-token contains updated LSRing information. This will be exchanged
between all peers. This message should not be marked as a duplicate or
dropped by any members of the ring.</dd>
+<dd>5. LS3 sends a leader election token immediately after this to trigger a
new election cycle. This will also exchanged between all peers.</dd>
+<dd>6. After full cycle of update-token LS3 receives update-token
(identified by messageIdRef) and drops the token. Now all ring members have
knowledge about newly joined LS1.</dd>
+<dd>7. After a full cycle of leader election, a new leader is known. This
new leader may start the backup leader election.</dd>
+<dd>8. Regular token exchange and summarization notification will resume in
time.</dd>
</dl>
-<p id="rfc.section.2.2.1.1.p.6">The algorithm could be simplified by moving
response from step 2 to step 8. However then, LS1 may be waiting for quite a
long time without any reponse and communication time may pass. Such a
simplification should be taken under consideration after testing.</p>
<h3 id="rfc.section.2.2.2">
<a href="#rfc.section.2.2.2">2.2.2</a> <a id="tokens"
href="#tokens">Token Messages for Control and Election</a>
</h3>
-<p id="rfc.section.2.2.2.p.1">When scopes are created they form themselves
into logical rings around which tokens can be passed. These token passing
mechanism is used for two purposes, for registration control and for leader
election. A leader is necessary to circulate group updates, to start tokens
to initiate registration and to represent a given scope in an upper scope.</p>
-<p id="rfc.section.2.2.2.p.2">The "token" is an LSControlMessage (see <a
href="#LSControl-Token" title="LS Token Message">Section 4.5</a>) meant
to be passed around an LSRing to the various members in some order. There are
various criteria that can be used in deciding how to order the ring so that
everyone can predict where the token is, when they might expect to get it,
and whom they should get it from/ pass it to next. It is important that we
choose a sound method that is simple to calculate, and should use as much
"knowledge" of the ring as possible without burdening the LS instances too
much with complex calculations.</p>
+<p id="rfc.section.2.2.2.p.1">When scopes are created they form themselves
into logical groups around which tokens can be passed. These token passing
mechanism is used for two purposes, for registration control and for leader
election. A leader is necessary to circulate group updates, to start tokens
to initiate registration and to represent a given scope in an upper scope.</p>
+<p id="rfc.section.2.2.2.p.2">The "token" is an LSControlMessage (see <a
href="#LSControl-Token" title="LS Token Message">Section 3.5</a>) meant
to be passed around an LSRing to the various members in some order. There are
various criteria that can be used in deciding how to order the ring so that
everyone can predict where the token is, when they might expect to get it,
and whom they should get it from/ pass it to next. It is important that we
choose a sound method that is simple to calculate, and should use as much
"knowledge" of the ring as possible without burdening the LS instances too
much with complex calculations.</p>
<h4 id="rfc.section.2.2.2.1">
<a href="#rfc.section.2.2.2.1">2.2.2.1</a> <a id="leader_election"
href="#leader_election">Leader Election</a>
</h4>
-<p id="rfc.section.2.2.2.1.p.1">The essential idea in the token passing
mechanism for leader election is that some identifier is chosen for each node
and that the node with the highest (or lowest) identifier win the election
and becomes the leader. The basic mechanism of leader election is that
participants form a logical ring and initiate an election. An election can be
initiated when a new machine joins, at system start time, or when a host
feels that the leader may have failed based on failure to receive a periodic
token. When an election is initiated, the initiating host sends an election
message to its counter-clockwise neighbor and changes its state to
“ELECTING”. It places its identifier inside the message. The
ultimate goal is for the host with the highest identifier to be chosen. When
a host receives an election message, it compares its identifier with that in
the message. It forwards the higher of the identifiers. When a node receives
a message with its
own identifier, it knows that it has been selected and the election
terminates.</p>
-<p id="rfc.section.2.2.2.1.p.2">The next question is how to choose the
identifier for a given node. There still needs to be some discussion here.
The first proposal was to use the IP address of the node as the lower-order
32-bits of a 64-bit number and to allow the higher-order bits to be set as a
"priority" field. This would effectively allow a system administrator to make
sure that her most powerful or well-connected nodes became the leader when
they were available. In the absence of a priority, the nodes essentially are
randomly ordered.</p>
-<p id="rfc.section.2.2.2.1.p.3">The Vice-leader will be elected via the same
mechanism, initiated by the current leader, with the current leader
excluded.</p>
-<p id="rfc.section.2.2.2.1.p.4">The Leader and Vice-Leader LS instances
should exchange messages (see <a href="#LSControl-Leader" title="LS Leader
Message">Section 4.8</a>) periodically to ensure that in the event of a
failure the lower level will still have a link to the upper level. A
Vice-Leader will be monitoring the time between successive communications
from the Leader to be sure it has not failed. In the event that it has, the
"Join" procedure will start to the upper level to keep the hierarchy
complete.</p>
-<p id="rfc.section.2.2.2.1.p.5">Token-based election occurs when the group
membership changes. A node must initiate leader election if it doesn't
receive a token in the target token rotation time. As the identifiers are
deterministic, multiple nodes may initiate election at the same time with the
same result.</p>
+<p id="rfc.section.2.2.2.1.p.1">The essential idea in the token passing
mechanism for leader election is that some identifier is chosen for each node
and that the node with the highest identifier wins the election and becomes
the leader. The basic mechanism of leader election is that participants form
a logical ring and initiate an election. An election should be initiated when
a new machine joins, at system start time, and when any host feels that the
leader may have failed based on failure to receive a periodic token. When an
election is initiated, the initiating host sends an election message to a
neighbor (as specified in the token order). The ultimate goal is for the host
with the highest identifier to be chosen. When a host receives an election
message, it compares its identifier with that in the message. It forwards the
higher of the identifiers. When a node receives a message with its own
identifier, it knows that it has been selected and the election
terminates.</p>
+<p id="rfc.section.2.2.2.1.p.2">The next question is how to choose the
identifier for a given node. Using the IP address of the node as the
lower-order 32-bits of a 64-bit number and allowing the higher-order bits to
be set as a "priority" field is a simple solution to this problem. This would
effectively allow a system administrator to make sure that her most powerful
or well-connected nodes became the leader when they were available. In the
absence of a priority, the nodes essentially are randomly ordered.</p>
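The identifier layout described above can be sketched as follows. This is only an illustration: the function name `node_identifier` and the choice of an unsigned 32-bit priority are assumptions, not part of the dLS document.

```python
import ipaddress


def node_identifier(ip, priority=0):
    """Build a 64-bit election identifier (hypothetical helper).

    The IPv4 address fills the lower-order 32 bits and the
    administrator-chosen priority fills the higher-order 32 bits.
    """
    if not 0 <= priority < 2**32:
        raise ValueError("priority must fit in 32 bits")
    ip_bits = int(ipaddress.IPv4Address(ip))
    return (priority << 32) | ip_bits


# A node with a higher priority outranks any node with a lower priority,
# regardless of IP address; with equal priority the IP address decides.
preferred = node_identifier("128.4.10.1", priority=1)
ordinary = node_identifier("200.0.0.1")
```

Because the priority occupies the high bits, a plain integer comparison of two identifiers is enough to decide which node wins.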
+<p id="rfc.section.2.2.2.1.p.3">The Vice-leader will be elected via the same
mechanism, initiated by the current leader, with the current leader excluded.
The Leader and Vice-Leader LS instances should exchange messages (see <a
href="#LSControl-Leader" title="LS Leader Message">Section 3.8</a>)
periodically to ensure that in the event of a failure the lower level will
still have a link to the upper level. A Vice-Leader will be monitoring the
time between successive communications from the Leader to be sure it has not
failed. In the event that it has, the "Join" procedure will start to the
upper level to keep the hierarchy complete.</p>
+<p id="rfc.section.2.2.2.1.p.4">Token-based election occurs when the group
membership changes. A node must initiate leader election if it doesn't
receive a token in the target token rotation time. As the identifiers are
deterministic, multiple nodes may initiate election at the same time with the
same result.</p>
<div id="leader-election-example">
</div>
<div id="rfc.figure.4">
@@ -555,21 +551,20 @@
</pre>
<p>LS1, LS2 and LS3 are members of the ring. LS2 initiates Leader
Election</p>
<p class="figure">Figure 4</p>
-<p id="rfc.section.2.2.2.1.p.7">
+<p id="rfc.section.2.2.2.1.p.6">
</p>
<dl class="empty">
<dd>1. LS2 decides to initiate election,</dd>
-<dd>2. LS2 changes its state to ELECTING</dd>
-<dd>3. LS2 sends election message (with its identifier) to LS3</dd>
-<dd>4. LS3 receives election message with identifier of LS1. Its own
identifier is higher, so it sends election message to next peer LS1 with ist
own identifier.</dd>
-<dd>5. LS1 receives election message with identifier of LS3. Its own
identifier is lower, so it sends election message to next peer LS2 with
identifier of LS3.</dd>
-<dd>6. LS2 receives election message with identifier of LS3, election
finishes. LS2 knows the leader is LS3. LS2 disable ELECTING state.</dd>
+<dd>2. LS2 sends election message (with its identifier) to LS3</dd>
+<dd>3. LS3 receives the election message with an identifier. Its own identifier
is higher, so it sends the election message to the next peer, LS1, replacing the
identifier in the original with its own.</dd>
+<dd>4. LS1 receives the election message with an identifier. Its own identifier
is lower, so it sends the election message unchanged to the next peer, LS2.</dd>
+<dd>5. LS2 receives the election message with an identifier. Since its own
identifier is lower, it sends the election message unchanged to the next peer, LS3.</dd>
+<dd>6. LS3 receives the election message with its own identifier, so the election
finishes. LS3 knows it is the leader.</dd>
</dl>
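The steps above can be simulated with a short sketch. It assumes unique identifiers and a ring given as a list in token order; the function name `elect_leader` and the numeric identifiers are made up for illustration.

```python
def elect_leader(ring, initiator):
    """Simulate one election on a logical ring (illustrative sketch).

    ring      -- list of (name, identifier) pairs in token order
    initiator -- index of the node that starts the election
    Returns (leader_name, number_of_messages_sent).
    Assumes all identifiers are unique.
    """
    n = len(ring)
    pos = initiator
    carried = ring[pos][1]  # the initiator places its own identifier in the message
    hops = 0
    while True:
        pos = (pos + 1) % n  # send the message to the next peer
        hops += 1
        name, own = ring[pos]
        if carried == own:
            # A node that sees its own identifier has been selected.
            return name, hops
        carried = max(carried, own)  # forward the higher identifier


# Token order LS1 -> LS2 -> LS3 -> LS1, with LS3 holding the highest identifier.
ring = [("LS1", 1), ("LS2", 2), ("LS3", 3)]
leader, messages = elect_leader(ring, initiator=1)  # LS2 initiates
```

With LS2 as the initiator, the message travels LS2 to LS3, LS3 to LS1, LS1 to LS2 and LS2 to LS3, which matches steps 2 through 6.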
-<p id="rfc.section.2.2.2.1.p.8">Vice-leader election may be done using the
same algorithm. Then the election message should contain two identifiers:
Leader ID (the highest identifier) and Vice-Leader ID (the second highest
identifier).</p>
-<p id="rfc.section.2.2.2.1.p.9">======== MG: as far as I understand, all
members of the ring initiate own election, the result is always the same
(deterministic). If a peer initiates election and receives election message
with own identifier, it means that it is the new leader and should send new
token, right? Maybe after election, the peer that wasn't elected, should
inform the new leader? Or maybe another message passing over the ring is
required (this could be done with two states ELECTING and POST-ELECTION or
whatever).</p>
+<p id="rfc.section.2.2.2.1.p.7">Vice-leader election will then be started by
the new leader, starting with a "0" for the Identifier and stopping the
election once it sees the token once again. The leader will then communicate
backup information with this vice leader. It is not important for the rest of
the ring to know who the vice leader may be.</p>
<h4 id="rfc.section.2.2.2.2">
<a href="#rfc.section.2.2.2.2">2.2.2.2</a> Token Passing for
Registration Control</h4>
-<p id="rfc.section.2.2.2.2.p.1">The token can be viewed as "permission to
talk" and permits the holding LS to send its summary information to all other
available LS instances (see <a href="#LSControl-Summary-lower" title="LS
Summary Message (Lower)">Section 4.6</a> and <a
href="#LSControl-Summary-upper" title="LS Summary Message
(Upper)">Section 4.7</a>). The responses will be parsed to get any
useful updated information about current dLS cloud state.</p>
+<p id="rfc.section.2.2.2.2.p.1">The token can be viewed as "permission to
talk" and permits the holding LS to send its summary information to all other
available LS instances (see <a href="#LSControl-Summary-lower" title="LS
Summary Message (Lower)">Section 3.6</a> and <a
href="#LSControl-Summary-upper" title="LS Summary Message
(Upper)">Section 3.7</a>). The responses will be parsed to get any
useful updated information about current dLS cloud state.</p>
<h5 id="rfc.section.2.2.2.2.1">
<a href="#rfc.section.2.2.2.2.1">2.2.2.2.1</a> <a
id="token_passing_algorithm" href="#token_passing_algorithm">Token Passing
Algorithm</a>
</h5>
@@ -602,30 +597,27 @@
</p>
<dl class="empty">
<dd>1. LS1 receives the token i.e. LSControlRequest message with the
http://perfsonar.net/services/LS/token/ eventType from its predecessor
LS3.</dd>
-<dd>2. LS1 updates its 'lower' peer list based on token content. The local
peer list is replaced by the one received in token</dd>
-<dd>3. LS1 sends LSControlRequest message with the
http://perfsonar.net/services/LS/summary/ eventType to all peers in the lease
(excluding itself).</dd>
+<dd>2. LS1 updates its peer list based on token content. The peer list is
replaced by the one received in token</dd>
+<dd>3. LS1 sends LSControlRequest message with the
http://perfsonar.net/services/LS/summary/ eventType to all peers in the list
(excluding itself).</dd>
<dd>4. LS2 and LS3, on receiving this message, each check their collection and
update it if necessary with service info.</dd>
-<dd>5. LS1 waits for some amount of time. (TO BE DEFINED - who decides
it?)</dd>
-<dd>6. LS1 sends token to next LS (LS2) from the LSRing lower scope. If it
fails, mark the not-responding peer as "not active" and try next one. (TO BE
DISCUSSED whether "not active" is just boolean or number of fails - after 3
failures the url will be removed from LSRing)</dd>
+<dd>5. LS1 waits for some time (see <a href="#rotation-time-computing"
title="Token rotation time computing">Section 2.2.2.2.2</a>)</dd>
+<dd>6. LS1 sends token to next LS (LS2) from the LSRing lower scope. If it
fails, mark the not-responding peer as "not active" and try next one.</dd>
</dl>
-<p id="rfc.section.2.2.2.2.1.p.4">MG: open issues:</p>
-<p id="rfc.section.2.2.2.2.1.p.5">- how to determine and remove duplicate
tokens?</p>
-<p id="rfc.section.2.2.2.2.1.p.6">- when to re-send token (I guess when
computed token rotation time passes)</p>
-<p id="rfc.section.2.2.2.2.1.p.7">- after leader election how the node can
know who is the leader and which tokens accept or reject (if there are tokens
sent by old leader and new leader)</p>
+<p id="rfc.section.2.2.2.2.1.p.4">Each node in the local group is
responsible for monitoring the state of the token. If the internal timer (see
<a href="#rotation-time-computing" title="Token rotation time
computing">Section 2.2.2.2.2</a>) expires without seeing a token a new
token should be generated. If a token is seen too soon (see <a
href="#rotation-time-computing" title="Token rotation time
computing">Section 2.2.2.2.2</a>) it should be dropped. This will ensure
that too many tokens do not enter into the ring at a given time.</p>
<h5 id="rfc.section.2.2.2.2.2">
<a href="#rfc.section.2.2.2.2.2">2.2.2.2.2</a> <a
id="rotation-time-computing" href="#rotation-time-computing">Token rotation
time computing</a>
</h5>
-<p id="rfc.section.2.2.2.2.2.p.1">The token rotation time is the time of
passing and serving token by all nodes in the LS ring. This time should be
computed by the leader basing on some knowledge about the time of serving
token by all particular nodes. The time may be based on times saving in token
message by all nodes. Initially, this will be very simple and will be
conputed as "2 minutes plus 5 seconds times the number of nodes in the
ring."</p>
-<p id="rfc.section.2.2.2.2.2.p.2">The key is that after the timeout has
exceeded, it can be inferred that the leader has failed and another election
should be initiated.</p>
+<p id="rfc.section.2.2.2.2.2.p.1">The token rotation time is the time needed
for the token to be passed to and served by every node in the LS ring. This
time should be computed by all nodes, based on some knowledge of how long each
node takes to serve the token. The time may be based on times saved in the token
message by the nodes. Initially, this will be very simple and will be
computed as "2 minutes plus 5 seconds times the number of nodes in the
ring."</p>
+<p id="rfc.section.2.2.2.2.2.p.2">The key is that after the timeout has
been exceeded, it can be inferred that the leader has failed and another election
should be initiated. Conversely, if a token is seen too early (less than half
the calculated time) the token should be dropped.</p>
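A minimal sketch of the timing rule, using the initial formula quoted above. The function names and the three result labels are assumptions chosen for this example, and times are in seconds.

```python
def token_rotation_time(num_nodes):
    """Initial rule from the text: 2 minutes plus 5 seconds per node."""
    return 120 + 5 * num_nodes


def classify_token_arrival(elapsed, num_nodes):
    """Decide what a node does, given seconds since it last saw the token.

    "drop"    -- token seen too early (less than half the rotation time)
    "timeout" -- rotation time exceeded; the leader is presumed failed
    "accept"  -- token arrived inside the expected window
    (Labels are hypothetical; the document does not name these states.)
    """
    expected = token_rotation_time(num_nodes)
    if elapsed < expected / 2:
        return "drop"
    if elapsed > expected:
        return "timeout"
    return "accept"


# For a three-node ring the rotation time is 135 seconds, so a token
# seen after 60 seconds is dropped and one seen after 100 is accepted.
rotation = token_rotation_time(3)
```

Since every node computes the same value from the ring size, no extra message is needed to agree on the timeout.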
<h3 id="rfc.section.2.2.3">
<a href="#rfc.section.2.2.3">2.2.3</a> <a id="summary_messages"
href="#summary_messages">Summarization Messages</a>
</h3>
<p id="rfc.section.2.2.3.p.1">
-<a href="#LSControl-Summary-lower" title="LS Summary Message
(Lower)">Section 4.6</a> and <a href="#LSControl-Summary-upper"
title="LS Summary Message (Upper)">Section 4.7</a> contain examples of
the message format for this exchange. It is left up to the implementation
when the summarization occurs (i.e. at message send time, or also as a
periodic event).</p>
+<a href="#LSControl-Summary-lower" title="LS Summary Message
(Lower)">Section 3.6</a> and <a href="#LSControl-Summary-upper"
title="LS Summary Message (Upper)">Section 3.7</a> contain examples of
the message format for this exchange. It is left up to the implementation
when the summarization occurs (i.e. at message send time, or also as a
periodic event).</p>
<h2 id="rfc.section.2.3">
<a href="#rfc.section.2.3">2.3</a> <a id="summary"
href="#summary">Summarization</a>
</h2>
-<p id="rfc.section.2.3.p.1">The LS that a service contacts to register
becomes the "Home LS" (HLS, see <a href="#glossary"
title="Glossary">Section 6.1</a>) of that particular service. It is the
responsibility of the HLS to make summary data about the all of the pS
services it knows of available to the larger enterprise and to draw relevant
queries to itself.</p>
+<p id="rfc.section.2.3.p.1">It is the responsibility of the HLS to make
summary data about all of the pS services it knows of available to the
larger enterprise and to draw relevant queries to itself.</p>
<p id="rfc.section.2.3.p.2">Summarization is important to the overall
success of this service as summaries prevent other LS instances from being
overloaded by information. They must be general enough to allow for easy
creation and exchange but also must retain enough information to provide a
rich query interface able to locate the distributed information. That means
service metadata information must be reduced (summarized) as it propagates
through the LS cloud.</p>
<p id="rfc.section.2.3.p.3">We start by making an observation that
summarization is best based on scope (see also <a href="#scope" title="Scope
Formation">Section 2.2</a> for forming scope). Simply put, this means
that we should attempt to summarize "more" the "farther" away from the source
that we get. This creates a smaller data set that travels the farthest away
while keeping the larger and more informative data sets closer to the source.
We present the strategies as such:</p>
<p id="rfc.section.2.3.p.4">
@@ -640,8 +632,8 @@
</h3>
<p id="rfc.section.2.3.1.p.1">The lower scope summarization, described here
as information exchange between HLS instances internal to a domain, consists
of simply extracting detailed information from the metadata descriptions
provided by registered services. For now we define this to be simply removing
additional "parameter" elements from the metadata. Special consideration must
be given to the "supportedEventType" parameter by simply converting this to
actual eventType elements. This will ensure interoperability with legacy
services.</p>
<p id="rfc.section.2.3.1.p.2">Future iterations may choose to drop
additional pieces of information deemed unnecessary or private such as parts
of topological descriptions. This sort of modification is encouraged as long
as the data remains "symmetrical" and conforms to the schematic definitions
for a given metadata description. It should be noted that such modifications
will affect the searching procedure and could isolate the source services.</p>
-<p id="rfc.section.2.3.1.p.3">The mechanics for performing this level of
summarization can use any number of technologies. Either Extensible
Stylesheet Language Transformation (XSLT) documents or the XQuery language
(see <a href="#glossary" title="Glossary">Section 6.1</a>) may be used
to prepare the initial data for exchange in this first level. Since the
exchange of this local information will occur frequently, a simple operation
that is scheduled or on demand should be employed by the individual
implementations to ensure the regular LS functions are not impeded.</p>
-<p id="rfc.section.2.3.1.p.4">In order to make information available to the
LS cloud, the HLS will advertise this summary information to other LS
instances to propagate the appropriate information. Information exchange will
be handled using a "taking turns" protocol such as token ring. The holder of
the token will then perform the information exchange to other known instances
(see <a href="#glossary" title="Glossary">Section 6.1</a>).</p>
+<p id="rfc.section.2.3.1.p.3">The mechanics for performing this level of
summarization can use any number of technologies. Either Extensible
Stylesheet Language Transformation (XSLT) documents or the XQuery language
(see <a href="#glossary" title="Glossary">Section 4.1</a>) may be used
to prepare the initial data for exchange in this first level. Since the
exchange of this local information will occur frequently, a simple operation
that is scheduled or on demand should be employed by the individual
implementations to ensure the regular LS functions are not impeded.</p>
+<p id="rfc.section.2.3.1.p.4">In order to make information available to the
LS cloud, the HLS will advertise this summary information to other LS
instances to propagate the appropriate information. Information exchange will
be handled using a "taking turns" protocol such as token ring. The holder of
the token will then perform the information exchange to other known instances
(see <a href="#glossary" title="Glossary">Section 4.1</a>).</p>
<div id="hls-cloud">
</div>
<div id="rfc.figure.6">
@@ -682,14 +674,14 @@
</pre>
<p>The holder of the token (LS3) will inform everyone of its summary
information.</p>
<p class="figure">Figure 7</p>
-<p id="rfc.section.2.3.1.p.7">Once exchanged, the details regarding storage
in the XML database backend (see <a href="#glossary"
title="Glossary">Section 6.1</a>) are also left to individual
implementations. It is understood that this information, in the possession of
non HLS instances, is provided as a convenience and should be treated in the
same way that directly registered information is (i.e. purged on expiration).
When responding to queries for this information, the LS must indicate whether
or not it is authoritative.</p>
+<p id="rfc.section.2.3.1.p.7">Once exchanged, the details regarding storage
in the XML database backend (see <a href="#glossary"
title="Glossary">Section 4.1</a>) are also left to individual
implementations. It is understood that this information, in the possession of
non HLS instances, is provided as a convenience and should be treated in the
same way that directly registered information is (i.e. purged on expiration).
When responding to queries for this information, the LS must indicate whether
or not it is authoritative.</p>
<h3 id="rfc.section.2.3.2">
<a href="#rfc.section.2.3.2">2.3.2</a> <a
id="upper_scope_summarization" href="#upper_scope_summarization">Upper Scope
Summarization</a>
</h3>
<p id="rfc.section.2.3.2.p.1">A designated member of a given scope (often
corresponding to an organization) will be required to interact with other
similar LSs (generally representing other domains) in order to form a
higher-level, or "upper", scope. The mechanics of how we learn who is the
designated leader are discussed in <a href="#tokens" title="Token Messages
for Control and Election">Section 2.2.2</a>. The leader from each of the
first layers of this hierarchy (and the designated backup) will be
responsible for examining each member's summary information and building a
summarization/aggregation that represents the contents of the various LS
instances. This summary will serve as input to the upper scope.</p>
<p id="rfc.section.2.3.2.p.2">The most natural summarization is based on the
topology of the network (like in network routing). Thus, topology-based
summarization will reduce available service instances in the same way that IP
addresses are summarized into network numbers. They will indicate the
eventTypes that a service has and the ones that it can generate.
Summarization will be performed using a specialized summary algorithm. Topology
information such as IP addresses will be summarized using algorithms based on
Radix Tree (see <a href="#IP-summary" title="IP Address Summarization
Algorithm">Section 2.3.2.1</a>).</p>
<p id="rfc.section.2.3.2.p.3">Other information can be summarized in an
easier manner through the use of either Extensible Stylesheet Language
Transformation (XSLT) documents or the XQuery language as discussed in the
previous section. These mechanisms will take into account the XML elements
that represent the network topology currently used in metadata subjects as
well as additional items such as eventTypes.</p>
-<p id="rfc.section.2.3.2.p.4">The output of this process becomes a "service
summary" that represents the breadth of the original input. This consists
minimally of IP networks and addresses and the eventTypes which are stored
and can be generated. See <a href="#LSControl-Summary-lower" title="LS
Summary Message (Lower)">Section 4.6</a> or <a
href="#LSControl-Summary-upper" title="LS Summary Message
(Upper)">Section 4.7</a> for a mock-up of the summary output. Additional
transformations, while aggressive, will strive to preserve as much
information as possible to remain useful during the search procedures.</p>
+<p id="rfc.section.2.3.2.p.4">The output of this process becomes a "service
summary" that represents the breadth of the original input. This consists
minimally of IP networks and addresses and the eventTypes which are stored
and can be generated. See <a href="#LSControl-Summary-lower" title="LS
Summary Message (Lower)">Section 3.6</a> or <a
href="#LSControl-Summary-upper" title="LS Summary Message
(Upper)">Section 3.7</a> for a mock-up of the summary output. Additional
transformations, while aggressive, will strive to preserve as much
information as possible to remain useful during the search procedures.</p>
<h4 id="rfc.section.2.3.2.1">
<a href="#rfc.section.2.3.2.1">2.3.2.1</a> <a id="IP-summary"
href="#IP-summary">IP Address Summarization Algorithm</a>
</h4>
@@ -702,7 +694,7 @@
<li>Insert: Like in most inserts we attempt to place something into the
structure. We first start by doing a Lookup to see if it exists already; the
point where we stop is where we will insert the object. We are careful to
utilize the longest matching prefix of any other nearby edge to our
advantage. This last part is normally referred to as "splitting" and ensures
that each node has no more than two children.</li>
<li>Delete: Delete an object from the tree. This operation will be
complicated by "collapsing" parents that have a single child and merging the
edges.</li>
</ul>
-<p id="rfc.section.2.3.2.1.p.4">Once constructed, it is possible to consult
the structure in creating IP network summaries. The current prototype
implementation of summarization creates a Radix tree of IPs during an update
phase. Then it can perform 2 types of summarization. First, the "maximum
dominator" of the Radix tree is the maximum summarzation for all IP addresses
in the Radix tree. Using the optimization mentioned above regarding strings
longer than 1 character or bit, the solution to the maximum dominator problem
is trivial -- it is simply the first node below the root. The second type of
summarization is to determine "K-dominators". Essesntially, for a given
target K, we produce the most appropriate summarizing nodes. While this
problem is NP-complete, we can construct an approximation heuristic that
simply considers the length of the strings in the internal (or structural)
nodes of the tree. We leave for future work the problem of "Min cost
dominators", in which the best K and the best K dominators are selected.</p>
+<p id="rfc.section.2.3.2.1.p.4">Once constructed, it is possible to consult
the structure in creating IP network summaries. The current prototype
implementation of summarization creates a Radix tree of IPs during an update
phase. Then it can perform 2 types of summarization. First, the "maximum
dominator" of the Radix tree is the maximum summarization for all IP addresses
in the Radix tree. Using the optimization mentioned above regarding strings
longer than 1 character or bit, the solution to the maximum dominator problem
is trivial -- it is simply the first node below the root. The second type of
summarization is to determine "K-dominators". Essentially, for a given target
K, we produce the most appropriate summarizing nodes. While this problem is
NP-complete, we can construct an approximation heuristic that simply
considers the length of the strings in the internal (or structural) nodes of
the tree. We leave for future work the problem of "Min cost dominators", in
which the best K and the best K dominators are selected.</p>
<p id="rfc.section.2.3.2.1.p.5">Essentially, the output of this
algorithm is a set of IP subnets and addresses expressed in Classless
Interdomain Routing (CIDR) style, i.e. W.X.Y.Z/mask bits. There may be times
when this address aggregation is manually specified and this is a completely
viable interim solution.</p>
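As a rough illustration of the "maximum dominator" output in CIDR form, the sketch below computes the longest bit prefix shared by a set of IPv4 addresses. It does not build a Radix tree; it is a plain common-prefix computation used as a stand-in, and the name `maximum_dominator` is an assumption. Under the compressed-edge optimization described above, the result should correspond to the first node below the root.

```python
import ipaddress


def maximum_dominator(addresses):
    """Return the smallest single CIDR block covering all given IPv4 addresses.

    Stand-in for consulting the Radix tree: the shared leading bits of
    all addresses form the prefix of the summarizing network.
    """
    values = [int(ipaddress.IPv4Address(a)) for a in addresses]
    if not values:
        raise ValueError("at least one address is required")
    differing = 0
    for value in values:
        differing |= value ^ values[0]  # collect every bit position that varies
    prefix_len = 32 - differing.bit_length()
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return ipaddress.IPv4Network((values[0] & mask, prefix_len))


summary = maximum_dominator(["128.4.10.1", "128.4.10.200"])
```

A K-dominator summary would instead return several smaller blocks; that heuristic is not sketched here.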
<h2 id="rfc.section.2.4">
<a href="#rfc.section.2.4">2.4</a> <a id="search"
href="#search">Search</a>
@@ -712,7 +704,7 @@
<h3 id="rfc.section.2.4.1">
<a href="#rfc.section.2.4.1">2.4.1</a> <a id="discovery"
href="#discovery">Discovery Phase</a>
</h3>
-<p id="rfc.section.2.4.1.p.1">The discovery phase is used to locate the set
of Authoritative LS (or LSes) for a given Subject/eventType tuple. This
requires a query to be constructed over the Discovery information set (which
is not described yet, but which must consist of the 3-tuple of Subject
Summary, eventType and Authoritative LS.) Either a specific API call and a
pre-prepared query, or some automatic mechanism, must map the desired query
into a query of the Discovery info-set (see <a href="#LSControl-Discovery"
title="LS Discovery Message">Section 4.9</a>).</p>
+<p id="rfc.section.2.4.1.p.1">The discovery phase is used to locate the set
of Authoritative LS (or LSes) for a given Subject/eventType tuple. This
requires a query to be constructed over the Discovery information set (which
is not described yet, but which must consist of the 3-tuple of Subject
Summary, eventType and Authoritative LS.) Either a specific API call and a
pre-prepared query, or some automatic mechanism, must map the desired query
into a query of the Discovery info-set (see <a href="#LSControl-Discovery"
title="LS Discovery Message">Section 3.9</a>).</p>
<h4 id="rfc.section.2.4.1.1">
<a href="#rfc.section.2.4.1.1">2.4.1.1</a> <a id="discovery-alg"
href="#discovery-alg">Discovery Algorithm</a>
</h4>
@@ -742,21 +734,13 @@
<p id="rfc.section.2.4.2.p.2">Once we have found the HLS (or Home LSes) that
contain data in the range of our discovery query, we can pose Metadata
Queries to each of them. The results will be failure or success.</p>
<hr class="noprint">
<h1 id="rfc.section.3" class="np">
-<a href="#rfc.section.3">3.</a> <a id="bootstrapping"
href="#bootstrapping">Bootstrapping</a>
+<a href="#rfc.section.3">3.</a> <a id="structures-and-messages"
href="#structures-and-messages">Structures and Messages</a>
</h1>
-<p id="rfc.section.3.p.1">A distributed information system such as the LS
needs to address bootstrapping. In this system, an LS instance needs to find
other members of its scope (for each scope in which it participates.) To
accomplish this we will use a similar solution to what DNS uses
(root.hints).</p>
-<p id="rfc.section.3.p.2">We will maintain a service that maintains a list
of currently known LS instances. These known instances should preferably be
at the upper scope. All clients can cache this list. The service will be
accessed via a well-known hostname, and could be requested via UDP messages.
(We can also use TCP here for some sorts of anycast.)</p>
-<p id="rfc.section.3.p.3">Initially this will be deployed on one server. We
can extend this to handle redundancy and load balancing in the future by
using multiple DNS records and implementing ANYCAST with routing tricks for
this well known hostname. (Additionally, we can distribute an initial file
with a list of well known LS instances that are supported by the primary
perfSONAR participants.)</p>
-<p id="rfc.section.3.p.4">The above discovery algorithm is used to find an
LS within a given scope. Therefore, the only piece of information an LS
should need to be pre-configured with is the scope it belongs to. And as
stated above, that can be assumed to be "global:organization-dns-name". Note:
Need to define the specific syntax above.</p>
-<hr class="noprint">
-<h1 id="rfc.section.4" class="np">
-<a href="#rfc.section.4">4.</a> <a id="structures-and-messages"
href="#structures-and-messages">Structures and Messages</a>
-</h1>
-<h2 id="rfc.section.4.1">
-<a href="#rfc.section.4.1">4.1</a> <a id="service-metadata"
href="#service-metadata">Service metadata example</a>
+<h2 id="rfc.section.3.1">
+<a href="#rfc.section.3.1">3.1</a> <a id="service-metadata"
href="#service-metadata">Service metadata example</a>
</h2>
-<p id="rfc.section.4.1.p.1">Example of metadata describing information
collected and stored in Measurement Archive service</p>
-<p id="rfc.section.4.1.p.2">
+<p id="rfc.section.3.1.p.1">Example of metadata describing information
collected and stored in Measurement Archive service</p>
+<p id="rfc.section.3.1.p.2">
<pre>
<nmwg:metadata xmlns:nmwg="http://ggf.org/ns/nmwg/base/2.0/"
id="m_ale-netutil-1">
@@ -780,11 +764,11 @@
</pre>
</p>
-<h2 id="rfc.section.4.2">
-<a href="#rfc.section.4.2">4.2</a> <a id="lookup-info"
href="#lookup-info">Lookup Information</a>
+<h2 id="rfc.section.3.2">
+<a href="#rfc.section.3.2">3.2</a> <a id="lookup-info"
href="#lookup-info">Lookup Information</a>
</h2>
-<p id="rfc.section.4.2.p.1">Example Lookup Information of Measurement
Archive. The metadata block contains basic service information and data
elements containing the metadata from the MA.</p>
-<p id="rfc.section.4.2.p.2">
+<p id="rfc.section.3.2.p.1">Example Lookup Information of Measurement
Archive. The metadata block contains basic service information and data
elements containing the metadata from the MA.</p>
+<p id="rfc.section.3.2.p.2">
<pre>
<nmwg:metadata
id="http://newcastle.pc.cis.udel.edu:6767/perfSONAR_PS/services/snmpMA">
@@ -823,14 +807,14 @@
</pre>
</p>
-<h2 id="rfc.section.4.3">
-<a href="#rfc.section.4.3">4.3</a> <a id="LSRing" href="#LSRing">LS
Ring File Structure</a>
+<h2 id="rfc.section.3.3">
+<a href="#rfc.section.3.3">3.3</a> <a id="LSRing" href="#LSRing">LS
Ring File Structure</a>
</h2>
-<p id="rfc.section.4.3.p.1">The LSRing file represents the "state" of the LS
cloud at either level of hierarchy (we avoid using the terms "global" and
"local" here since the hierarchy may be much larger). This file must start
with some static values, and will be added to/deleted from as time goes on.
As such implementations must ensure that this file is under database control
of some sort.</p>
-<h3 id="rfc.section.4.3.1">
-<a href="#rfc.section.4.3.1">4.3.1</a> <a id="LSRingLower"
href="#LSRingLower">LS Ring lower level</a>
+<p id="rfc.section.3.3.p.1">The LSRing file represents the "state" of the LS
cloud at either level of hierarchy (we avoid using the terms "global" and
"local" here since the hierarchy may be much larger). This file must start
with some static values, and will be added to/deleted from as time goes on.
As such implementations must ensure that this file is under database control
of some sort.</p>
+<h3 id="rfc.section.3.3.1">
+<a href="#rfc.section.3.3.1">3.3.1</a> <a id="LSRingLower"
href="#LSRingLower">LS Ring lower level</a>
</h3>
-<p id="rfc.section.4.3.1.p.1">
+<p id="rfc.section.3.3.1.p.1">
<pre>
<nmwg:store type="LSRing-lower">
@@ -848,6 +832,7 @@
</perfsonar:subject>
<nmwg:parameters>
<nmwg:parameter name="active">0</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -861,7 +846,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -875,7 +861,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">1</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -889,7 +876,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -897,10 +885,10 @@
</pre>
</p>
-<h3 id="rfc.section.4.3.2">
-<a href="#rfc.section.4.3.2">4.3.2</a> <a id="LSRingUpper"
href="#LSRingUpper">LS Ring upper level</a>
+<h3 id="rfc.section.3.3.2">
+<a href="#rfc.section.3.3.2">3.3.2</a> <a id="LSRingUpper"
href="#LSRingUpper">LS Ring upper level</a>
</h3>
-<p id="rfc.section.4.3.2.p.1">
+<p id="rfc.section.3.3.2.p.1">
<pre>
<nmwg:store type="LSRing-upper">
@@ -917,7 +905,8 @@
</psservice:service>
</perfsonar:subject>
<nmwg:parameters>
- <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -932,6 +921,7 @@
</perfsonar:subject>
<nmwg:parameters>
<nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">1</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -939,15 +929,15 @@
</pre>
</p>
-<h2 id="rfc.section.4.4">
-<a href="#rfc.section.4.4">4.4</a> <a id="LSControl-Join"
href="#LSControl-Join">LS Joining Message for Joining a Ring</a>
+<h2 id="rfc.section.3.4">
+<a href="#rfc.section.3.4">3.4</a> <a id="LSControl-Join"
href="#LSControl-Join">LS Joining Message for Joining a Ring</a>
</h2>
-<p id="rfc.section.4.4.p.1">This message exchange represents when a "new" LS
instance comes online. The LS will send these messages to its "list" of known
LS instances until it gets a hit. The message consists of metadata/data
pair(s) that contain service information and a parameter indicating "size" of
the data set the LS manages. This will be used for leader voting purposes
later.</p>
-<p id="rfc.section.4.4.p.2">The response message should indicate success or
failure via the eventType, and will contain metadata/data pair(s). The
metadata should indicate who the service is, and its "size" for voting
purposes. The data section is an enumeration of all of the current members of
the ring and their votes. This information gives the new member a snapshot of
the ring.</p>
-<h3 id="rfc.section.4.4.1">
-<a href="#rfc.section.4.4.1">4.4.1</a> <a id="LSControl-JoinRequest"
href="#LSControl-JoinRequest">Request</a>
+<p id="rfc.section.3.4.p.1">This message exchange represents when a "new" LS
instance comes online. The LS will send these messages to its "list" of known
LS instances until it gets a hit. The message consists of metadata/data
pair(s) that contain service information and a parameter indicating "size" of
the data set the LS manages. This will be used for leader voting purposes
later.</p>
+<p id="rfc.section.3.4.p.2">The response message should indicate success or
failure via the eventType, and will contain metadata/data pair(s). The
metadata should indicate who the service is. The data section is an
enumeration of all of the current members of the ring and their votes. This
information gives the new member a snapshot of the ring.</p>
+<h3 id="rfc.section.3.4.1">
+<a href="#rfc.section.3.4.1">3.4.1</a> <a id="LSControl-JoinRequest"
href="#LSControl-JoinRequest">Request</a>
</h3>
-<p id="rfc.section.4.4.1.p.1">
+<p id="rfc.section.3.4.1.p.1">
<pre>
<nmwg:message type="LSControlRequest">
@@ -970,10 +960,10 @@
</pre>
</p>
-<h3 id="rfc.section.4.4.2">
-<a href="#rfc.section.4.4.2">4.4.2</a> <a id="LSControl-JoinResponse"
href="#LSControl-JoinResponse">Response</a>
+<h3 id="rfc.section.3.4.2">
+<a href="#rfc.section.3.4.2">3.4.2</a> <a id="LSControl-JoinResponse"
href="#LSControl-JoinResponse">Response</a>
</h3>
-<p id="rfc.section.4.4.2.p.1">
+<p id="rfc.section.3.4.2.p.1">
<pre>
<nmwg:message type="LSControlResponse">
@@ -1018,15 +1008,15 @@
</pre>
</p>
-<h2 id="rfc.section.4.5">
-<a href="#rfc.section.4.5">4.5</a> <a id="LSControl-Token"
href="#LSControl-Token">LS Token Message</a>
+<h2 id="rfc.section.3.5">
+<a href="#rfc.section.3.5">3.5</a> <a id="LSControl-Token"
href="#LSControl-Token">LS Token Message</a>
</h2>
-<p id="rfc.section.4.5.p.1">This message exchange represents the token that
is passed between LS instances in a cloud. The message contains metadata/data
pair(s) wherein the Metadata is the sending LS's info, and the data contains
the contents of the LSRing file (lower or upper depending on the token we are
exchanging).</p>
-<p id="rfc.section.4.5.p.2">The response to this message should indicate
success or failure. Failure and timeouts should trigger a resend.</p>
-<h3 id="rfc.section.4.5.1">
-<a href="#rfc.section.4.5.1">4.5.1</a> <a id="LSControl-TokenRequest"
href="#LSControl-TokenRequest">Request</a>
+<p id="rfc.section.3.5.p.1">This message exchange represents the token that
is passed between LS instances in a cloud. The message contains metadata/data
pair(s) wherein the Metadata is the sending LS's info, and the data contains
the contents of the LSRing file (lower or upper depending on the token we are
exchanging).</p>
+<p id="rfc.section.3.5.p.2">The response to this message should indicate
success or failure. Failure and timeouts should trigger a resend.</p>
+<h3 id="rfc.section.3.5.1">
+<a href="#rfc.section.3.5.1">3.5.1</a> <a id="LSControl-TokenRequest"
href="#LSControl-TokenRequest">Request</a>
</h3>
-<p id="rfc.section.4.5.1.p.1">
+<p id="rfc.section.3.5.1.p.1">
<pre>
<nmwg:message type="LSControlRequest">
@@ -1056,6 +1046,7 @@
</perfsonar:subject>
<nmwg:parameters>
<nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -1073,10 +1064,10 @@
</pre>
</p>
-<h3 id="rfc.section.4.5.2">
-<a href="#rfc.section.4.5.2">4.5.2</a> <a id="LSControl-TokenResponse"
href="#LSControl-TokenResponse">Response</a>
+<h3 id="rfc.section.3.5.2">
+<a href="#rfc.section.3.5.2">3.5.2</a> <a id="LSControl-TokenResponse"
href="#LSControl-TokenResponse">Response</a>
</h3>
-<p id="rfc.section.4.5.2.p.1">
+<p id="rfc.section.3.5.2.p.1">
<pre>
<nmwg:message type="LSControlResponse">
@@ -1093,6 +1084,7 @@
<nmwg:eventType>http://perfsonar.net/services/LS/token/lower/success</nmwg:eventType>
<nmwg:parameters>
<nmwg:parameter name="active">1</nmwg:parameter>
+ <nmwg:parameter name="leader">0</nmwg:parameter>
</nmwg:parameters>
</nmwg:metadata>
@@ -1104,21 +1096,21 @@
</pre>
</p>
-<h2 id="rfc.section.4.6">
-<a href="#rfc.section.4.6">4.6</a> <a id="LSControl-Summary-lower"
href="#LSControl-Summary-lower">LS Summary Message (Lower)</a>
+<h2 id="rfc.section.3.6">
+<a href="#rfc.section.3.6">3.6</a> <a id="LSControl-Summary-lower"
href="#LSControl-Summary-lower">LS Summary Message (Lower)</a>
</h2>
-<p id="rfc.section.4.6.p.1">This message exchange represents when an LS
instance is holding the token and sharing summary information (lower scope).
The message consists of metadata/data pair(s) that contain service
information and a parameter indicating "size" of the data set the LS manages
as well as the minimal (without parameters) summary.</p>
-<p id="rfc.section.4.6.p.2">The response message should indicate success or
failure via the eventType, and will contain metadata/data pair(s). The
metadata should indicate who the service is, and its "size" for leader voting
purposes. The data section is message that can be used for logging.</p>
-<p id="rfc.section.4.6.p.3">When receiving the message, check your 'lower'
list and update it as needed for: </p>
+<p id="rfc.section.3.6.p.1">This message exchange represents when an LS
instance is holding the token and sharing summary information (lower scope).
The message consists of metadata/data pair(s) that contain service
information as well as the minimal (without parameters) summary.</p>
+<p id="rfc.section.3.6.p.2">The response message should indicate success or
failure via the eventType, and will contain metadata/data pair(s). The
metadata should indicate who the service is. The data section is a message that
can be used for logging.</p>
+<p id="rfc.section.3.6.p.3">When receiving the message, check your 'lower'
list and update it as needed for: </p>
<dl class="empty">
<dd>Do you know of this service? If so make sure the vote and other info is
ok.</dd>
<dd>Update the summary info in your collection</dd>
<dd>If you don't know of them, add them!</dd>
</dl>
-<h3 id="rfc.section.4.6.1">
-<a href="#rfc.section.4.6.1">4.6.1</a> <a
id="LSControl-Summary2Request" href="#LSControl-Summary2Request">Request</a>
+<h3 id="rfc.section.3.6.1">
+<a href="#rfc.section.3.6.1">3.6.1</a> <a
id="LSControl-Summary2Request" href="#LSControl-Summary2Request">Request</a>
</h3>
-<p id="rfc.section.4.6.1.p.1">
+<p id="rfc.section.3.6.1.p.1">
<pre>
<nmwg:message type="LSControlRequest">
@@ -1152,10 +1144,10 @@
</pre>
</p>
-<h3 id="rfc.section.4.6.2">
-<a href="#rfc.section.4.6.2">4.6.2</a> <a
id="LSControl-Summary2Response"
href="#LSControl-Summary2Response">Response</a>
+<h3 id="rfc.section.3.6.2">
+<a href="#rfc.section.3.6.2">3.6.2</a> <a
id="LSControl-Summary2Response"
href="#LSControl-Summary2Response">Response</a>
</h3>
-<p id="rfc.section.4.6.2.p.1">
+<p id="rfc.section.3.6.2.p.1">
<pre>
<nmwg:message type="LSControlResponse">
@@ -1181,21 +1173,21 @@
</pre>
</p>
-<h2 id="rfc.section.4.7">
-<a href="#rfc.section.4.7">4.7</a> <a id="LSControl-Summary-upper"
href="#LSControl-Summary-upper">LS Summary Message (Upper)</a>
+<h2 id="rfc.section.3.7">
+<a href="#rfc.section.3.7">3.7</a> <a id="LSControl-Summary-upper"
href="#LSControl-Summary-upper">LS Summary Message (Upper)</a>
</h2>
-<p id="rfc.section.4.7.p.1">This message exchange represents when an LS
instance is holding the token and sharing summary information. The message
consists of metadata/data pair(s) that contain service information and a
parameter indicating "size" of the data set the LS manages. The "data"
portion is the summary info (FORMAT TBD!!!)</p>
-<p id="rfc.section.4.7.p.2">The response message should indicate success or
failure via the eventType, and will contain metadata/data pair(s). The
metadata should indicate who the service is, and its "size" for leader voting
purposes. The data section is message that can be used for logging.</p>
-<p id="rfc.section.4.7.p.3">When receiving the message, check your 'lower'
list and update it as needed for: </p>
+<p id="rfc.section.3.7.p.1">This message exchange represents when an LS
instance is holding the token and sharing summary information. The message
consists of metadata/data pair(s) that contain service information. The
"data" portion is the summary info.</p>
+<p id="rfc.section.3.7.p.2">The response message should indicate success or
failure via the eventType, and will contain metadata/data pair(s). The
metadata should indicate who the service is. The data section is a message that
can be used for logging.</p>
+<p id="rfc.section.3.7.p.3">When receiving the message, check your 'lower'
list and update it as needed for: </p>
<dl class="empty">
<dd>Do you know of this service? If so make sure the vote and other info is
ok.</dd>
<dd>Update the summary info in your collection</dd>
<dd>If you don't know of them, add them!</dd>
</dl>
-<h3 id="rfc.section.4.7.1">
-<a href="#rfc.section.4.7.1">4.7.1</a> <a id="LSControl-SummaryRequest"
href="#LSControl-SummaryRequest">Request</a>
+<h3 id="rfc.section.3.7.1">
+<a href="#rfc.section.3.7.1">3.7.1</a> <a id="LSControl-SummaryRequest"
href="#LSControl-SummaryRequest">Request</a>
</h3>
-<p id="rfc.section.4.7.1.p.1">
+<p id="rfc.section.3.7.1.p.1">
<pre>
<nmwg:message type="LSControlRequest">
@@ -1217,8 +1209,6 @@
<summary:subject
xmlns:summary="http://ggf.org/ns/nmwg/summary/2.0/">
<nmtl3:network>
<nmtl3:ipAddress>128.4.10.0/16</nmtl3:ipAddress>
- <!-- Optional ASN -->
- <nmtl3:asn>666</nmtl3:asn>
</nmtl3:network>
</summary:subject>
<nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
@@ -1226,28 +1216,16 @@
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
</nmwg:metadata>
- <nmwg:metadata>
- <nmtopo:subject>
- <nmtopo:node>
- <nmtopo:location>
- <nmtopo:country>USA</nmtopo:country>
- </nmtopo:location>
- </nmtopo:node>
- </nmtopo:subject>
-
<nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
-
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
-
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
- </nmwg:metadata>
</nmwg:data>
</nmwg:message>
</pre>
</p>
-<h3 id="rfc.section.4.7.2">
-<a href="#rfc.section.4.7.2">4.7.2</a> <a
id="LSControl-SummaryResponse" href="#LSControl-SummaryResponse">Response</a>
+<h3 id="rfc.section.3.7.2">
+<a href="#rfc.section.3.7.2">3.7.2</a> <a
id="LSControl-SummaryResponse" href="#LSControl-SummaryResponse">Response</a>
</h3>
-<p id="rfc.section.4.7.2.p.1">
+<p id="rfc.section.3.7.2.p.1">
<pre>
<nmwg:message type="LSControlResponse">
@@ -1273,15 +1251,15 @@
</pre>
</p>
-<h2 id="rfc.section.4.8">
-<a href="#rfc.section.4.8">4.8</a> <a id="LSControl-Leader"
href="#LSControl-Leader">LS Leader Message</a>
+<h2 id="rfc.section.3.8">
+<a href="#rfc.section.3.8">3.8</a> <a id="LSControl-Leader"
href="#LSControl-Leader">LS Leader Message</a>
</h2>
-<p id="rfc.section.4.8.p.1">This message exchange will be conducted between
the Leader and Vice-Leader on some (frequent) interval. It may even become a
part of the Leader's token exchange with the Upper Level.</p>
-<p id="rfc.section.4.8.p.2">The leader identifies itself, and sends down the
summaries from the upper level for the Vice-Leader to store. If the leader
should die, the vice leader will have a summary of the upper level and be
able to continue answering lower level queries and obtaining information from
the higher levels.</p>
-<h3 id="rfc.section.4.8.1">
-<a href="#rfc.section.4.8.1">4.8.1</a> <a id="LSControl-LeaderRequest"
href="#LSControl-LeaderRequest">Request</a>
+<p id="rfc.section.3.8.p.1">This message exchange will be conducted between
the Leader and Vice-Leader on some (frequent) interval. It may even become a
part of the Leader's token exchange with the Upper Level.</p>
+<p id="rfc.section.3.8.p.2">The leader identifies itself, and sends down the
summaries from the upper level for the Vice-Leader to store. If the leader
should die, the vice leader will have a summary of the upper level and be
able to continue answering lower level queries and obtaining information from
the higher levels.</p>
+<h3 id="rfc.section.3.8.1">
+<a href="#rfc.section.3.8.1">3.8.1</a> <a id="LSControl-LeaderRequest"
href="#LSControl-LeaderRequest">Request</a>
</h3>
-<p id="rfc.section.4.8.1.p.1">
+<p id="rfc.section.3.8.1.p.1">
<pre>
<nmwg:message type="LSControlRequest">
@@ -1320,28 +1298,14 @@
<nmwg:metadata>
<summary:subject
xmlns:summary="http://ggf.org/ns/nmwg/summary/2.0/">
<nmtl3:network>
- <nmtl3:subnet>128.4.10.0</nmtl3:subnet>
- <nmtl3:netmask>255.255.255.0</nmtl3:netmask>
- <nmtl3:asn>666</nmtl3:asn>
+ <nmtl3:ipAddress>128.4.10.0/16</nmtl3:ipAddress>
</nmtl3:network>
</summary:subject>
<nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
</nmwg:metadata>
-
- <nmwg:metadata>
- <nmtopo:subject>
- <nmtopo:node>
- <nmtopo:location>
- <nmtopo:country>USA</nmtopo:country>
- </nmtopo:location>
- </nmtopo:node>
- </nmtopo:subject>
-
<nmwg:eventType>http://ggf.org/ns/nmwg/tools/snmp/2.0</nmwg:eventType>
-
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/utilization/2.0</nmwg:eventType>
-
<nmwg:eventType>http://ggf.org/ns/nmwg/characteristic/errors/2.0</nmwg:eventType>
- </nmwg:metadata>
+
</nmwg:data>
<!-- we could go on and on like that... -->
@@ -1352,10 +1316,10 @@
</pre>
</p>
-<h3 id="rfc.section.4.8.2">
-<a href="#rfc.section.4.8.2">4.8.2</a> <a id="LSControl-LeaderResponse"
href="#LSControl-LeaderResponse">Response</a>
+<h3 id="rfc.section.3.8.2">
+<a href="#rfc.section.3.8.2">3.8.2</a> <a id="LSControl-LeaderResponse"
href="#LSControl-LeaderResponse">Response</a>
</h3>
-<p id="rfc.section.4.8.2.p.1">
+<p id="rfc.section.3.8.2.p.1">
<pre>
<nmwg:message type="LSControlRequest">
@@ -1366,14 +1330,14 @@
</pre>
</p>
-<h2 id="rfc.section.4.9">
-<a href="#rfc.section.4.9">4.9</a> <a id="LSControl-Discovery"
href="#LSControl-Discovery">LS Discovery Message</a>
+<h2 id="rfc.section.3.9">
+<a href="#rfc.section.3.9">3.9</a> <a id="LSControl-Discovery"
href="#LSControl-Discovery">LS Discovery Message</a>
</h2>
-<p id="rfc.section.4.9.p.1">Structure of the LSDiscovery Message used to
locate info-sets. (FORMAT TBD!!!)</p>
-<h3 id="rfc.section.4.9.1">
-<a href="#rfc.section.4.9.1">4.9.1</a> <a id="LSDiscoveryRequest"
href="#LSDiscoveryRequest">Request</a>
+<p id="rfc.section.3.9.p.1">Structure of the LSDiscovery Message used to
locate info-sets.</p>
+<h3 id="rfc.section.3.9.1">
+<a href="#rfc.section.3.9.1">3.9.1</a> <a id="LSDiscoveryRequest"
href="#LSDiscoveryRequest">Request</a>
</h3>
-<p id="rfc.section.4.9.1.p.1">
+<p id="rfc.section.3.9.1.p.1">
<pre>
<nmwg:message type="LSDiscoveryRequest">
@@ -1390,10 +1354,10 @@
</pre>
</p>
-<h3 id="rfc.section.4.9.2">
-<a href="#rfc.section.4.9.2">4.9.2</a> <a id="LSDiscoveryResponse"
href="#LSDiscoveryResponse">Response</a>
+<h3 id="rfc.section.3.9.2">
+<a href="#rfc.section.3.9.2">3.9.2</a> <a id="LSDiscoveryResponse"
href="#LSDiscoveryResponse">Response</a>
</h3>
-<p id="rfc.section.4.9.2.p.1">
+<p id="rfc.section.3.9.2.p.1">
<pre>
<nmwg:message type="LSDiscoveryResponse">
@@ -1431,20 +1395,11 @@
</pre>
</p>
<hr class="noprint">
-<h1 id="rfc.section.5" class="np">
-<a href="#rfc.section.5">5.</a> <a id="codes" href="#codes">Result
codes</a>
+<h1 id="rfc.section.4" class="np">
+<a href="#rfc.section.4">4.</a> <a id="apdx" href="#apdx">Appendices</a>
</h1>
-<ul>
-<li>error.ls.foo -</li>
-<li>success.ls.foo -</li>
-<li>TBD</li>
-</ul>
-<hr class="noprint">
-<h1 id="rfc.section.6" class="np">
-<a href="#rfc.section.6">6.</a> <a id="apdx" href="#apdx">Appendices</a>
-</h1>
-<h2 id="rfc.section.6.1">
-<a href="#rfc.section.6.1">6.1</a> <a id="glossary"
href="#glossary">Glossary</a>
+<h2 id="rfc.section.4.1">
+<a href="#rfc.section.4.1">4.1</a> <a id="glossary"
href="#glossary">Glossary</a>
</h2>
<ul>
<li>AuthoritativeLS - LS that is an authority for the perfSONAR services in
question. AuthoritativeLS is a result of discovery phase and can be used in
the metadata query phase.</li>
@@ -1466,7 +1421,7 @@
<li>XQuery - A query language (with some programming language features) that
is designed to query collections of XML data. It is semantically similar to
SQL.</li>
</ul>
<h1 class="np" id="rfc.references">
-<a href="#rfc.section.7">7.</a> References</h1>
+<a href="#rfc.section.5">5.</a> References</h1>
<table summary="References" border="0" cellpadding="2">
<tr>
<td class="topnowrap">
Modified: trunk/nmwg/doc/dLS/dLS.pdf
===================================================================
(Binary files differ)
Modified: trunk/nmwg/doc/dLS/dLS.xml
===================================================================
--- trunk/nmwg/doc/dLS/dLS.xml 2007-12-14 11:12:29 UTC (rev 308)
+++ trunk/nmwg/doc/dLS/dLS.xml 2007-12-14 12:22:22 UTC (rev 309)
@@ -78,36 +78,31 @@
<t>
<list style="symbols">
+ <t>Scope - to enable a hierarchy of systems, some form of scoping
+ must exist that defines local and remote communication groups. Issues
+ regarding initial discovery, "bootstrapping", are discussed in the
+ context of domain-specific issues.
+ </t>
<t>Summarization - to reduce the amount of information sent over the
-
network or to anonymize sensitive data, some form of data
reduction
must take place.
</t>
- <t>Scope - to enable a hierarchy of systems, some form of scoping
- must exist that defines local and remote communication groups.
- </t>
<t>Search - information location is key and the way in which
distributed location and search is handled is the crux of this
service.
</t>
</list>
</t>
-
- <t>
- Additionally we present solutions to issues necessary to allow
effective
- operation of this service including bootstrapping (i.e. how service
finds
- other parts of the system) and domain-specific concerns.
- </t>
+
</section>
<section anchor="system" title="System Specific Operation">
-<!-- Overivew Section -->
+<!-- Overview Section -->
<section anchor="overview" title="Overview">
<t>The first step of information flow is when a pS service registers with
an LS. The service may know the name of an LS via static
- configuration (the most common case for legacy deployments), or
- other forms of bootstrapping such as multicast may occur. A
service
+ configuration (the most common case for legacy deployments). A service
registers a "service metadata" record about itself and full
metadata (i.e. containing all information such as subject,
eventType(s), and any parameters, see
@@ -128,13 +123,12 @@
<t>The architecture of the dLS protocol assumes the existence of logical
groups of LS instances. The architecture should allow for
multiple
- levels of these rings representing multiple splits in a
hierarchy,
- although the basic example that will be an ongoing theme
in this document
- will revolve around only 2 levels. The authors realize it
is impossible
- to predict how the hierarchy of this service may split
over time,
- therefore we avoid using language that directly
categorizes a ring into
- a specific role. In general the two rings that define
scope are
- 'lower' and 'upper'.
+ levels representing splits in a hierarchy, although the basic example
+ in this document will revolve around only 2 levels. The authors
+ realize it is impossible to predict how the hierarchy of this service
+ may split over time, therefore we avoid using language that directly
+ categorizes a ring into a specific role. In general the two levels
+ that define scope are "lower" and "upper".
</t>
<t>
@@ -144,13 +138,13 @@
goal of perfSONAR is to ease the detection of end to end
performance
issues particularly across domain boundaries, therefore
communication
between domain LS instances is paramount. We assume for
this example
- that the 'top' most level is that of the domain; further
fragmentation
- by other factors such as the 'top-level domain' or
geographical
+ that the "top" most level is that of the domain; further
fragmentation
+ by other factors such as the "top-level domain" or
geographical
considerations are probable, just not of interest in this
work. A
single domain may have multiple LS deployments; a
representative
- 'leader' from this set will represent the 'upper'
(intra-domain) scope
+ "leader" from this set will represent the "upper"
(intra-domain) scope
and communicate with similar LS instances of other domains
in this case.
- The actual registered services of the LS represent the
'lower' (local,
+ The actual registered services of the LS represent the
"lower" (local,
or in many cases inter-domain) scope.
</t>
@@ -158,11 +152,11 @@
The scoping designations are important to the next stage: data
reduction.
We observe that the abundance of information available via the
original
metadata description is rather obtuse when it comes to answering
a
- simple (and common) query such as 'give me bandwidth data for
host x'.
+ simple (and common) query such as "give me bandwidth data for
host x".
Although information such as capacity or interface name is
valuable internal
to a domain, it does not serve much purpose to NOC staff simply
asking to
- see utilization of a link. We propose a 'summarization'
strategy based
- on 'distance' from the source that will distill the complete
metadata into
+ see utilization of a link. We propose a "summarization"
strategy based
+ on "distance" from the source that will distill the complete
metadata into
smaller and smaller sets as the information is passed through
the scope
hierarchy.
</t>
@@ -171,28 +165,35 @@
Finally, using the scoping and summarizing steps we come to
final, and
arguably most important phase: search. Search must rely on two
phases
to work efficiently in the dLS framework, namely discovery and
query.
- The first step is locating 'where' information can be found.
This
+ The first step is locating "where" information can be found.
This
involves asking semi direct questions regarding the well defined
network
- topology in order to locate the 'vicinity' of data. The query
phase will
+ topology in order to locate the "vicinity" of data. The query
phase will
then ask more esoteric questions once it locates the proper LS
instances
to ask. The discovery phase is made possible through the
process of
summarization, while the query phase remains similar to the
current LS
functionality.
</t>
-l
+
</section>
+
+
+
+
+
<section anchor="scope" title="Scope Formation">
- <t>The next question is how to form the hierarchy of LS instances and
- subsequently organize the 'scopes'. The simplest answer is that the
- highest scope be formed based on the domain name of the participating
- systems as mentioned in the previous examples. That would allow e.g.
- internet2.edu, geant2.net, and pionier.gov.pl to potentially operate
more
- than one LS instance inside their own domains (for performance and
- scalability.) As LS instances come online they will invoke
- bootstrapping procedures to find and join a lower scoped group
- first.
+ <t>The first aspect of this system is how to form the hierarchy of LS
+ instances and subsequently organize the scopes. The simplest answer
+ is that the highest scope be formed based on the domain name of the
+ participating systems as mentioned in the previous examples. That
+ would allow e.g. internet2.edu, geant2.net, and pionier.gov.pl to
+ potentially operate more than one LS instance inside their own domains
+ (for performance and scalability.) As LS instances come online they
+ will invoke bootstrapping procedures to find and join a lower scoped
+ group first. The LS that a service contacts to register becomes the
+ "Home LS" (HLS, see <xref target="glossary" />) of that particular
+ service.
</t>
<t>The scopes should be named based on URIs. This will allow a
domain-level scope to take the form
@@ -211,9 +212,10 @@
</t>
<t>
<list style="symbols">
- <t>Join Procedure</t>
- <t>Token Passing</t>
- <t>Summarization Notification</t>
+ <t><xref target="join" /> - Join Procedure</t>
+ <t><xref target="tokens" /> - Token Passing</t>
+ <t><xref target="summary_messages" /> - Summarization Notification</t>
+
</list>
</t>
<t>
@@ -226,48 +228,37 @@
<section anchor="join" title="Join Procedure">
<t>
- When an LS instance comes online it will have some bootstrapping
- knowledge of potential peers (both inter and intra domain). This
- information is contained in LSRing file (see
- <xref target="LSRing" />). The inter-domain knowledge is used
first
- to establish a connection to an already in progress ring, or
perhaps
- to start a ring that may not exist yet.
+ When an LS instance comes online it must have some knowledge of
+ potential peers (both inter and intra domain). This information
+ is contained in LSRing files (see <xref target="LSRing" />). The
+ inter-domain knowledge (i.e. "lower") is used first to establish a
+ connection to an already in progress ring, or perhaps to start a
+ ring that may not exist yet.
</t>
<t>
- A candidate LS will continuously search its LSRing information
and
- send an LSControl message to its known LS instances with a
"join" eventType
- (see <xref target="LSControl-Join" />) until a successful
response
- is seen. The LS candidate will then search through
- the successful LSControlResponse to this message and update its
LSRing with
- the returned information. This can mean updating the "active"
parameter
- as well as adding new LS instances. This parameter is
indicative of
- the "live-ness" (i.e. were we successful in contacting it
recently).
- The contacted LS will also update the local copy of LSRing to
add the
- new member to its "available" list.
+ A joining LS will continuously search the LSRing information and
+ send an LSControl message to known LS instances with a "join"
+ eventType (see <xref target="LSControl-Join" />) until a successful
+ response is seen. The LS candidate will then search through
+ the successful LSControlResponse to this message and update its LSRing
+ information with the returned information. This can mean updating
+ the "active" and "leader" parameters as well as adding new LS
+ instances. The first parameter is indicative of the "live-ness"
+ (i.e. whether we were successful in contacting it recently); the second is
+ used to indicate who the current "leader" of this group is. These
+ dynamic variables will be constantly changing.
</t>
-
+
<t>
- For security purposes, it is necessary for the members of the LSRing
to know that a new member
- has joined without that member authenticating pairwise with each other
member of the ring. To
- accomplish this, the initially contacted LS will request that the
current ring leader initiate
- a token rotation to allow all members to update their LSRing list.
-
-
+ For security purposes, it is necessary for the members of the LSRing to
+ know that a new member has joined without that member authenticating
+ pairwise with each other member of the ring. To accomplish this, the
+ initially contacted LS will request that the current ring leader initiate
+ a token rotation to allow all members to update their LSRing files, thus
+ allowing instant recognition of the new member.
</t>
+
<t>
- After updating, the newly joined LS will broadcast another
LSControl message with
- a "summary" eventType (see <xref
target="LSControl-Summary-lower" />,
- or if we are dealing with the upper level see
- <xref target="LSControl-Summary-upper" />) to all of the
"active"
- LS instances from its LSRing. Again the responses will be
parsed
- to get any useful updated information. At the end of this
process
- the joining LS will possess an LSRing file reflecting the state
of the
- dLS cloud. Each of the recipient LS instances which hasn't
heard
- anything from this joining LS previously will do the same,
- including adding this new member to their own lists (as they
- didn't know of it's existence yet).
- </t>
- <t>
After this initial warm-up the LS will observe the rules of token
etiquette and remain silent until it is contacted with a token,
or
it has not seen one in a very long time (see <xref
target="tokens" />).
@@ -298,14 +289,21 @@
<t>
<list type="symbols">
- <t>1. LS1 (candidate to the ring) sends join
(http://perfsonar.net/services/LS/join eventType) request to LS2 (member of
the ring). LS2 receives join message from LS1 and decides whether to accept
it or not. Application of security policy may occur here.</t>
- <t>2. LS2 rejects join request from LS1 and responses with proper
error code</t>
+ <t>1. LS1 (candidate to the ring) sends join
+ (http://perfsonar.net/services/LS/join eventType) request to LS2
+ (member of the ring). LS2 receives join message from LS1 and decides
+ whether to accept it or not. Application of security policy may occur
+ here.</t>
+ <t>2. LS2 rejects the join request from LS1 and responds with the proper
+ error code.</t>
</list>
</t>
+<!-- update this -->
+
<t>
<figure anchor="join-example-acc">
- <preamble>Illustration of LS Join Algorithm (rejected)</preamble>
+ <preamble>Illustration of LS Join Algorithm (accepted)</preamble>
<artwork>
|==========LS Ring=========|
@@ -342,32 +340,60 @@
</figure>
</t>
+<!-- change names of tokens (too many currently) -->
+
<t>
<list type="symbols">
- <t>1. LS1 (candidate to the ring) sends join
(http://perfsonar.net/services/LS/join eventType) request to LS2 (member of
the ring). LS2 receives join message from LS1 and decides whether to accept
it or not. Application of security policy may occur here.</t>
- <t>2. LS2 accepts join request from LS1 and responses with success
code and LSRing content. LS2 will be waiting for send-summary request
(http://perfsonar.net/services/LS/send-summary eventType) </t>
- <t>3. LS2 sends send-update-token
(http://perfsonar.net/services/LS/send-update-token eventType) to LS3 (the
leader of the ring). Send-update-token contain the URL of LS1. LS3 updates
its LSRing with URL of LS1.</t>
- <t>4. LS3 immediately sends update-token
(http://perfsonar.net/services/LS/update-token) to next peer from LSRing.
Update-token contains updated LSRing.</t>
- <t>5. LS2 receives update-token, updates its LSRing and immediately
sends update-token to the next peer</t>
- <t>6. After full cycle of update-token LS3 receives own
update-token. Now all ring members have knowledge about newly joined LS1 and
can accept summary from LS1.</t>
- <t>7. LS3 responses for request mentioned in step 3. LS2 receives an
acknowledgement (result code) of update-token operation.</t>
- <t>8. If update-token was accomplished succesfuly, LS2 sends
send-summary request (http://perfsonar.net/services/LS/send-summary
eventType) to LS1</t>
- <t>9. LS1 sends summary to all peers in the LSRing. Now all members
of the LSRing have the summary information from LS1.</t>
+ <t>1. LS1 (candidate to the ring) sends join
+ (http://perfsonar.net/services/LS/join eventType) request to LS2
+ (member of the ring). LS2 receives join message from LS1 and decides
+ whether to accept it or not. Application of security policy may occur
+ here.</t>
+
+ <t>2. LS2 accepts the join request from LS1 and responds with a success
+ code and the LSRing content.</t>
+
+ <t>3. LS2 sends send-update-token
+ (http://perfsonar.net/services/LS/send-update-token eventType) to LS3
+ (the leader of the ring). Send-update-token contains the URL of LS1.
+ LS3 updates its LSRing with the URL of LS1.</t>
+
+ <t>4. LS3 immediately sends update-token
+ (http://perfsonar.net/services/LS/update-token) to the next peer from
+ LSRing. Update-token contains the updated LSRing information. This will be
+ exchanged between all peers. This message should not be marked
+ as a duplicate or dropped by any members of the ring.</t>
+
+ <t>5. LS3 sends a leader election token immediately after this to
+ trigger a new election cycle. This will also be exchanged between all
+ peers.</t>
+
+ <t>6. After a full cycle of update-token, LS3 receives the update-token
+ (identified by messageIdRef) and drops the token.
+ Now all ring members have knowledge about the newly joined LS1.</t>
+
+ <t>7. After a full cycle of leader election, a new leader is known.
+ This new leader may start the backup leader election.</t>
+
+ <t>8. Regular token exchange and summarization notification will
+ resume in time.</t>
+
</list>
</t>
-
- <t>The algorithm could be simplified by moving response from step 2 to
step 8. However then, LS1 may be waiting for quite a long time without any
reponse and communication time may pass. Such a simplification should be
taken under consideration after testing.</t>
</section>
</section>
+
+
<section anchor="tokens" title="Token Messages for Control and Election">
- <t>When scopes are created they form themselves into logical rings
around which tokens can be
- passed. These token passing mechanism is used for two purposes, for
registration control and
- for leader election. A leader is necessary to circulate group
updates, to start tokens to
- initiate registration and to represent a given scope in an upper
scope.</t>
+ <t>When scopes are created they form themselves into logical groups around
+ which tokens can be passed. This token passing mechanism is used for
+ two purposes: registration control and leader election. A leader
+ is necessary to circulate group updates, to start tokens to initiate
+ registration and to represent a given scope in an upper scope.</t>
<t>
The "token" is an LSControlMessage (see <xref
target="LSControl-Token" />)
@@ -385,16 +411,15 @@
<t>
The essential idea in the token passing mechanism for leader election
is
that some identifier is chosen for each node and that the node with the
- highest (or lowest) identifier win the election and becomes the leader.
+ highest identifier wins the election and becomes the leader.
The basic mechanism of leader election is that participants form
- a logical ring and initiate an election. An election can be
- initiated when a new machine joins, at system start time, or when
- a host feels that the leader may have failed based on failure to
+ a logical ring and initiate an election. An election should be
+ initiated when a new machine joins, at system start time, and when
+ any host feels that the leader may have failed based on failure to
receive a periodic token. When an election is initiated, the
- initiating host sends an election message to its
- counter-clockwise neighbor and changes its state to “ELECTING”.
- It places its identifier inside the message. The ultimate goal is
- for the host with the highest identifier to be chosen. When a
+ initiating host sends an election message to a
+ neighbor (as specified in the token order). The ultimate
+ goal is for the host with the highest identifier to be chosen. When a
host receives an election message, it compares its identifier
with that in the message. It forwards the higher of the
identifiers. When a node receives a message with its own
@@ -404,20 +429,17 @@
<t>
The next question is how to choose the identifier for a given node.
- There still needs to be some discussion here. The first proposal was
to
- use the IP address of the node as the lower-order 32-bits of a 64-bit
- number and to allow the higher-order bits to be set as a "priority"
- field. This would effectively allow a system administrator to make
sure
- that her most powerful or well-connected nodes became the leader when
- they were available. In the absence of a priority, the nodes
essentially
- are randomly ordered.
+ Using the IP address of the node as the lower-order 32 bits of a 64-bit
+ number and allowing the higher-order bits to be set as a "priority"
+ field is a simple solution to this problem. This would effectively allow
+ a system administrator to make sure that her most powerful or
+ well-connected nodes became the leader when they were available. In the
+ absence of a priority, the nodes are essentially randomly ordered.
</t>
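The identifier layout described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `make_identifier` is hypothetical, the address is assumed to be IPv4, and the priority is assumed to occupy the full upper 32 bits. None of these details are fixed by the document.

```python
import ipaddress


def make_identifier(ip: str, priority: int = 0) -> int:
    """Pack a priority (upper 32 bits) and an IPv4 address (lower 32 bits).

    Hypothetical helper: the document only describes the bit layout,
    not a concrete function or encoding.
    """
    if not 0 <= priority < 2**32:
        raise ValueError("priority must fit in 32 bits")
    # IPv4Address converts to its 32-bit integer value.
    return (priority << 32) | int(ipaddress.IPv4Address(ip))
```

With this layout any node with a higher priority outranks every node with a lower one, whatever their addresses; nodes with equal priority are ordered by IP address alone.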
<t>The Vice-leader will be elected via the same mechanism, initiated
by the current leader,
- with the current leader excluded.</t>
-
- <t>
- The Leader and Vice-Leader LS instances should exchange messages
+ with the current leader excluded. The Leader and Vice-Leader LS instances
+ should exchange messages
(see <xref target="LSControl-Leader" />)
periodically to ensure that in the event of a failure the lower
level will still have a link to the upper level. A Vice-Leader
@@ -458,16 +480,32 @@
<t>
<list type="symbols">
<t>1. LS2 decides to initiate election, </t>
- <t>2. LS2 changes its state to ELECTING</t>
- <t>3. LS2 sends election message (with its identifier) to LS3</t>
- <t>4. LS3 receives election message with identifier of LS1. Its
own identifier is higher, so it sends election message to next peer LS1 with
ist own identifier.</t>
- <t>5. LS1 receives election message with identifier of LS3. Its
own identifier is lower, so it sends election message to next peer LS2 with
identifier of LS3.</t>
- <t>6. LS2 receives election message with identifier of LS3,
election finishes. LS2 knows the leader is LS3. LS2 disable ELECTING
state.</t>
+ <t>2. LS2 sends election message (with its identifier) to LS3</t>
+ <t>3. LS3 receives election message with an identifier. Its own
+ identifier is higher, so it sends election message to next peer LS1
+ replacing the identifier in the original with its own.</t>
+ <t>4. LS1 receives election message with an identifier. Its own
+ identifier is lower, so it sends election message unchanged to next
+ peer LS2.</t>
+
+ <t>5. LS2 receives election message with an identifier. Since its
+ own identifier is lower, it sends election message unchanged to next
+ peer LS3.
+ </t>
+
+ <t>6. LS3 receives election message with its own identifier, election
+ finishes. LS3 knows it is the leader.</t>
</list>
</t>
- <t>Vice-leader election may be done using the same algorithm. Then
the election message should contain two identifiers: Leader ID (the highest
identifier) and Vice-Leader ID (the second highest identifier).</t>
-
- <t>======== MG: as far as I understand, all members of the ring
initiate own election, the result is always the same (deterministic). If a
peer initiates election and receives election message with own identifier, it
means that it is the new leader and should send new token, right? Maybe after
election, the peer that wasn't elected, should inform the new leader? Or
maybe another message passing over the ring is required (this could be done
with two states ELECTING and POST-ELECTION or whatever).</t>
+ <t>Vice-leader election will then be started by the new leader, starting
+ with a "0" for the Identifier and stopping the election once it sees the
+ token once again. The leader will then communicate backup information
+ with this vice leader. It is not important for the rest of the ring
+ to know who the vice leader may be.</t>
+
+<!--
+JZ: Cleaned up explanation to better reflect reality.
+-->
+
</section>
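The election walk-through in this section can be simulated with a short sketch. It assumes unique identifiers and a fixed token order; the function name `run_election` and the in-memory list standing in for the ring are illustrative only, not part of the dLS message format.

```python
def run_election(ring, ids, initiator):
    """Simulate one election pass around the ring.

    ring      -- node names in token order (hypothetical representation)
    ids       -- mapping of node name to its unique identifier
    initiator -- node that starts the election with its own identifier
    Returns (leader, number_of_messages_sent).
    """
    pos = ring.index(initiator)
    carried = ids[initiator]  # identifier placed in the election message
    hops = 0
    while True:
        pos = (pos + 1) % len(ring)  # deliver the message to the next peer
        node = ring[pos]
        hops += 1
        if ids[node] == carried:
            # The node sees its own identifier: it is the leader.
            return node, hops
        # Otherwise forward the higher of the two identifiers.
        carried = max(carried, ids[node])
```

For a ring ordered LS2, LS3, LS1 in which LS3 holds the highest identifier and LS2 initiates, the message travels LS2 to LS3, LS3 to LS1, LS1 to LS2 and LS2 to LS3, at which point LS3 recognizes its own identifier.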
<section title="Token Passing for Registration Control">
@@ -479,36 +517,9 @@
will be parsed to get any useful updated information about
current
dLS cloud state.
</t>
-
- <!-- this is wrong, I think.
- <t>
- The holder of the token, after completing summarization, will
wait
- some pre-determined amount of time before sending the token to
the
- next LS instance. In general the LS instances should not be
overly
- sensitive to the progression of the token. If each LS instance
is
- monitoring the progress, and for some reason we have lost the
- token it may start a flurry of retransmits and drops that will
take
- cycles to calm down again. Thus we leave the decisions
regarding
- tokens up to a single node, namely the designated leader of a
scope.
- We build functionality into leader nodes to be the only "maker"
and "executioner" of
- tokens. Extra tokens are dropped/created only by a single node
in
- the ring. All strange thrashing behavior is avoided and if
- something bad happens it is eliminated in a single passing. The
- leader node will have knowledge of the size of the ring (even if
- the ring has grown our join algorithm will inform all interested
- parties instantly) and the token "wait" period (should be a
- standard value) thus calculating the expected time is not an
issue.
- </t>
- -->
-
-
-
+
<section anchor="token_passing_algorithm" title="Token Passing
Algorithm">
-<!--
- <section anchor="tokenpass_algorithm" title="Toking Passing
Algorithm">
-
- <t>
--->
+
<figure anchor="token-passing-example">
<preamble>Illustration of Token Passing Algorithm</preamble>
@@ -540,39 +551,46 @@
<t>1. LS1 receives the token i.e. LSControlRequest message
with the http://perfsonar.net/services/LS/token/
eventType from its predecessor LS3.</t>
- <t>2. LS1 updates its 'lower' peer list based on token content. The
local peer list is replaced by the one received in token</t>
+ <t>2. LS1 updates its peer list based on token content. The
+ peer list is replaced by the one received in token</t>
<t>3. LS1 sends LSControlRequest message with the
http://perfsonar.net/services/LS/summary/ eventType
- to all peers in the lease (excluding itself).</t>
+ to all peers in the list (excluding itself).</t>
<t>4. LS2 and LS3, on receiving this message, each check their collection
and update it if necessary with service info.</t>
- <t>5. LS1 waits for some amount of time. (TO BE DEFINED - who decides
it?)</t>
+ <t>5. LS1 waits for some time (see <xref target="rotation-time-computing" />)</t>
<t>6. LS1 sends token to next LS (LS2) from the LSRing lower scope.
- If it fails, mark the not-responding peer as "not active" and
try next one. (TO BE DISCUSSED whether "not active" is just boolean or number
of fails - after 3 failures the url will be removed from LSRing)
+ If it fails, mark the not-responding peer as "not active" and try
+ the next one.
</t>
</list>
</t>
+ <t>
+ Each node in the local group is responsible for monitoring the state
+ of the token. If the internal timer (see
+ <xref target="rotation-time-computing" />) expires without seeing a token,
+ a new token should be generated. If a token is seen too soon (see
+ <xref target="rotation-time-computing" />) it should be dropped. This
+ will ensure that too many tokens do not enter into the ring at a given
+ time.
+ </t>
- <t>MG: open issues:</t>
- <t>- how to determine and remove duplicate tokens?</t>
- <t>- when to re-send token (I guess when computed token rotation time
passes)</t>
- <t>- after leader election how the node can know who is the leader
and which tokens accept or reject (if there are tokens sent by old leader and
new leader)</t>
-
-
</section>
<section anchor="rotation-time-computing" title="Token rotation time
computing">
<t>The token rotation time is the time of passing and serving token
by all
- nodes in the LS ring. This time should be computed by the leader
basing on
+ nodes in the LS ring. This time should be computed by all nodes based on
some knowledge about the time each node takes to serve the token.
The time may be based on the times saved in the token message by all nodes.
- Initially, this will be very simple and will be conputed as "2
minutes plus 5 seconds times
- the number of nodes in the ring."
+ Initially, this will be very simple and will be computed as "2 minutes
+ plus 5 seconds times the number of nodes in the ring."
</t>
<t>
The key is that after the timeout has been exceeded, it can be inferred that
- the leader has failed and another election should be initiated.
+ the leader has failed and another election should be initiated.
+ Conversely, if a token is seen too early (less than half the calculated
+ time), the token should be dropped.
</t>
</section>
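The initial rotation-time rule and the "too early" check described in this section can be written out as a small sketch. The constant and function names are hypothetical, and reading the formula as 120 seconds plus 5 seconds per node is an assumption about the quoted wording.

```python
BASE_SECONDS = 120     # "2 minutes"
PER_NODE_SECONDS = 5   # "5 seconds times the number of nodes in the ring"


def rotation_time(ring_size: int) -> int:
    """Expected token rotation time in seconds for a ring of ring_size nodes."""
    return BASE_SECONDS + PER_NODE_SECONDS * ring_size


def token_too_early(elapsed: float, ring_size: int) -> bool:
    """True if a token arrives in less than half the calculated time (drop it)."""
    return elapsed < rotation_time(ring_size) / 2


def token_overdue(elapsed: float, ring_size: int) -> bool:
    """True if the calculated time has passed without a token being seen."""
    return elapsed > rotation_time(ring_size)
```

For a three-node ring the expected rotation time is 135 seconds, so a token seen after 60 seconds would be dropped, while 136 seconds of silence would count as a timeout.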
@@ -598,8 +616,6 @@
<!-- Summarization Section -->
<section anchor="summary" title="Summarization">
<t>
- The LS that a service contacts to register becomes the "Home LS"
- (HLS, see <xref target="glossary" />) of that particular service.
It is the responsibility of the HLS to make summary data about
all of the pS services it knows of available to the larger
enterprise and to draw relevant queries to itself.
@@ -824,7 +840,7 @@
tree. Using the optimization mentioned above regarding strings
longer than 1 character or
bit, the solution to the maximum dominator problem is trivial -- it
is simply the first
node below the root. The second type of summarization is to
determine "K-dominators".
- Essesntially, for a given target K, we produce the most appropriate
summarizing nodes.
+ Essentially, for a given target K, we produce the most appropriate summarizing nodes.
While this problem is NP-complete, we can construct an approximation
heuristic that
simply considers the length of the strings in the internal (or
structural) nodes of
the tree. We leave for future work the problem of "Min cost
dominators", in which the
@@ -954,35 +970,7 @@
</section>
-<!-- Bootstrapping Section -->
-
- <section anchor="bootstrapping" title="Bootstrapping">
- <t>A distributed information system such as the LS needs to address
- bootstrapping. In this system, an LS instance needs to find other
- members of its scope (for each scope in which it participates.) To
- accomplish this we will use a similar solution to what DNS uses
- (root.hints).
- </t>
- <t>We will maintain a service that maintains a list of currently known
- LS instances. These known instances should preferably be at the
upper
- scope. All clients can cache this list. The service will be accessed
- via a well-known hostname, and could be requested via UDP messages.
- (We can also use TCP here for some sorts of anycast.)
- </t>
- <t>Initially this will be deployed on one server. We can extend this to
- handle redundancy and load balancing in the future by using multiple
- DNS records and implementing ANYCAST with routing tricks for this
well
- known hostname. (Additionally, we can distribute an initial file
with
- a list of well known LS instances that are supported by the primary
- perfSONAR participants.)
- </t>
- <t>The above discovery algorithm is used to find an LS within a given
- scope. Therefore, the only piece of information an LS should need to
- be pre-configured with is the scope it belongs to. And as stated
above,
- that can be assumed to be "global:organization-dns-name". Note: Need
- to define the specific syntax above.
- </t>
- </section>
+
<!-- Examples Section -->
<section anchor="structures-and-messages" title="Structures and Messages">
<section anchor="service-metadata" title="Service metadata example">
@@ -1049,8 +1037,8 @@
<t>
The response message should indicate success or failure via
the eventType, and will contain metadata/data pair(s). The
metadata
- should indicate who the service is, and its "size" for voting
- purposes. The data section is an enumeration of all of the
current
+ should indicate who the service is. The data section is an
+ enumeration of all of the current
members of the ring and their votes. This information gives the
new member a snapshot of the ring.
</t>
@@ -1109,14 +1097,13 @@
This message exchange represents when an LS instance is holding
the token and sharing summary information (lower scope). The
message
consists of metadata/data pair(s) that contain service information
- and a parameter indicating "size" of the data set the LS
- manages as well as the minimal (without parameters) summary.
+ as well as the minimal (without parameters) summary.
</t>
<t>
The response message should indicate success or failure via
the eventType, and will contain metadata/data pair(s). The
metadata
- should indicate who the service is, and its "size" for leader
voting
- purposes. The data section is message that can be used for
logging.
+ should indicate who the service is. The data section is a message
+ that can be used for logging.
</t>
<t>
When receiving the message, check your 'lower' list and update it
as
@@ -1151,15 +1138,14 @@
<t>
This message exchange represents when an LS instance is holding
the token and sharing summary information. The message
- consists of metadata/data pair(s) that contain service information
- and a parameter indicating "size" of the data set the LS
- manages. The "data" portion is the summary info (FORMAT TBD!!!)
+ consists of metadata/data pair(s) that contain service information.
+ The "data" portion is the summary info.
</t>
<t>
The response message should indicate success or failure via
the eventType, and will contain metadata/data pair(s). The
metadata
- should indicate who the service is, and its "size" for leader
voting
- purposes. The data section is message that can be used for
logging.
+ should indicate who the service is. The data section is a message
+ that can be used for logging.
</t>
<t>
When receiving the message, check your 'lower' list and update it
as
@@ -1223,8 +1209,7 @@
</section>
</section>
<section anchor="LSControl-Discovery" title="LS Discovery Message">
- <t>Structure of the LSDiscovery Message used to locate info-sets.
- (FORMAT TBD!!!)</t>
+ <t>Structure of the LSDiscovery Message used to locate info-sets. </t>
<section anchor="LSDiscoveryRequest" title="Request">
<t>
<artwork>
@@ -1245,14 +1230,7 @@
</section>
</section>
</section>
-<!-- Result codes section -->
- <section anchor="codes" title="Result codes">
- <list style="symbols">
- <t>error.ls.foo - </t>
- <t>success.ls.foo - </t>
- <t>TBD</t>
- </list>
- </section>
+
<section anchor="apdx" title="Appendices">
<section anchor="glossary" title="Glossary">
<list style="symbols">