perfsonar-dev - Re: SA6 monitoring
Subject: perfsonar development work
- From: Nicolas Simar
- Cc: Guilherme Fernandes, Roman Lapacz, Maciej Glowiak, WiN-Labor, Roberto Sabatino
- Subject: Re: SA6 monitoring
- Date: Tue, 15 Jan 2008 10:38:49 +0000
Hi Andras,
more general questions:
- I am working under the assumption that SA6 does the development work. Can you please confirm it is your understanding too.
- The support provided by JRA1 must be clarified.
- What is your timescale and resources?
Cheers,
Nicolas
Andras Kovacs wrote:
Hi Nicolas,
Alright, meet you there. Let's go through your 11 questions, as they will make everything clear.
Thanks.
Andras
Nicolas Simar wrote:
Hi,
the call will be held tomorrow, Wednesday from 15:00 till 16:00.
We will use the Hungarnet MCU,
* GDS numbering plan JRA1: 0036100309901
* Phone access: +36 1 450 3099 then type the GDS number 0036100309901
The objective is to get a better understanding of the SA6 monitoring plans and to lay down a series of questions and next steps that allow the activity to go forward.
Best regards,
Nicolas
Andras,
from what you have described before, I have identified a series of elements that are required. The list of points below should enable us to move forward in investigating the solution. I am afraid that at this stage the information is too thin (but that's normal, we are starting ;-) ). Note that the specifications must be written independently of the technology (no mention of perfSONAR to start with).
The Voice measurement tools:
----------------------------
- A server that listens for calls and registers with gatekeepers.
- A client that sends calls to servers.
- Test the load on the machines: CPU, memory, and number of packets sent.
On-demand tests
---------------
1. Create an application or a CLI that triggers on-demand tests from a client to a server. User authentication required. Define test frequency.
2. Extend an MP to wrap the client tools that trigger on-demand tests. Authorization required to have a test triggered. Identify security threats.
3. Pass the information back to the client.
4. Optional: Create a perfSONAR schema (i) to trigger on-demand tests, (ii) to push the information to a DB, (iii) to request and get the data from an MA. The information exchanged is status (up/down) and error information (?).
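As a rough illustration of points 2 and 3, the sketch below wraps a command-line client and reduces its outcome to the status (up/down) and error information mentioned in point 4. The function name `run_test` and the exit-code convention are assumptions: the thread has not yet settled on a client tool, and a real client may need its output parsed instead.

```python
import subprocess
import sys


def run_test(command, timeout=30):
    """Run a test command and reduce the outcome to (status, error).

    Assumption: the client exits with 0 when the call was set up and
    with a non-zero code otherwise.
    """
    try:
        result = subprocess.run(command, capture_output=True, text=True,
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        return "down", "timed out after %d s" % timeout
    except OSError as exc:
        # The client binary could not be started at all.
        return "down", str(exc)
    if result.returncode == 0:
        return "up", ""
    return "down", result.stderr.strip() or "exit code %d" % result.returncode


# Stand-in for a real client: a Python one-liner that exits successfully.
print(run_test([sys.executable, "-c", "import sys; sys.exit(0)"]))
```

Passing the command as a list avoids shell interpretation of user-supplied arguments, which matters if the test can be triggered remotely.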
Regularly scheduled tests and display
--------------------------------------
5. Create a GUI.
What type? How to access it? What to show?
6. Create a scheduler to schedule regular tests for the clients.
7. Pass the data back to the DB. Creation of the DB table and internals.
8. Optional: Wrap the DB into a Measurement Archive web-service.
Operational Model
----------------
Derived from the service model.
9. Define the operational model: software support, deployed tool support.
10. Define the service reports
11. Monitor the monitoring infrastructure.
Andras Kovacs wrote:
Hi Nicolas,
Thanks a lot for your answer. Here is mine, see inline:
a) What test tool do you plan to use to monitor H.323 and SIP connectivity?
At this moment we have two candidates:
1. Asterisk
2. A small test tool built on Opal VoIP libraries.
Asterisk could be a better choice since it is able to accept more calls at the same time (we have to think of concurrent measurements).
b) Can you describe how the test tool works:
- how it is triggered (by a user)
- how do the tests work?
- how is a test triggered? (is there a need to synchronise both sides of the connection, or can a test be launched from one side?)
In every country we need a test tool to be registered (just like you would do it with a normal H.323 or SIP client) to the particular NREN's infrastructure (gatekeeper, proxy). The tool has a server side part and a client side part. The server should listen at a given port to receive/terminate clients' calls.
A measurement between two countries should be triggered automatically, let's say once every hour. However, we would like to have on-demand measurement possibilities for network admins (a one-hour resolution for the connectivity tests is just not enough for debugging).
- how many packets are exchanged during a test
It is difficult to say exactly, but not many, as only connectivity/signalling (is a specific country reachable from my country?) is checked. We do not want to test the bandwidth available for video calls.
In the very first phase, only SA6 countries will join this connectivity measurement, which will require us to observe max. 15 countries. In practice, it would mean that each MP should build up 14-15x2 (for both H.323 and SIP separately) connections to all the servers placed in the involved countries. This would require 30 calls per hour from an MP. If you project this to the whole system, ~15x30 calls should be done every hour. One call setup requires 5-10 packets as a maximum.
Later, when we have a stable/converged measurement method, we could open this up for other countries. At this moment, GDS has approx. 30 member countries.
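The figures above can be reproduced with a short calculation. This is only a sketch: the function name `call_load` and its parameters are made up for illustration, and `targets=15` reproduces the rounded-up figure quoted above, whereas a strict full mesh would use one target fewer than the number of countries.

```python
def call_load(countries, protocols=2, packets_per_call=(5, 10), targets=None):
    """Estimate hourly call and packet counts for a full-mesh connectivity test."""
    # By default each MP calls every other country; pass targets explicitly
    # to reproduce a rounded figure such as the 15 used in the text.
    if targets is None:
        targets = countries - 1
    calls_per_mp = targets * protocols
    total_calls = countries * calls_per_mp
    low, high = packets_per_call
    return {
        "calls_per_mp": calls_per_mp,
        "total_calls": total_calls,
        "packets_min": total_calls * low,
        "packets_max": total_calls * high,
    }


print(call_load(15, targets=15))  # 30 calls per MP, 450 calls in total
print(call_load(15))              # strict full mesh: 28 per MP, 420 in total
print(call_load(30))              # all ~30 GDS countries: 58 per MP, 1740 in total
```

Even with all 30 GDS member countries, the estimate stays below 17,400 packets per hour for the whole system, under the 5-10 packets per call assumed above.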
A diagram similar to : http://e2epi.internet2.edu/owamp/details.html or http://e2epi.internet2.edu/bwctl/architecture.html would be helpful.
Yes, we would like to see something similar. The Australian NREN has a VoIP monitoring facility, you can check it here:
http://lattice.act.aarnet.net.au/VoIPMonitor/
Although, we would like to have each cell clickable with a connectivity history graph (coming from the RRD).
- The CLI MP is developed by Guilherme Fernandes from RNP.
It seems at this stage a reasonable choice (another possible choice might have been the BWCTL MP). We will be able to say more once we have a better understanding of how the tool you plan to use works.
Yes, we will provide you some of the CLI output of this tool quite soon.
- I would like to understand a little better the operational model you have in mind for those tools.
It depends on where we could deploy those MPs and related components. Are you thinking of the maintainers of these tools?
a) The MP gives you the ability to trigger on-demand tests.
Yes, we would like to make use of this functionality.
b) Who are the consumers of the data? (please feel free to redirect me to any of your internal documents :-) )
Video-conferencing administrators from the NRENs, from a university, the video-conferencing users?
It is very simple: the country matrix would be open for all, on demand test would be available only for a restricted number of admins.
c) How will they access the information? (what visualisation tool, you hinted at one below) How will the data be represented to them?
You mention a clickable weathermap. Can you provide a little bit more information about this.
I don't know if you have complete code for building such a matrix, but if not, we have to do the coding. It would look like the following:
    CH  CZ  GR  HU  UK  ...
CH  -   OK  OK  X   OK
CZ  OK  -   OK  X   OK
GR  OK  OK  -   OK  OK
HU  X   X   OK  -   OK
UK  OK  OK  OK  OK  -
...
You could, for example, click on each cell to see the connectivity history. We would like to have this matrix for both protocols; maybe both could be put in the same matrix.
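A minimal sketch of how such a matrix could be rendered as plain text from stored up/down results is given below. The function `render_matrix`, the `(source, destination)` dictionary layout and the `?` marker for missing measurements are assumptions for illustration, not part of any existing perfSONAR tool.

```python
def render_matrix(countries, status):
    """Render a plain-text status matrix.

    status maps (source, destination) to True (up) or False (down);
    pairs with no measurement are shown as '?'.
    """
    width = 4
    header = " " * width + "".join(c.ljust(width) for c in countries)
    lines = [header.rstrip()]
    for src in countries:
        cells = []
        for dst in countries:
            if src == dst:
                cell = "-"
            elif (src, dst) not in status:
                cell = "?"
            else:
                cell = "OK" if status[(src, dst)] else "X"
            cells.append(cell.ljust(width))
        lines.append((src.ljust(width) + "".join(cells)).rstrip())
    return "\n".join(lines)


countries = ["CH", "CZ", "GR", "HU", "UK"]
down = {("CH", "HU"), ("CZ", "HU"), ("HU", "CH"), ("HU", "CZ")}
status = {(s, d): (s, d) not in down
          for s in countries for d in countries if s != d}
print(render_matrix(countries, status))
```

A web version would emit one table cell per pair instead, with each cell linking to the history graph for that pair.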
(please see a status page: http://perfsonar.acad.bg/status/ and how we might be building another similar page for the network services at a later stage: http://wiki.geant2.net/bin/view/SA3/SaThreeAppNrenStatusPage)
I have seen the first one earlier; we would like to achieve nearly the same (in principle). A slightly modified version would do the job. On the page http://wiki.geant2.net/bin/view/SA3/SaThreeAppNrenStatusPage the picture is missing.
d) Who will be triggering the tests?
Video-conferencing administrators from the NRENs, any video-conferencing users, an automated scheduler (for regular testing, the scheduler sends at regular interval on-demand tests to the MP)?
Automated tests are what we prefer (described above). On-demand testing is also needed, but it would be enough to implement it later if it requires a lot of work.
e) How many of those tools do you foresee deployed within each network?
We need the following:
1. A server app that could terminate the calls (Asterisk is able to do it with both H.323 and SIP protocols).
2. A client app/script able to initiate calls and to give sufficient CLI output (at this moment I don't know if Asterisk could do this).
3. The above will require a set of libraries to be installed. Once we have the final set of tools, we can figure out these libraries.
f) Where do you need to locate them? (anywhere in the network, next to MCU or gatekeepers?)
As long as they are deployed somewhere in the NREN network, the location does not matter, since the tools must be registered with proxies, gatekeepers, etc. The gatekeeper/proxy will do the IP<->number resolution and provide the right IP addresses. In principle, the location of the gatekeeper/proxy at which the tool is registered is the only thing that counts.
The bottom-line questions are
- what do you want to observe?
Up/down status between countries. GDS (H.323) and nrenum.net (SIP) route number-based calls between countries; if you like, this allows the countries to have VoIP/VC peering.
- what do you want to capture as change of behavior?
(I am clueless about video-conferencing monitoring)
Change in up/down status.
However, it is possible that we can measure the call setup time very easily. We haven't yet decided whether we would like to do this or not.
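Capturing a change in up/down status amounts to comparing each result with the previous one for the same pair of countries. A small sketch, assuming the results are available as time-ordered `(timestamp, is_up)` pairs (the function name and data layout are hypothetical):

```python
def status_changes(samples):
    """Return (timestamp, old, new) for each point where the status flips.

    samples is an iterable of (timestamp, is_up) pairs in time order.
    """
    changes = []
    previous = None
    for timestamp, is_up in samples:
        if previous is not None and is_up != previous:
            changes.append((timestamp, previous, is_up))
        previous = is_up
    return changes


hourly = [(0, True), (1, True), (2, False), (3, False), (4, True)]
print(status_changes(hourly))  # [(2, True, False), (4, False, True)]
```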
g) Do you foresee having a single MA to which all the MPs send the data for all the European networks? Or do you foresee that each network/country will have one MA to which all the national MPs will send the data? (or any other combination).
This is not yet decided. What is your suggestion based on the above?
The problem is that we don't have any VC/VoIP measurements of this type, so there are no NREN RRDs deployed for this job. Based on this, it might make sense to have a central RRD to store all the data. However, as far as I understand, perfSONAR allows a mixed solution as well? Is this right?
i) Can you specify what information you will be pushing to the RRD MA? (up/down or other information?)
up/down
j) The RRD MA and the SQL MA have write interfaces that allow an MP to push information to an MA.
OK.
k) An alternative to the RRD MA is the SQL MA. (an SQL MA is more appropriate to store status information than an RRD).
OK. But how about the long term? Isn't it painful for an SQL DB to store all the status data for years back (even with lots of countries)? Will we be able to generate a historical view of the connectivity when it is stored in an SQL DB?
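To put a number on the long-term storage question, the row count can be estimated from the call volumes quoted earlier in the thread. This is only arithmetic, assuming one stored row per test call; the function name is made up for illustration.

```python
def rows_per_year(results_per_hour):
    """Number of stored results per year at one row per test call."""
    return results_per_hour * 24 * 365


print(rows_per_year(450))   # ~15x30 calls per hour -> 3942000 rows per year
print(rows_per_year(1740))  # 30 countries, full mesh, 2 protocols -> 15242400
```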
l) How frequently will these tests be done?
At this moment, we think that one full measurement per hour would do the job. From point of a single MP, this would require initiation of a call to every other country.
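One way to organise a full hourly measurement from a single MP is to spread the calls evenly over the hour rather than sending them all at once. The even spacing is an assumption made for this sketch, not something decided in the thread, and `hourly_schedule` is a hypothetical name.

```python
def hourly_schedule(source, countries, protocols=("H.323", "SIP"), period=3600):
    """Spread one call per (destination, protocol) evenly across the period.

    Returns a list of (offset_seconds, destination, protocol) tuples.
    """
    targets = [(dst, proto)
               for dst in countries if dst != source
               for proto in protocols]
    step = period / len(targets) if targets else 0
    return [(round(i * step), dst, proto)
            for i, (dst, proto) in enumerate(targets)]


for entry in hourly_schedule("HU", ["CH", "CZ", "HU"]):
    print(entry)
```

With 15 countries and two protocols this gives 28 calls per MP, roughly one every two minutes.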
Once we have a better understanding of how it works and the direction we will take, I foresee the following steps:
- extend the NM-WG schema
In what way is it needed?
- integrate the measurement tool within the MP
- extend the MA (write interface, read interface and DB schema)
OK.
Any visualisation tool that can query the MA to request your data would be able to display that information. Once you have the building blocks (MA, MP), it becomes easy to build a visualisation tool making use of that information.
Glad to hear this.
The first step is to get an understanding of what the measurement tool does and how frequently it will be called. We need to investigate this with Erlangen.
OK. Is the above information enough for you? We could have a joint VC, if you think this is needed.
We have several types of PC that we used for the MDM: one for the OWD and BWCTL measurements and another one that can conveniently be used to run those measurements. But it depends on where you need to have the probe.
As mentioned, the location does not really matter.
"Conveniently packaged" and "easy to deploy" are terms that need to be defined ;-)
OK, I thought of some DEB or RPM packaging.
- The actual status can be extracted from the RRD (last measured data).
You can do that (RRD or SQL see previous comment)
What do you think, is it a good approach?
What other approach have you got in mind?
There are two options: RRD or SQL. We have nothing else in mind.
Should we log into a normal SQL DB as well, in order to allow a better way to extract the last measured data? This could be used as a long term measurement log (RRD cannot be used for this).
I am not sure I understand the question.
You already provided an answer by writing that SQL is more suitable for storing status information. RRD does some resampling of historical data, and your MA might not be able to extract the last measured data.
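A minimal sketch of the SQL option, using Python's built-in `sqlite3` module as a stand-in for whatever database the SQL MA would actually use. The table and column names are hypothetical; the point is only that both the last measured value and an unresampled history are simple queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE status (
        measured_at INTEGER NOT NULL,  -- Unix time of the test
        source      TEXT NOT NULL,
        destination TEXT NOT NULL,
        protocol    TEXT NOT NULL,     -- 'H.323' or 'SIP'
        is_up       INTEGER NOT NULL   -- 1 = up, 0 = down
    )
    """
)


def record(measured_at, source, destination, protocol, is_up):
    """Store one test result."""
    conn.execute("INSERT INTO status VALUES (?, ?, ?, ?, ?)",
                 (measured_at, source, destination, protocol, int(is_up)))


def latest(source, destination, protocol):
    """Return (measured_at, is_up) of the most recent test, or None."""
    return conn.execute(
        "SELECT measured_at, is_up FROM status "
        "WHERE source = ? AND destination = ? AND protocol = ? "
        "ORDER BY measured_at DESC LIMIT 1",
        (source, destination, protocol)).fetchone()


def history(source, destination, protocol, since):
    """Return all (measured_at, is_up) rows from `since` onwards, oldest first."""
    return conn.execute(
        "SELECT measured_at, is_up FROM status "
        "WHERE source = ? AND destination = ? AND protocol = ? "
        "AND measured_at >= ? ORDER BY measured_at",
        (source, destination, protocol, since)).fetchall()


record(0, "HU", "CH", "SIP", True)
record(3600, "HU", "CH", "SIP", False)
print(latest("HU", "CH", "SIP"))      # (3600, 0)
print(history("HU", "CH", "SIP", 0))  # [(0, 1), (3600, 0)]
```

An index on (source, destination, protocol, measured_at) would likely be needed once the table holds several years of data.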
- A historical yes/no graph will be shown when the user clicks to a cell. This will come from the RRD directly.
I am confused (it's getting late). You are still mentioning the same clickable map?
Yes, of course. It might come from an SQL DB as well.
Yes it can be used to DoS (depending on how the tool works).
We can also wait until we see a first DoS before going down that road, if you need to leave it open to anybody (it depends on who your user base is). At this stage, the most important thing is to build a tool that fits the purpose. Security can be added once the basic goal has been demonstrated.
(note that I don't know if the CL MP uses at this stage authentication)
OK. We want to have the country matrix open, and let's follow this road. If something nasty occurs, it could also help you to make the architecture more stable, to identify DoS attacks, etc.
Yes, definitely, I am suggesting the 10th of January, at 15:00 CET? (or earlier).
I'll be travelling on the 9th and 10th of January. How about Friday (11th) or Tuesday (8th, 15:00 CET)?
I would suggest to involve:
- Guilherme: CL MP developer
- Roman: RRD MA and SQL MA developer
- Maciej: involved in JRA1 and SA6
- Erlangen: have experience with MP (BWCTL)
Fortunately, we have Maciej and PSNC in the SA6 group.
Thanks a lot.
Andras
--
Nicolas
______________________________________________________________________
Nicolas Simar
Network Engineer
DANTE - www.dante.net
Tel - BE: +32 (0) 4 366 93 49
Tel - UK: +44 (0)1223 371 300
Mobile: +44 (0) 7740 176 883
City House, 126-130 Hills Road
Cambridge CB2 1PQ
UK
_____________________________________________________________________