perfsonar-dev - Re: [pS-dev] Re: SA6 monitoring

Subject: perfsonar development work

From:
To: Fausto Vetter <>
Cc: Nicolas Simar <>, , Guilherme Fernandes <>, Roman Lapacz <>, Maciej Glowiak <>, WiN-Labor <>, , "" <>
Subject: Re: [pS-dev] Re: SA6 monitoring
Date: Sat, 12 Jan 2008 03:24:05 -0200

Hi,

I have been following the discussion about VoIP availability. In my view, there are ways to automate this process. I know of a command-line SIP client, linphonec. It runs on Linux and can place calls. With this tool it is possible to build an MP that initiates calls and terminates them. Of course, two instances are needed to act as clients, one calling the other. Each client would be registered with its own SIP registrar, and the two would call each other. To make this information available, an MA would be needed. The visualization tool could be Cacti, which already has something implemented with a matrix. Of course, it would need to be adapted to this scope. One interesting thing about cactiSONAR is that it already correlates perfSONAR measurements to form MOS data (VoIP quality). That would be a plus here. Of course, what I describe here is a solution for the SIP protocol only.

It is worth remembering that, with this MP and MA, a third tool will be needed to synchronize the calls. It is just an idea; we can think it through further.
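
As one possible way to picture the synchronization step, here is a minimal sketch of a round-robin (circle method) schedule in Python. It only decides which two clients talk in which time slot, so that no client is in two calls at once; it does not place any calls. The function name `round_robin_slots` and the country codes are hypothetical and used for illustration only.

```python
def round_robin_slots(countries):
    """Circle-method schedule: in each slot every country appears at most once."""
    teams = list(countries)
    if len(teams) % 2:
        teams.append(None)  # placeholder so an odd number of countries still pairs up
    n = len(teams)
    slots = []
    for _ in range(n - 1):
        pairs = []
        for i in range(n // 2):
            a, b = teams[i], teams[n - 1 - i]
            if a is not None and b is not None:
                pairs.append((a, b))
        slots.append(pairs)
        # Keep the first entry fixed and rotate the others by one position.
        teams = [teams[0]] + [teams[-1]] + teams[1:-1]
    return slots


# Hypothetical example: five clients, one per country.
for number, pairs in enumerate(round_robin_slots(["CH", "CZ", "GR", "HU", "UK"])):
    print(number, pairs)
```

Within a slot, each pair could test both directions (A calls B, then B calls A) and both protocols before the next slot starts.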

I hope this information is helpful,
Murilo

Quoting Fausto Vetter:

Hi,

I have followed the discussion, and I presented a tool I built during the
last Google Summer of Code. It retrieves data from perfSONAR MPs (for now
only the tools available through the CL-MP: Ping, OWAMP and BWCTL; it is
ready to integrate other perfSONAR services and tools) and stores the data
in RRD files, using Cacti as its base infrastructure. I think it could
feasibly be used to monitor VoIP for SA6.

More details on the tool:
https://wiki.internet2.edu/confluence/display/GSoC/perfSONAR

or on perfsonar wiki:
http://wiki.perfsonar.net/jra1-wiki/index.php/CACTISonar-Pre-Release-Alfa
(old version though)

I hope this is useful,
Fausto

Quoting Nicolas Simar:

Hi,

I am setting up a Doodle poll to find a time and date (I can't make it
this Friday, I am traveling).

We will also have to come up with a list of points for discussions (I
guess they'll follow what's written below here).

http://www.doodle.ch/htimxk5qscam744b

- Guilherme: CL MP developer
- Roman: RRD MA and SQL MA developers
- Maciej: involved in JRA1 and SA6
- Erlangen: have experience with MP (BWCTL) and for the installation of
the service on their machines
- Andras: SA6
- Nicolas: JRA1

Cheers,
Nicolas,

Andras Kovacs wrote:

Hi Nicolas,

Thanks a lot for your answer. Here is mine, see inline:

a) What test tool do you plan to use to monitor H.323 and SIP connectivity?

At this moment we have two candidates:
1. Asterisk
2. A small test tool built on Opal VoIP libraries.

Asterisk could be a better choice since it is able to accept more calls at the same time (we have to think of concurrent measurements).

b) Can you describe how the test tool works:
- how is it triggered (by a user)?
- how do the tests work?
- how is a test triggered? (is there a need to synchronise both sides of the connection, or can a test be launched from one side?)

In every country we need a test tool to be registered (just like you would do it with a normal H.323 or SIP client) to the particular NREN's infrastructure (gatekeeper, proxy). The tool has a server side part and a client side part. The server should listen at a given port to receive/terminate clients' calls.

A measurement between two countries should be triggered automatically, let's say once every hour. However, we would also like to have on-demand measurement possibilities for network admins (a one-hour resolution for the connectivity tests is just not enough for debugging).

- how many packets are exchanged during a test

It is difficult to say exactly, but not many, as only connectivity/signalling (is a specific country reachable from my country?) is checked. We do not want to test the bandwidth available for video calls.

In the very first phase, only SA6 countries will join this connectivity measurement, which will require us to observe max. 15 countries. In practice, this means that each MP should set up 14-15x2 connections (H.323 and SIP separately) to the servers placed in the countries involved. This would require about 30 calls per hour from an MP. Projected onto the whole system, ~15x30 calls would be made every hour. One call setup requires 5-10 packets at most.
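
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `call_load` is hypothetical, and it assumes every MP calls every other country once per hour for each protocol, with 5-10 packets per call setup as stated above. With exactly 15 countries the per-MP figure comes out as 28 calls per hour, which the text rounds up to 30.

```python
def call_load(countries, protocols=2, packets_per_call=(5, 10)):
    """Hourly call and packet counts for a full mesh of MPs (one MP per country)."""
    peers = countries - 1                   # an MP does not call itself
    calls_per_mp = peers * protocols        # H.323 and SIP tested separately
    total_calls = countries * calls_per_mp  # every MP calls every other country
    low, high = packets_per_call
    return {
        "calls_per_mp_per_hour": calls_per_mp,
        "total_calls_per_hour": total_calls,
        "packets_per_hour": (total_calls * low, total_calls * high),
    }


print(call_load(15))  # first phase: SA6 countries only
print(call_load(30))  # later: roughly the current GDS membership
```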

Later, when we have a stable/converged measurement method, we could open this up for other countries. At this moment, GDS has approx. 30 member countries.

A diagram similar to : http://e2epi.internet2.edu/owamp/details.html or http://e2epi.internet2.edu/bwctl/architecture.html would be helpful.

Yes, we would like to see something similar. The Australian NREN has a VoIP monitoring facility, you can check it here:

http://lattice.act.aarnet.net.au/VoIPMonitor/

Although, we would like to have each cell clickable with a connectivity history graph (coming from the RRD).

- The CLI MP is developed by Guilherme Fernandes from RNP.
At this stage it seems a reasonable choice (another possible choice might have been the BWCTL MP). We will be able to say more once we have a better understanding of how the tool you plan to use works.

Yes, we will provide you some of the CLI output of this tool quite soon.

- I would wish to understand a little bit better the operational model you have in mind for those tools.

It depends on where we could deploy those MPs and related components. Are you thinking of the maintainers of these tools?

a) The MP gives you the ability to trigger on-demand tests.

Yes, we would like to make use of this functionality.

b) Who are the consumers of the data? (please feel free to redirect me to any of your internal documents :-) )
Video-conferencing administrators from the NRENs, from a university, the video-conferencing users?

It is very simple: the country matrix would be open to all, while on-demand tests would be available only to a restricted number of admins.

c) How will they access the information? (what visualisation tool? you hinted at one below) How will the data be represented to them?
You mention a clickable weathermap. Can you provide a little bit more information about this.

I don't know if you have complete code for building such a matrix, but if not, we will have to do the coding. It would look like the following:

     CH   CZ   GR   HU   UK   ...
CH   -    OK   OK   X    OK
CZ   OK   -    OK   X    OK
GR   OK   OK   -    OK   OK
HU   X    X    OK   -    OK
UK   OK   OK   OK   OK   -
...

You could e.g. click on each cell to see its connectivity history. We would like to have this matrix for both protocols; maybe both could be put in the same matrix.
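
To make the idea concrete, here is a minimal sketch of how such a matrix could be rendered as text from a set of up/down results. The function name `render_matrix` and the `(caller, callee) -> bool` layout of `status` are assumptions for illustration; a real page would build HTML cells with links to the history graphs instead.

```python
def render_matrix(countries, status):
    """Render an OK/X matrix; status maps (caller, callee) to True (OK) or False (X)."""
    width = 5
    header = "".ljust(width) + "".join(c.ljust(width) for c in countries)
    lines = [header.rstrip()]
    for src in countries:
        cells = []
        for dst in countries:
            if src == dst:
                cell = "-"          # a country does not call itself
            elif (src, dst) not in status:
                cell = "?"          # no measurement available yet
            else:
                cell = "OK" if status[(src, dst)] else "X"
            cells.append(cell.ljust(width))
        lines.append((src.ljust(width) + "".join(cells)).rstrip())
    return "\n".join(lines)


# Example data taken from the matrix above.
countries = ["CH", "CZ", "GR", "HU", "UK"]
rows = {
    "CH": "- OK OK X OK",
    "CZ": "OK - OK X OK",
    "GR": "OK OK - OK OK",
    "HU": "X X OK - OK",
    "UK": "OK OK OK OK -",
}
status = {}
for src, row in rows.items():
    for dst, cell in zip(countries, row.split()):
        if cell != "-":
            status[(src, dst)] = (cell == "OK")
print(render_matrix(countries, status))
```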

(please see a status page: http://perfsonar.acad.bg/status/ and how we might be building another similar page for the network services at a later stage: http://wiki.geant2.net/bin/view/SA3/SaThreeAppNrenStatusPage)

I have seen the first one earlier; we would like to achieve nearly the same (in principle). A slightly modified version would do the job. On the page http://wiki.geant2.net/bin/view/SA3/SaThreeAppNrenStatusPage the picture is missing.

d) Who will be triggering the tests?
Video-conferencing administrators from the NRENs, any video-conferencing users, an automated scheduler (for regular testing, the scheduler sends on-demand tests to the MP at regular intervals)?

Automated tests are what we prefer (described above). On-demand testing is also needed, but it would be enough to implement it later if it requires a lot of work.

e) How many of those tools do you foresee deployed within each network?

We need the following:
1. A server app that could terminate the calls (Asterisk is able to do it with both H.323 and SIP protocols).
2. A client app/script that is able to initiate calls and to give sufficient CLI output (at this moment I don't know if Asterisk could do this).
3. The above will require a set of libraries to be installed. Once we have the final set of tools, we can figure out these libraries.

f) Where do you need to locate them? (anywhere in the network, next to MCU or gatekeepers?)

Once they are deployed somewhere in the NREN network, it does not matter, since the tools must be registered with proxies, gatekeepers, etc. The gatekeeper/proxy will do the IP<->number resolution and provide the right IP addresses. In principle, the location of the gatekeeper/proxy at which the tool is registered is the only thing that counts.

The bottomline questions are
- what do you want to observe?

Up/down status between countries. It is GDS (H.323) and nrenum.net (SIP) that transfer number-based calls between countries; if you like, this allows the countries to have a VoIP/VC peering.

- what do you want to capture as change of behavior?
(I am clueless about video-conferencing monitoring)

Change in up/down status.

However, it is possible that we can measure the call setup time very easily. We haven't yet decided whether we would like to do this or not.
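
As an illustration of what "change in up/down status" could mean in practice, here is a small sketch that turns a series of hourly results into a list of transitions. The function name `status_changes` and the `(timestamp, is_up)` sample format are assumptions, not part of any existing MP or MA.

```python
def status_changes(samples):
    """Return (timestamp, 'up' | 'down') for each sample whose state differs from the previous one."""
    changes = []
    previous = None
    for timestamp, is_up in samples:
        if previous is not None and is_up != previous:
            changes.append((timestamp, "up" if is_up else "down"))
        previous = is_up
    return changes


# Hypothetical hourly results for one country pair (timestamps in seconds).
samples = [(0, True), (3600, True), (7200, False), (10800, False), (14400, True)]
print(status_changes(samples))
```

The first sample only sets the baseline, so a pair that is down from the very first measurement is not reported as a change.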

g) Do you foresee having a single MA to which all the MPs send the data for all the European networks? Or do you foresee that each network/country will have one MA to which all the national MPs will be sending the data? (or any other combination).

This is not yet decided. What is your suggestion, given the above?
The problem is that we don't have any VC/VoIP measurements of this type yet, so there are no NREN RRDs deployed for this job. Based on this, it might make sense to have a central RRD to store all the data. However, as far as I understand, perfSONAR allows a mixed solution as well. Is this right?

i) Can you specify what information you will be pushing to the RRD MA? (up/down or other information?)

up/down

j) The RRD MA and the SQL MA have write interfaces that allow an MP to push information to an MA.

OK.

k) An alternative to the RRD MA is the SQL MA. (an SQL MA is more appropriate to store status information than an RRD).

OK. But what about the long term? Isn't it painful for an SQL DB to store all the status data going back years (even with a lot of countries)? Will we be able to generate a historical view of the connectivity when it is stored in an SQL DB?
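
To give a feel for the SQL option, here is a minimal sketch using Python's built-in sqlite3 module. It is not the SQL MA and does not use its schema: the table layout and the helper names (`open_store`, `record`, `latest`, `history`) are assumptions made up for illustration. For scale, 15 countries with two protocols give 15 x 14 x 2 = 420 results per hour, i.e. roughly 3.7 million rows per year.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) status table and an index for per-pair lookups."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS status ("
        " ts INTEGER NOT NULL, src TEXT NOT NULL, dst TEXT NOT NULL,"
        " protocol TEXT NOT NULL, up INTEGER NOT NULL)"
    )
    db.execute(
        "CREATE INDEX IF NOT EXISTS idx_pair ON status (src, dst, protocol, ts)"
    )
    return db


def record(db, ts, src, dst, protocol, up):
    """Store one up/down result."""
    db.execute(
        "INSERT INTO status VALUES (?, ?, ?, ?, ?)",
        (ts, src, dst, protocol, int(up)),
    )


def latest(db, src, dst, protocol):
    """Last measured status for one cell of the matrix, or None if never measured."""
    row = db.execute(
        "SELECT up FROM status WHERE src = ? AND dst = ? AND protocol = ?"
        " ORDER BY ts DESC LIMIT 1",
        (src, dst, protocol),
    ).fetchone()
    return None if row is None else bool(row[0])


def history(db, src, dst, protocol, since):
    """All results for one cell from `since` onwards, oldest first."""
    rows = db.execute(
        "SELECT ts, up FROM status WHERE src = ? AND dst = ? AND protocol = ?"
        " AND ts >= ? ORDER BY ts",
        (src, dst, protocol, since),
    ).fetchall()
    return [(ts, bool(up)) for ts, up in rows]


db = open_store()
record(db, 0, "HU", "CH", "SIP", True)
record(db, 3600, "HU", "CH", "SIP", False)
print(latest(db, "HU", "CH", "SIP"))
print(history(db, "HU", "CH", "SIP", since=0))
```

The `history` query is what a yes/no graph for a clicked cell could be drawn from, and `latest` is what the matrix itself would show.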

l) How frequently will these tests be done?

At this moment, we think that one full measurement per hour would do the job. From the point of view of a single MP, this would require initiating a call to every other country.

Once we have a better understanding of how it works and the direction we will take, I foresee the following steps
- extend the NM-WG schema

In what way is it needed?

- integrate the measurement tool within the MP
- extend the MA (write interface, read interface and DB schema)

OK.

Any visualisation tool that can query the MA to request your data would be able to display that information. Once you have the building blocks (MA, MP), it becomes easy to build a visualisation tool that makes use of that information.

Glad to hear this.

The first step is to get an understanding of what the measurement tool does and how frequently it will be called. We need to investigate this with Erlangen.

OK. Is the above information enough for you? We could have a joint VC, if you think this is needed.

We have several types of PC that we used for the MDM: one for the OWD and BWCTL measurements and another one that can conveniently be used to run those measurements. But it depends on where you need to have the probe.

As mentioned, the location does not really matter.

"Conveniently packaged" and "easy to deploy" are terms that need to be defined ;-)

OK, I thought of some DEB or RPM packaging.

- The actual status can be extracted from the RRD (last measured data).

You can do that (RRD or SQL see previous comment)

What do you think, is it a good approach?

What other approach have you got in mind?

There are two options: RRD or SQL. We have nothing else in mind.

Should we log into a normal SQL DB as well, in order to allow a better way to extract the last measured data? This could be used as a long term measurement log (RRD cannot be used for this).

I am not sure I understand the question.

You already provided an answer by writing that SQL is more suitable for storing status information. RRD does some resampling of historical data, and your MA might not be able to extract the last measured data.

- A historical yes/no graph will be shown when the user clicks to a cell. This will come from the RRD directly.

I am confused (it's getting late). You are still mentioning the same clickable map?

Yes, of course. It might come from an SQL DB as well.

Yes it can be used to DoS (depending on how the tool works).
We can also wait until we have one DoS before going down that road if you need to leave it open to anybody (it depends on who your user base is). At this stage, the most important thing is to build a tool that fits the purpose. Security can be added once the basic goal has been demonstrated.

(note that I don't know if the CL MP uses authentication at this stage)

OK. We want to have the country matrix open, and let's follow this road. If something nasty occurs, it could also help you to make the architecture more stable, to identify DoS attacks, etc.

Yes, definitely, I am suggesting the 10th of January, at 15:00 CET? (or earlier).

I'll be travelling on the 9th and 10th of January. How about Friday (11th) or Tuesday (8th, 15:00 CET)?

I would suggest to involve:
- Guilherme: CL MP developer
- Roman: RRD MA and SQL MA developers
- Maciej: involved in JRA1 and SA6
- Erlangen: have experience with MP (BWCTL)

Fortunately, we have Maciej and PSNC in the SA6 group.

Thanks a lot.

Andras

--
Nicolas
______________________________________________________________________

Nicolas Simar
Network Engineer

DANTE - www.dante.net

Tel - BE: +32 (0) 4 366 93 49
Tel - UK: +44 (0)1223 371 300
Mobile: +44 (0) 7740 176 883

City House, 126-130 Hills Road
Cambridge CB2 1PQ
UK
_____________________________________________________________________

--
Fausto Vetter
NPD/UFSC

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

