perfsonar-user - Re: [perfsonar-user] Problem -- throughput/owamp service graphs not showing

Subject: perfSONAR User Q&A and Other Discussion

  • From: Zafar Gilani
  • Cc: Performance Node Users
  • Subject: Re: [perfsonar-user] Problem -- throughput/owamp service graphs not showing
  • Date: Mon, 29 Nov 2010 22:16:55 +0500

Hi Jason,

Thanks for telling me this, I didn't take this into account. I'm already able to see some graphs. I've also replaced NTP servers with the following Stratum 1 NTP servers:

clock.cuhk.edu.hk (Hong Kong)
clock.nc.fukuoka-u.ac.jp (Japan)
clock.tl.fukuoka-u.ac.jp (Japan)
jamtepat.singnet.com.sg (Singapore)
nets.org.sg (Singapore)

These are taken from http://kopernix.com/?q=node/23

On Mon, Nov 29, 2010 at 6:57 PM, Jason Zurawski wrote:
Hi Zafar;

The 'pool' servers may be better than the US based servers when it comes to the latency metric, but they are still not great in other respects.  The NTP pool works by handing you a potentially different server (through the generic DNS name) each time your server requests a time heartbeat.  This usually translates into high jitter, which is also not good for the measurement tools.  I think it would be best if you could find some stable stratum 1 or stratum 2 clocks in the region that you could poll directly instead of using the pool.

Let us know how the rest of the testing goes when you are able to SSH or otherwise access the machines.

Thanks;

-jason


On 11/27/10 1:14 AM, Zafar Gilani wrote:
I've actually found and added some Asian NTP servers:

pk.pool.ntp.org
0.asia.pool.ntp.org
1.asia.pool.ntp.org
2.asia.pool.ntp.org
3.asia.pool.ntp.org


On Sat, Nov 27, 2010 at 11:09 AM, Zafar Gilani wrote:

   Hi Jason,

   I've changed the NTP servers to closest ones but don't know of the
   closest to Pakistan. I will try to get the list and add those as
   replacement. We've blocked SSH on these machines so I won't be able
   to run bwctl tests remotely. Will let you know on these. Thanks for
   all the help! I really appreciate this.


   On Fri, Nov 26, 2010 at 6:06 PM, Jason Zurawski wrote:

       Hi Zafar;


       On 11/26/10 6:32 AM, Zafar Gilani wrote:

           Hi Jason,

           Checked server time, it seems to be fine.



       One thought that springs to mind: are these hosts physically
       located in Pakistan?  The 'default' NTP servers that come with
       the toolkit are located in the US on R&E networks.  The NTP
       peer status can be found in this manner:

           [root@lab236 ~]# ntpq -p
                remote           refid      st t when poll reach   delay   offset  jitter
           ==============================================================================
           -otc2.psu.edu    147.84.59.145    2 u  565 1024  377   23.018    0.010   0.110
           *navobs1.oar.net .USNO.           1 u  697 1024  377    7.271   -0.125   0.093
           +2001:468:1:12:: 130.207.244.240  2 u  669 1024  377   27.101   -0.005   0.137
           -2001:468:2:12:: 64.57.16.34      2 u  590 1024  377   30.220   -0.251   0.081
           +2001:468:6:12:: .PPS.            1 u  611 1024  377   18.816   -0.165   0.076


       In the case above, the 'delay' field shows how close the target
       NTP servers are to my host.  Having these be small values is
       good.  The latency to synchronize over the distance from the US
       to Pakistan (more than 300ms when I did a ping) will have an
       effect on the accuracy of the measurement tools.  This could
       contribute to the error you report below.
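
The 'delay' check described above can also be scripted. What follows is a minimal sketch, not part of the toolkit: the function names and the 100 ms threshold are assumptions made for illustration, and the sample rows are the ones from the `ntpq -p` output in this message. The delay, offset and jitter columns are in milliseconds.

```python
# Peer rows copied from the `ntpq -p` output shown in this message.
SAMPLE_OUTPUT = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-otc2.psu.edu    147.84.59.145    2 u  565 1024  377   23.018    0.010   0.110
*navobs1.oar.net .USNO.           1 u  697 1024  377    7.271   -0.125   0.093
+2001:468:1:12:: 130.207.244.240  2 u  669 1024  377   27.101   -0.005   0.137
-2001:468:2:12:: 64.57.16.34      2 u  590 1024  377   30.220   -0.251   0.081
+2001:468:6:12:: .PPS.            1 u  611 1024  377   18.816   -0.165   0.076
"""


def parse_ntpq_peers(output):
    """Turn the peer rows of `ntpq -p` output into a list of dicts."""
    peers = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # separator line or anything that is not a peer row
        try:
            delay, offset, jitter = (float(value) for value in fields[7:10])
        except ValueError:
            continue  # header row: the last three columns are not numbers
        # Simplification: only strip tally marks that cannot begin a hostname.
        remote = fields[0].lstrip("*+-#")
        peers.append(
            {"remote": remote, "delay": delay, "offset": offset, "jitter": jitter}
        )
    return peers


def distant_peers(peers, max_delay_ms=100.0):
    """Return the peers whose delay exceeds the (arbitrary) threshold."""
    return [peer["remote"] for peer in peers if peer["delay"] > max_delay_ms]


if __name__ == "__main__":
    for peer in parse_ntpq_peers(SAMPLE_OUTPUT):
        print(peer)
```

On a live host the text would come from running `ntpq -p` rather than from the embedded sample. A peer reached at roughly 300 ms, as in the ping from the US to Pakistan mentioned above, would be reported by `distant_peers` with the default threshold.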

       If you know of NTP servers that are 'closer' it would be good to
       add them to the NTP configuration file manually.  Edit
       '/etc/ntp.conf' to add lines like this:

         server NTP_SERVER_IN_PAKISTAN iburst

       And either comment out or delete the other server lines.  Try to
       find at least 3 (preferably 5) in that region of the world.
       Restart NTP (sudo /etc/init.d/ntpd restart) after making any
       changes.
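
The edit to '/etc/ntp.conf' described above can be sketched as a small text transformation. This is an illustration under assumptions, not a toolkit utility: the function name is hypothetical, it works on the file contents as a string, and it comments out every active `server` line, including a local clock entry if one exists. The hostnames in the usage lines are the regional servers named elsewhere in this thread.

```python
def replace_ntp_servers(conf_text, new_servers):
    """Comment out active `server` lines and append the new servers with iburst."""
    out = []
    for line in conf_text.splitlines():
        # An active server line has "server" as its first word.
        if line.split()[:1] == ["server"]:
            out.append("# " + line)
        else:
            out.append(line)
    out.extend(f"server {host} iburst" for host in new_servers)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = "driftfile /var/lib/ntp/drift\nserver old.example.org\n"
    print(replace_ntp_servers(original, ["clock.cuhk.edu.hk", "nets.org.sg"]))
```

The result would still have to be written back to '/etc/ntp.conf' and NTP restarted as described above.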



           The databases have data for both owamp and bwctl tables for
           LA and BW nodes respectively. However the tests return the
           following:

           For bwctl "remote peer cancelled test". However it does say
           server listening on xy port and binding to local address.
           For owamp tests I see first and last results but it also
           says 0 sent, 0 lost. This is strange.

           What do you think might be the problem? I do see data which
           suggests the firewall might not be the culprit.



       If you see data, then firewalls are probably not to blame.  Try
       to implement the NTP suggestion above, but you can also try
       this test immediately:

         bwctl -c BW_HOST_IN_SCHEDULE -a 5
         bwctl -s BW_HOST_IN_SCHEDULE -a 5

       The 'a' option tells BWCTL to allow an extra bit of offset (plus
       or minus the specified value) when calculating the measurement
       start/stop.  It's normally used in cases where the clock may be
       off between hosts.  Raise the value to 10 or even 20 and see if
       that allows the tests to complete.
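
Stepping through the values 5, 10 and 20 can be sketched as a loop. This is only an illustration: the function names are hypothetical, and it assumes that `bwctl` exits with status 0 when a test completes, which should be checked against the installed version. The commands themselves are the two shown above.

```python
import subprocess


def bwctl_pair(host, tolerance):
    """Build the client-side and server-side bwctl commands for one -a value."""
    value = str(tolerance)
    return [
        ["bwctl", "-c", host, "-a", value],
        ["bwctl", "-s", host, "-a", value],
    ]


def first_working_tolerance(host, tolerances=(5, 10, 20), run=subprocess.call):
    """Return the first -a value for which both directions succeed, else None.

    Assumption: an exit status of 0 means the test completed.
    """
    for tolerance in tolerances:
        if all(run(command) == 0 for command in bwctl_pair(host, tolerance)):
            return tolerance
    return None


if __name__ == "__main__":
    # BW_HOST_IN_SCHEDULE is the placeholder used in this message.
    for command in bwctl_pair("BW_HOST_IN_SCHEDULE", 5):
        print(" ".join(command))
```

The `run` parameter exists so the loop can be exercised without a BWCTL installation. If tests only complete at a large value, that points back to the clock synchronization problem rather than to a fix.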

       I also think upgrading to 3.2 is a good idea, but let's solve
       this problem first :)

       Thanks;

       -jason


           On Thu, Nov 25, 2010 at 10:31 PM, Jason Zurawski wrote:

               Hi Zafar;


               On 11/25/10 12:26 PM, Zafar Gilani wrote:

                   Hi Jason,

                   Thanks for the reply. I'm not able to access the
                   machines from my home, I'll try tomorrow morning and
                   let you know. Possibility of a firewall also comes to
                   my mind. I also checked system clock and it seemed to
                   be correct. Addresses:

                   BW: 115.186.131.105
                   LA: 115.186.131.107



               I am able to reach both.  One thing I am noticing is that
               these are both 3.1.2.  I would suggest upgrading to 3.2 if
               at all possible.

               In any event, report back if you are seeing data when you
               get a chance.  Also, if you suspect a firewall may be an
               issue, try some simple tests to some of the hosts that are
               in the regular testing groupings, e.g.:

                 owping HOST_IN_LATENCY_TESTS
                 bwctl -c HOST_IN_BW_TESTS
                 bwctl -s HOST_IN_BW_TESTS

               If these fail for any of the hosts, firewalls may be an issue.

               Thanks;

               -jason


                   These machines are deployed at NUST, Islamabad, Pakistan.

                   On Thu, Nov 25, 2010 at 5:48 PM, Jason Zurawski wrote:

                       Hi Zafar;

                       I am CCing the performance-node-users list on this
                       response.  Comments below:


                       On 11/25/10 5:57 AM, Zafar Gilani wrote:

                           Hi,

                           I've deployed new BW and LA PS nodes but can't seem
                           to view any graphs. Whenever I go to Service Graphs ->
                           Throughput or Service Graphs -> One-Way Latency, I get
                           the following:

                           Problem Handling Request
                           MA http://localhost:8085/perfSONAR_PS/services/pSB/
                           did not return the expected response, be sure it is
                           configured and populated with data.

                           When I try to view an OWAMP graph, I receive the
                           following:

                               Internal Error - Service returned data, but it is
                               not plotable for this measurement pair.

                           Does anyone have any ideas? I've already checked the
                           scheduled tests. Everything seems to be correct. I
                           also checked
                           http://psps.perfsonar.net/toolkit/FAQs.html#Q19 but I
                           don't think this is the case, as it has been more than
                           24 hours since the nodes were deployed.


                       What is the address of this performance node?  Is it
                       world accessible or does it have a local address?

                       I would suggest looking in the database by hand to see
                       if there is recent data:

                          mysql -u root

                       Once at the mysql prompt try these commands:

                          use owamp;
                          show tables;

                       Look for a table that has data for 20101125_* or
                       similar.  Try the same for bwctl:

                          use bwctl;
                          show tables;
                          exit;

                       There should be data for 201011_*.  If you don't see
                       recent data tables, you are not getting data from the
                       regular tests (even though you did set them up).  This
                       could be because your system clock is not correct/not
                       running, firewalls are preventing the BWCTL/OWAMP
                       tests, or perhaps some other reason.
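
The table check described above can be sketched as a filter over the names printed by `show tables;`. This is a minimal illustration under assumptions: the function name is hypothetical, and the table suffixes in the usage lines (`_DATA`, `_NODES`) are invented placeholders, since the message only gives the `20101125_*` and `201011_*` prefixes.

```python
from datetime import date


def tables_for(table_names, day, monthly=False):
    """Return the tables whose name starts with the date prefix for `day`.

    Daily prefix:   YYYYMMDD_  (the owamp example above, 20101125_*)
    Monthly prefix: YYYYMM_    (the bwctl example above, 201011_*)
    """
    prefix = day.strftime("%Y%m" if monthly else "%Y%m%d") + "_"
    return [name for name in table_names if name.startswith(prefix)]


if __name__ == "__main__":
    # Hypothetical table names; paste the real `show tables;` output instead.
    owamp_tables = ["20101124_DATA", "20101125_DATA", "20101125_NODES"]
    print(tables_for(owamp_tables, date(2010, 11, 25)))
```

An empty result for the current date would suggest that the regular tests are not storing data, for the reasons listed above.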

                       Let us know what you see and we can debug
           further, thanks;

                       -jason



