
perfsonar-user - Re: [perfsonar-user] Latency data not displayed in chart after 3.5-1 upgrade

Subject: perfSONAR User Q&A and Other Discussion

  • From: Andrew Lake
  • To: Michael Johnson, "Christopher J. Tengi"
  • Cc: perfsonar-user, Hyojoon Kim
  • Subject: Re: [perfsonar-user] Latency data not displayed in chart after 3.5-1 upgrade
  • Date: Mon, 14 Mar 2016 16:33:40 -0400

Hi all,

I think we have a fix for this as well. The problem was that only the first 1000 results were being returned to the graphs. With 24 hours’ worth of 1-minute summaries for latency data, that’s 1440 points, so the graphs got cut off. I just pushed out a new esmond RPM that should return the rest of the results. It should rsync its way into the main yum repo in the next 30 minutes and then work its way through the rest of the mirrors over the next day or so. The new version is 2.0.2. As soon as you get that RPM, your graphs should look correct again with no further action.
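
The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not esmond code: the function names are hypothetical, and it assumes the capped results are returned oldest first. The query time of 2:54 pm on March 9 comes from the original report further down the thread.

```python
from datetime import datetime, timedelta

RESULT_LIMIT = 1000        # cap on returned results described in this message
SUMMARY_INTERVAL_MIN = 1   # latency data is summarized once per minute


def points_in_window(hours, interval_min=SUMMARY_INTERVAL_MIN):
    """Number of summary points needed to cover a window of `hours`."""
    return hours * 60 // interval_min


def cutoff_time(query_time, window_hours=24, limit=RESULT_LIMIT,
                interval_min=SUMMARY_INTERVAL_MIN):
    """Time at which a capped, oldest-first result set would stop (assumption)."""
    window_start = query_time - timedelta(hours=window_hours)
    if points_in_window(window_hours, interval_min) <= limit:
        return query_time  # everything fits, nothing is cut off
    return window_start + timedelta(minutes=limit * interval_min)


query = datetime(2016, 3, 9, 14, 54)   # query time from the original report
print(points_in_window(24))            # 1440
print(cutoff_time(query))              # 2016-03-09 07:34:00
```

Under those assumptions, the computed cutoff of 07:34 lines up with the "around 7:33 am" cutoff in the original report.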

Thanks,
Andy



On March 11, 2016 at 9:32:23 AM, Christopher J. Tengi wrote:

Michael,
The interesting thing, to me, is that ping data is graphing just fine. It is the latency and loss data that is problematic. Attached are 2 images of 1-day graphs generated on one of our endpoints that is testing against a machine at Yale. The graph is being generated on the toolkit host perfsonar-87prospect.princeton.edu, using the local esmond to generate the graph. Note that on the 1-day graph from today (where latency is predominant), the latency and loss numbers stop around 02:00. Clicking on the “Previous 1d” link generates a graph (with loss predominant) with a similar cut-off around 02:00. The ping data just chugs on steadily through both graphs. That just seems odd to me.

/Chris






On Mar 10, 2016, at 4:11 PM, Michael Johnson wrote:

Hi Joon,

Thanks for following up with more information -- we're still trying to figure out where the problem is. We will most likely wait until next week, since one developer who I want to discuss this with is gone this week. 
Thanks,
Michael

On Wed, Mar 09, 2016 at 11:43:06PM +0000, Hyojoon Kim wrote:
Hello Michael,

Thanks for filing a bug report!

In the MA machine:
I have not found any error messages in /var/log/cassandra/cassandra.log or /var/log/cassandra/system.log. However, I did find some interesting errors in /var/log/esmond/django.log and /var/log/esmond.log. I am attaching them. One interesting error message in django.log is
OverflowError: list size out of the sanity limit (10000 items max)
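
A quick way to look for this error in a log is sketched below. The helper name is hypothetical, and the sample list only stands in for a real log file; the single error line in it is the one quoted above.

```python
def find_overflow_errors(lines, needle="OverflowError"):
    """Return (line_number, text) pairs for every line containing `needle`."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line
    ]


# Placeholder input standing in for the contents of a log file.
sample = [
    "(placeholder for an unrelated log line)\n",
    "OverflowError: list size out of the sanity limit (10000 items max)\n",
]

for number, text in find_overflow_errors(sample):
    print(number, text)
```

To scan a real file, pass an open file object instead of the sample list, for example `open("/var/log/esmond/django.log")`.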

One thing to mention: the error message above appears only on the MA machine. However, the charts viewed directly from the test point node (which has the toolkit installed) show the same display issue. (The test point node sends data both to the MA machine’s esmond and to its local esmond.) The test point node with the toolkit does *not* have any error messages in any of the log files mentioned above.

Thanks,
Joon




On Mar 9, 2016, at 5:12 PM, Michael Johnson wrote:

Hi Hyojoon,

Thanks for your bug report. We're looking into it. Looking at your MA, it looks like you're getting 500 errors for some requests, for example:

http://perfsonar-ma-2.princeton.edu/esmond/perfsonar/archive/dfbfe496326a49d1a55219ac831f9b26/histogram-owdelay/statistics/0?format=json

Do you see any interesting errors in /var/log/cassandra/cassandra.log or /var/log/cassandra/system.log? These could be preventing the graphs from getting all the necessary data.
I suspect there are some issues with your MA's summary windows, although it's not quite clear yet where the problem lies. I've opened a bug report here:
https://github.com/perfsonar/graphs/issues/33
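
The 500 responses mentioned above can be confirmed from any machine that can reach the MA. The following is a minimal sketch using only the Python standard library; the `http_status` helper and its `opener` parameter are hypothetical names introduced for illustration, not part of esmond or the toolkit.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for `url`, including error codes like 500.

    `opener` defaults to urllib.request.urlopen and can be swapped out,
    which makes the function easy to exercise without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is on the exception.
        return err.code
```

Calling `http_status(...)` with the esmond URL quoted above would return 500 if the request still fails the same way, and 200 once it succeeds.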

I put it with the graphs, even though it's not clear yet whether it's a problem with the charts themselves, or esmond.

Thanks,
Michael

On Wed, Mar 09, 2016 at 08:25:22PM +0000, Hyojoon Kim wrote:
Hi,

After an upgrade to v3.5-1, the chart that shows measured data seems to have a display issue. To be more specific, the chart does not display all available ‘Latency’ data it has *when zoomed to 1 day* by mouse *click* on ‘1d’ above the chart (not by window dragging, which is below the chart).

For example, the following links show charts that display measurement data between two nodes (128.112.228.24 and 128.112.228.23). The query was made on March 9, 2016, at around *2:54 pm*. However, when zoomed to *1 day*, the display cuts off at around 7:33 am. When zoomed to *3 days*, the data is displayed up to roughly 2:54 pm.

In other words, zooming to *1d* seems to cut off displayed latency data, even though the data *is* actually there.  Note that ‘Ping’ data is not cut off.

Zoom: 1 day (1d)
https://perfsonar-ma-2.princeton.edu/perfsonar-graphs/graphWidget.cgi?url="https://perfsonar-ma-2.princeton.edu/esmond/perfsonar/archive&source=perfsonar-hpcrc-delay.princeton.edu&dest=perfsonar-87prospect-delay.princeton.edu#timeframe=1d
Zoom: 3 day (3d)
https://perfsonar-ma-2.princeton.edu/perfsonar-graphs/graphWidget.cgi?url="https://perfsonar-ma-2.princeton.edu/esmond/perfsonar/archive&source=perfsonar-hpcrc-delay.princeton.edu&dest=perfsonar-87prospect-delay.princeton.edu#timeframe=3d

* Zoomed to between 6am - 2:54pm by dragging the window with mouse, so that the difference between 1d and 3d is more obvious (i.e., 1d zoom is missing latency data)

Zoom: 1 day
https://perfsonar-ma-2.princeton.edu/perfsonar-graphs/graphWidget.cgi?url="https://perfsonar-ma-2.princeton.edu/esmond/perfsonar/archive&source=perfsonar-hpcrc-delay.princeton.edu&dest=perfsonar-87prospect-delay.princeton.edu#timeframe=1d&zoom_start=1457520917158&zoom_end=1457553245000
Zoom: 3 day
https://perfsonar-ma-2.princeton.edu/perfsonar-graphs/graphWidget.cgi?url="https://perfsonar-ma-2.princeton.edu/esmond/perfsonar/archive&source=perfsonar-hpcrc-delay.princeton.edu&dest=perfsonar-87prospect-delay.princeton.edu#timeframe=3d&zoom_start=1457520917158&zoom_end=1457553245000
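
The `zoom_start` and `zoom_end` values in these links look like Unix timestamps in milliseconds. Assuming that is what they are, the sketch below decodes them from the URL fragment; the `zoom_window` helper is a hypothetical name, and the URL is the 1-day link copied as-is from above.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# The zoomed 1-day link from above, copied verbatim.
EXAMPLE_URL = (
    'https://perfsonar-ma-2.princeton.edu/perfsonar-graphs/graphWidget.cgi'
    '?url="https://perfsonar-ma-2.princeton.edu/esmond/perfsonar/archive'
    '&source=perfsonar-hpcrc-delay.princeton.edu'
    '&dest=perfsonar-87prospect-delay.princeton.edu'
    '#timeframe=1d&zoom_start=1457520917158&zoom_end=1457553245000'
)


def zoom_window(url):
    """Decode zoom_start/zoom_end (assumed epoch milliseconds) as UTC datetimes."""
    params = parse_qs(urlsplit(url).fragment)

    def to_utc(key):
        seconds = int(params[key][0]) // 1000  # drop the millisecond part
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return to_utc("zoom_start"), to_utc("zoom_end")


start, end = zoom_window(EXAMPLE_URL)
print(start)  # 2016-03-09 10:55:17+00:00
print(end)    # 2016-03-09 19:54:05+00:00
```

If the times in this report are US Eastern (UTC-5 on that date), those values correspond to about 5:55 am and 2:54 pm, which matches the window described above.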

Any help or insight would be appreciated!


Thanks,
Joon

--
Michael Johnson
GlobalNOC Software Engineering
Indiana University

812-856-2771





