perfsonar-user - Re: [perfsonar-user] help with configuration

Subject: perfSONAR User Q&A and Other Discussion

  • From: Mark Feit <>
  • To: Petr Šesták <>, "" <>
  • Subject: Re: [perfsonar-user] help with configuration
  • Date: Wed, 7 Aug 2024 15:04:21 +0000

Petr Šesták writes:

 

Aug 05 10:26:13 ps01-l.farm.particle.cz runner[2405952]: runner WARNING 39765: Unable to retrieve run 400: Operation timed out after 14500 milliseconds with 0 bytes received

 

The host at the other end (dmz-perf.uark.edu) or the network between it and your system has problems.  I’ll cover that below.

 

Aug 05 10:27:54 ps01-l.farm.particle.cz scheduler[2398130]: scheduler INFO     23438: Posting non-starting run at 2024-08-05T19:21:56+02:00 for task e68c4cfa-6133-48eb-8ed9-b4cfbddf5a51: Gave up after too many scheduling conflicts.

 

This happens when the scheduler repeatedly runs into conflicts while trying to schedule a run and eventually gives up.  It should be a very rare event.

 

This task was still in your system, so I pulled it out through the API and determined that the other end is the system at UArk.  After running a couple of tests and other checks, I noticed that its clock is not synchronized and has drifted by nearly nine minutes:

 

$ pscheduler clock ps01-l.farm.particle.cz  130.184.244.8

clock      2024-08-07T15:05:34.982705+02:00 synchronized   ps01-l.farm.particle.cz

clock      2024-08-07T08:56:50.204645-04:00 unsynchronized 130.184.244.8

difference PT8M44.77806S unsafe

 

That will make it unable to properly run tests involving multiple pSchedulers (i.e., throughput).  The clock on your system is correctly synchronized (checked against one of mine).  The UArk system also appears to have intermittent problems scheduling and running tests.  Based on that, I’d say your system is fine and that you should disregard any errors that result from tasks involving the UArk system.
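
For anyone who wants to check the arithmetic behind the "difference" line, here is a minimal Python sketch that recomputes the offset from the two timestamps in the clock output above.  It is not how pScheduler itself does the comparison; the function name clock_difference is made up for illustration, and the only inputs are the two timestamps copied from that output.

```python
from datetime import datetime

# Timestamps copied verbatim from the `pscheduler clock` output above.
# Both carry UTC offsets, so Python treats them as timezone-aware.
local = datetime.fromisoformat("2024-08-07T15:05:34.982705+02:00")
remote = datetime.fromisoformat("2024-08-07T08:56:50.204645-04:00")


def clock_difference(a, b):
    """Return the absolute offset between two timezone-aware timestamps.

    Hypothetical helper for illustration; not part of pScheduler.
    """
    return abs(a - b)


diff = clock_difference(local, remote)

# Split the offset into whole minutes and leftover seconds for display.
minutes, seconds = divmod(diff.total_seconds(), 60)
print(f"{int(minutes)}m {seconds:.5f}s")
```

Subtracting timezone-aware datetimes normalizes both to UTC first, so the differing offsets (+02:00 and -04:00) do not need to be handled by hand.  The result agrees with the PT8M44.77806S that pScheduler reported.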

 

 

Aug 03 19:50:33 ps01-l.farm.particle.cz archiver[1055237]: archiver WARNING  3204: Failed to archive to http: Possibly-permanent archiver error: Program failed to start after 3 tries: Timed out waiting for read

 

This task is no longer in your system, so I wasn’t able to look at what it was trying to do.  If you can point me at something from the last 48 hours, I can take a look at it.

 

Hope that helps.

 

--Mark

 

 



