
perfsonar-user - Re: [perfsonar-user] Node not reporting to Maddash

Subject: perfSONAR User Q&A and Other Discussion

  • From: Raul Lopes
  • To: Mark Feit, Raul Lopes
  • Subject: Re: [perfsonar-user] Node not reporting to Maddash
  • Date: Wed, 7 Jul 2021 10:00:58 +0000

Hi,

I've rebooted one of the nodes that is failing to publish, so that I could watch all the services start. I see strange errors in messages:

Jul  7 10:53:10 ps07-em1 journal: ticker WARNING  Queue maintainer got exception server closed the connection unexpectedly
Jul  7 10:53:10 ps07-em1 journal: ticker WARNING  #011This probably means the server terminated abnormally
Jul  7 10:53:10 ps07-em1 journal: ticker WARNING  #011before or while processing the request.
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    Program threw an exception after -1 day, 23:00:11.484495
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    Exception: psycopg2.OperationalError: server closed the connection unexpectedly
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    #011This probably means the server terminated abnormally
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    #011before or while processing the request.
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    Traceback (most recent call last):
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR      File "/usr/lib/python3.6/site-packages/pscheduler/saferun.py", line 76, in safe_run
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR        function()
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR      File "/usr/libexec/pscheduler/daemons/runner", line 986, in <lambda>
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR        pscheduler.safe_run(lambda: main_program())
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR      File "/usr/libexec/pscheduler/daemons/runner", line 879, in main_program
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR        cursor.execute("SELECT heartbeat('runner')")
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    psycopg2.OperationalError: server closed the connection unexpectedly
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    #011This probably means the server terminated abnormally
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    #011before or while processing the request.
Jul  7 10:53:10 ps07-em1 journal: safe_run/runner ERROR    Waiting 0.25 seconds before restarting


Would anyone have a clue?

Raul
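
A note for anyone reading the log excerpt above: the `#011` sequences are how rsyslog writes a control character (a `#` followed by three octal digits, so `#011` is a tab), which is why the PostgreSQL error text looks broken across lines. The following is a minimal sketch, not part of perfSONAR or pscheduler, that decodes those escapes and counts messages per daemon and level. The function names and the line layout in the regular expression are assumptions based only on the lines quoted in this message.

```python
import re

# rsyslog writes control characters as '#' plus three octal digits
# (for example '#011' for a tab).
ESCAPE = re.compile(r"#([0-7]{3})")

# Assumed layout, inferred only from the excerpt in this thread:
# "<timestamp> <host> journal: <component> <LEVEL> <message>"
LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\sjournal:\s"
    r"(?P<component>\S+)\s(?P<level>[A-Z]+)\s*(?P<message>.*)$"
)


def decode_escapes(text):
    """Turn '#ooo' octal escapes back into the characters they stand for."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), text)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["message"] = decode_escapes(fields["message"])
    return fields


def summarize(lines):
    """Count matching lines per (component, level) pair."""
    counts = {}
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue
        key = (fields["component"], fields["level"])
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Feeding the lines of the messages file to `summarize` gives a quick view of which daemons are logging errors. In the excerpt above, both `ticker` and `runner` report the same lost database connection.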

From: <> on behalf of Raul Lopes <>
Sent: 06 July 2021 20:38
To: Mark Feit <>; <>
Subject: Re: [perfsonar-user] Node not reporting to Maddash
 
Hi,

Is this normal?

[root@ps01-em1 PSREMOTE]# systemctl status cassandra.service
● cassandra.service - SYSV: Starts and stops Cassandra
   Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
   Active: active (exited) since Tue 2021-07-06 11:50:29 BST; 8h ago
     Docs: man:systemd-sysv-generator(8)
  Process: 25742 ExecStop=/etc/rc.d/init.d/cassandra stop (code=exited, status=1/FAILURE)
  Process: 25811 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited, status=0/SUCCESS)

Jul 06 11:50:28 ps01-em1.rl.ac.uk systemd[1]: cassandra.service: control process exited, code=exited status=1
Jul 06 11:50:28 ps01-em1.rl.ac.uk systemd[1]: Stopped SYSV: Starts and stops Cassandra.
Jul 06 11:50:28 ps01-em1.rl.ac.uk systemd[1]: Unit cassandra.service entered failed state.
Jul 06 11:50:28 ps01-em1.rl.ac.uk systemd[1]: cassandra.service failed.
Jul 06 11:50:28 ps01-em1.rl.ac.uk systemd[1]: Starting SYSV: Starts and stops Cassandra...
Jul 06 11:50:28 ps01-em1.rl.ac.uk su[25820]: (to cassandra) root on none
Jul 06 11:50:29 ps01-em1.rl.ac.uk cassandra[25811]: Starting Cassandra: OK
Jul 06 11:50:29 ps01-em1.rl.ac.uk systemd[1]: Started SYSV: Starts and stops Cassandra.


I assume it is.

Regards, Raul

From: Mark Feit <>
Sent: 06 July 2021 15:06
To: Raul Lopes <>; <>
Subject: Re: Node not reporting to Maddash
 

Raul Lopes writes:

pscheduler result doesn't give any hint of error.

Archivings:

  To esmond, Finished
    2021-07-06T12:52:52+01:00 Succeeded

That being the case, the Esmond archiver believes it successfully handed the data over to Esmond and there’s an internal problem.  That’s definitely Andy territory.

--Mark

 





