
perfsonar-user - [perfsonar-user] Re: Trying to debug - Test Results (No Results)

Subject: perfSONAR User Q&A and Other Discussion

  • From: Brian Candler <>
  • To: "" <>
  • Subject: [perfsonar-user] Re: Trying to debug - Test Results (No Results)
  • Date: Fri, 14 Dec 2018 09:43:27 +0000

On 14/12/2018 09:16, Brian Candler wrote:
Now the problem is clear: cassandra database is corrupt, and it fails to start.  How to fix this, short of reinstalling the whole node, is an issue which can be dealt with separately.

FYI, I found this guide as a starting point.

# systemctl stop cassandra
# nodetool scrub
Failed to connect to '127.0.0.1:7199': Connection refused (Connection refused)
# sstablescrub
Missing arguments
usage: sstablescrub [options] <keyspace> <column_family>
...

And I learned a bit about where cassandra stores keyspaces and column families.  So my first attempt was to clean the esmond databases:

# ls /var/lib/cassandra/data/
esmond  system  system_traces
# ls /var/lib/cassandra/data/esmond/
base_rates  rate_aggregations  raw_data  stat_aggregations
# sstablescrub esmond base_rates
... etc

These were fine - nothing needed fixing, but cassandra still did not start.
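
For reference, the per-column-family scrub above could be looped over a whole keyspace. This is only a minimal sketch, not the exact commands run here: scrub_keyspace and SCRUB_CMD are hypothetical names, and it assumes cassandra is stopped and that each subdirectory under the keyspace directory is named exactly after its column family, as in the ls output above (newer cassandra releases may name these directories differently).

```shell
#!/bin/sh
# Sketch: offline-scrub every column family of one keyspace.
# Assumptions: cassandra is stopped; directory names under
# <data_dir>/<keyspace>/ equal the column family names.
# scrub_keyspace and SCRUB_CMD are illustrative names, not part of cassandra.

scrub_keyspace() {
    keyspace=$1
    data_dir=${2:-/var/lib/cassandra/data}
    for dir in "$data_dir/$keyspace"/*/; do
        # Skip the unexpanded glob pattern if the keyspace has no subdirectories
        [ -d "$dir" ] || continue
        cf=$(basename "$dir")
        echo "== scrubbing $keyspace.$cf"
        # SCRUB_CMD can be overridden (e.g. with echo) for a dry run
        ${SCRUB_CMD:-sstablescrub} "$keyspace" "$cf" \
            || echo "scrub failed for $keyspace.$cf" >&2
    done
}
```

Running "SCRUB_CMD=echo scrub_keyspace esmond" first prints the keyspace and column family pairs that would be scrubbed without touching any data.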

Looking at the log message more carefully, it's "system local" which is corrupt.

When I try to scrub this, it goes into an infinite loop:

# sstablescrub system local 2>&1 | head -50
ERROR 09:32:47,962 Unable to initialize MemoryMeter (jamm not specified as javaagent).  This means Cassandra will be unable to measure object sizes accurately and may consequently OOM.
Pre-scrub sstables snapshotted into snapshot pre-scrub-1544779968622
Scrubbing SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-41-Data.db') (11956 bytes)
Scrub of SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-41-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
Scrubbing SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-42-Data.db') (12600 bytes)
Scrub of SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-42-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
Scrubbing SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-40-Data.db') (156 bytes)
WARNING: Error reading row (stacktrace follows):
Retrying from row index; data is 149 bytes starting at 7
WARNING: Retry failed too. Skipping to next row (retry's stacktrace follows)
WARNING: Error reading row (stacktrace follows):
WARNING: Row starting at position 7 is unreadable; skipping to next
WARNING: Error reading row (stacktrace follows):
WARNING: Row starting at position 7 is unreadable; skipping to next
WARNING: Error reading row (stacktrace follows):
WARNING: Row starting at position 7 is unreadable; skipping to next
WARNING: Error reading row (stacktrace follows):
WARNING: Row starting at position 7 is unreadable; skipping to next
WARNING: Error reading row (stacktrace follows):
WARNING: Row starting at position 7 is unreadable; skipping to next
WARNING: Error reading row (stacktrace follows):
WARNING: Row starting at position 7 is unreadable; skipping to next
WARNING: Error reading row (stacktrace follows):
WARNING: Row starting at position 7 is unreadable; skipping to next
WARNING: Error reading row (stacktrace follows):
... etc

I get the same with "-s" or "-n" flags.
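
A scrub that keeps reporting the same unreadable row, as in the output above, can be spotted automatically instead of by eye. The following is a rough sketch only: detect_scrub_loop is a hypothetical helper, and it assumes the warning text matches the "Row starting at position N is unreadable" lines shown above.

```shell
#!/bin/sh
# Sketch: read scrub output on stdin and flag a suspected loop when the
# same "unreadable row" warning has been seen `limit` times.
# detect_scrub_loop is an illustrative name, not a cassandra tool.

detect_scrub_loop() {
    awk -v limit="${1:-5}" '
        /Row starting at position [0-9]+ is unreadable/ {
            seen[$0]++
            if (seen[$0] >= limit) {
                print "loop suspected: " $0
                found = 1
                exit            # stop reading; END still runs
            }
        }
        END { exit (found ? 1 : 0) }
    '
}
```

It would be used as "sstablescrub system local 2>&1 | detect_scrub_loop 5", returning non-zero when a loop is suspected. Whether the scrub process itself then stops depends on how it reacts to the closed pipe, so it may still need to be killed by hand.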

Since I don't have any replicas to recover data from, I'm a bit stuck now.

Can I just uninstall and reinstall the cassandra package - or will this leave me with missing databases which won't be recreated automatically?  In which case, is a full node reinstall required here?

Thanks,

Brian.



