Nodetool is an out-of-the-box (OOTB) tool that ships with Cassandra. It provides various options for managing keyspaces. I came across three useful options:
1) I had to copy a keyspace from one server to another. Because the source data came from multiple servers, the data has to be cleaned up once all the db files and their related files are copied, so that keys the node does not own are removed. Nodetool does this. It usually resides in <cassandra-path>/bin. To run the cleanup, use the following command:
nodetool -h <hostname of the cassandra> cleanup
Depending on the amount of data to be cleaned, this will take some time. To track the progress, you can use:
nodetool compactionstats
This will print the following:
pending tasks: 1
compaction type   keyspace          column family          completed    total        unit    progress
Cleanup           <keyspace-name>   <column-family-name>   1667500496   2004577146   bytes   83.18%
Note: the value in the total column changes while the cleanup runs, because the size is calculated in real time.
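Since the total keeps moving, it can help to recompute the percentage from the completed and total columns instead of trusting a single reading. Here is a minimal sketch, assuming the seven-column layout shown above and keyspace and column family names without spaces. The function name cleanup_progress is my own, not part of nodetool, and other Cassandra versions may print extra columns.

```shell
# Hypothetical helper: reads compactionstats output on stdin and prints
# "<keyspace>.<column family> <percent>" for every Cleanup row.
# Assumes the column order: type, keyspace, column family, completed, total, unit, progress.
cleanup_progress() {
  awk '$1 == "Cleanup" && $5 > 0 {
    # completed / total, as a percentage with two decimals
    printf "%s.%s %.2f%%\n", $2, $3, ($4 / $5) * 100
  }'
}

# Usage (needs a reachable Cassandra node):
#   nodetool compactionstats | cleanup_progress
```

Running it every few seconds gives a rough feel for how far the cleanup has come.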
2) To get the statistics of all the keyspaces, and especially a rough idea of how many keys each column family holds, use:
nodetool -h <hostname> cfstats
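The cfstats output is long, so a small filter helps when only the key counts matter. Below is a rough sketch. It assumes the output contains "Keyspace:" lines, "Column Family:" (or "Table:" in later versions) lines, and a "Number of Keys (estimate):" line per column family. Those labels are an assumption on my part and differ between Cassandra versions, so check them against your own output. The function name keys_per_cf is made up for this example.

```shell
# Hypothetical helper: reads cfstats output on stdin and prints
# "<keyspace>.<column family> <estimated keys>" per column family.
# The labels matched below are assumptions and vary by Cassandra version.
keys_per_cf() {
  awk '
    $1 == "Keyspace:"                        { ks = $NF }   # remember current keyspace
    ($1 == "Column" && $2 == "Family:") ||
      $1 == "Table:"                         { cf = $NF }   # remember current column family
    /Number of [Kk]eys [(]estimate[)]:/      { print ks "." cf, $NF }
  '
}

# Usage (needs a reachable Cassandra node):
#   nodetool -h <hostname> cfstats | keys_per_cf
```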