While working with Cassandra, we found that snapshots are sometimes left behind after repairs fail or run into problems. These can waste hundreds of gigabytes of disk space, which adds up quickly. On a node with low disk space, we can check for stale snapshots by looking at the size of each column family's snapshots directory.
If you see large or multiple directories under columnFamily/snapshots/, that may indicate stale snapshots that can be cleaned up.
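As a rough sketch of that size check, the helper below prints the disk usage of every snapshot directory it finds. The function name `snapshot_sizes`, the default path `/var/lib/cassandra/data`, and the `keyspace/columnFamily/snapshots/tag` layout are assumptions for illustration; adjust them to match your own data directories.

```shell
# Hypothetical helper: print the size of every snapshot directory under a
# Cassandra data directory. The default path is an assumption, not something
# taken from this post; point it at your own data directory.
snapshot_sizes() {
    data_dir="${1:-/var/lib/cassandra/data}"
    # Assumed layout: <data_dir>/<keyspace>/<columnFamily>/snapshots/<tag>
    for snap in "$data_dir"/*/*/snapshots/*/; do
        [ -d "$snap" ] || continue   # skip when the glob matches nothing
        du -sh "$snap"
    done
}
```

Calling `snapshot_sizes` with no argument scans the assumed default path; passing a directory scans that one instead.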
We will want to check when these snapshots were created, so inside the snapshots directory of the column family, list the contents along with their timestamps.
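One way to do that listing is a plain `ls -lt`, which sorts entries newest first so that old snapshots sink to the bottom. The function name `snapshot_ages` and its argument are hypothetical, and this is a sketch rather than the exact command used here.

```shell
# Hypothetical helper: list the snapshots of one column family, newest first,
# together with their modification times.
snapshot_ages() {
    cf_dir="$1"   # path to <data_dir>/<keyspace>/<columnFamily>
    ls -lt "$cf_dir/snapshots"
}
```

A snapshot whose timestamp is far older than the rest, and that does not correspond to a backup you took on purpose, is a likely candidate for cleanup.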
Each of those is a snapshot, but you will notice that 4f37dcd0-c7fa-11e4-b5ae-5f969a9b23c8 is stale. We can now clean it up with the nodetool clearsnapshot command.
That will take care of removing the stale snapshot, and you should see disk space recovered immediately.
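A minimal sketch of that cleanup, assuming the stale directory name is passed as the snapshot tag through `-t`. The wrapper name `clear_stale_snapshot` and the `DRY_RUN` switch are hypothetical additions; they only exist so the command can be printed and double-checked before anything is deleted.

```shell
# Hypothetical wrapper around nodetool clearsnapshot. With DRY_RUN=1 it only
# prints the command instead of running it.
clear_stale_snapshot() {
    tag="$1"        # snapshot name, i.e. the stale directory name
    keyspace="$2"   # optional: limit the cleanup to one keyspace
    set -- nodetool clearsnapshot -t "$tag"
    if [ -n "$keyspace" ]; then
        set -- "$@" -- "$keyspace"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1 clear_stale_snapshot 4f37dcd0-c7fa-11e4-b5ae-5f969a9b23c8` prints the nodetool command for the stale snapshot above without running it.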
You can also look in the logs for failed repairs, where we see something like the following:
In that case, the bad snapshot would be 72e69720-b959-11e4-9b55-39152d07d3bf.