A Cassandra tombstone dump incident occurs when a table accumulates too many tombstones (markers for deleted data). Reads must scan past these markers, so latency and heap pressure rise, and queries are aborted once the tombstone failure threshold is exceeded. The incident needs prompt attention from a software engineer because it can degrade the stability and availability of the whole system. Possible causes include misconfigured garbage collection settings or an application that generates tombstones faster than they are purged.
Parameters
Debug
Check Cassandra's status
Check for any errors in the Cassandra system log
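A minimal sketch of this check, assuming the default log layout in which each line starts with its level, and assuming the log is at `/var/log/cassandra/system.log` (typical for package installs, not confirmed by this runbook). The function name is illustrative.

```shell
# Print the most recent ERROR and WARN lines from a Cassandra system log.
# $1: log path (the default is an assumed package-install location).
# $2: maximum number of matching lines to show (default 20).
recent_log_errors() {
  log_file="${1:-/var/log/cassandra/system.log}"
  max_lines="${2:-20}"
  grep -E '^(ERROR|WARN)' "$log_file" | tail -n "$max_lines"
}
```

Call it as `recent_log_errors` for the default path, or pass a path and a line count.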
Check if any tombstone threshold has been exceeded
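Cassandra logs a warning when a read crosses the tombstone warn threshold and an error when a read is aborted at the failure threshold. The sketch below counts such lines by matching the word "tombstone" rather than an exact message, since the wording varies between versions. The log path and function name are assumptions.

```shell
# Count log lines that mention tombstones, split by severity.
# $1: log path (the default is an assumed package-install location).
tombstone_log_summary() {
  log_file="${1:-/var/log/cassandra/system.log}"
  # grep -c exits non-zero on zero matches, so guard with "|| true".
  warn_count=$(grep -i 'tombstone' "$log_file" | grep -c '^WARN' || true)
  error_count=$(grep -i 'tombstone' "$log_file" | grep -c '^ERROR' || true)
  echo "warn=$warn_count error=$error_count"
}
```

A non-zero `error` count suggests that queries were aborted at the failure threshold.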
Check the number of tombstones per partition
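The closest built-in metric is the tombstones-per-slice figure reported by `nodetool tablestats`, which is measured per read rather than strictly per partition. The sketch below extracts the maximum from that output. It reads from stdin, and the keyspace and table names in the usage line are placeholders.

```shell
# Extract the "Maximum tombstones per slice" value(s) from
# `nodetool tablestats` output supplied on stdin.
max_tombstones_per_slice() {
  awk -F': *' '/Maximum tombstones per slice/ { print $2 }'
}
```

For example: `nodetool tablestats my_keyspace.my_table | max_tombstones_per_slice`.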
Check the size on disk of the SSTable files that hold the tombstones
Check the garbage collector logs for any errors
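Besides the JVM's own GC log, Cassandra reports collector pauses in its system log through `GCInspector`. The sketch below lists the most recent of those lines. The log path is an assumed default and the function name is illustrative.

```shell
# Show the most recent GC pause reports that Cassandra wrote via GCInspector.
# $1: log path (assumed default). $2: maximum lines to show (default 20).
gc_pause_lines() {
  log_file="${1:-/var/log/cassandra/system.log}"
  max_lines="${2:-20}"
  grep 'GCInspector' "$log_file" | tail -n "$max_lines"
}
```

Frequent or long pauses that coincide with tombstone warnings suggest that tombstone-heavy reads are putting pressure on the heap.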
Check Cassandra's configuration file for any misconfigurations
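As one concrete configuration check, the sketch below prints the tombstone threshold settings from `cassandra.yaml`, including commented-out lines that fall back to the defaults. The file path is an assumption, and the key names are those used by releases that define `tombstone_warn_threshold` and `tombstone_failure_threshold`; other versions may name them differently.

```shell
# Print tombstone threshold settings, whether active or commented out.
# $1: path to cassandra.yaml (the default is an assumed location).
show_tombstone_settings() {
  config_file="${1:-/etc/cassandra/cassandra.yaml}"
  grep -E '^[[:space:]]*#?[[:space:]]*tombstone_(warn|failure)_threshold' "$config_file"
}
```

Thresholds raised far above the defaults can hide a tombstone problem until reads become very slow.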
Check if any nodes in the Cassandra cluster are down
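`nodetool status` prefixes each node with a two-letter code whose first letter is `U` (up) or `D` (down). The sketch below filters that output for down nodes. It reads from stdin so it can also be run against saved output, and the function name is illustrative.

```shell
# Print the address of every node reported as Down in `nodetool status`
# output supplied on stdin. The second letter is the state:
# N(ormal), L(eaving), J(oining) or M(oving).
list_down_nodes() {
  awk '$1 ~ /^D[NLJM]$/ { print $2 }'
}
```

For example: `nodetool status | list_down_nodes`. No output means no node is reported down.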
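Misconfigured garbage collection: If garbage collection in Cassandra is misconfigured, tombstones may not be cleaned up effectively and can accumulate until they degrade performance and stability. Tombstones only become eligible for removal during compaction once the table's gc_grace_seconds has elapsed, so that setting is worth reviewing alongside the JVM collector settings.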
Repair
Set the path to the Cassandra configuration file
Set the name of the garbage collector to use
Set the options for the garbage collector
Backup the original configuration file
Modify the garbage collector settings in the configuration file
Restart the Cassandra service to apply the changes
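The steps above can be sketched as follows, assuming "garbage collector" means the JVM collector selected by `-XX:+Use...GC` flags in a JVM options file. The function name, the file path, and the choice of G1 in the usage line are illustrative assumptions, and the file name differs between Cassandra versions.

```shell
# Back up a JVM options file, comment out any active collector-selection
# flags, and append the requested one.
# $1: path to the JVM options file. $2: collector flag, e.g. -XX:+UseG1GC.
set_gc_flag() {
  config_file="$1"
  gc_flag="$2"
  # Keep a copy of the original so the change can be rolled back.
  cp -p "$config_file" "$config_file.bak" || return 1
  # Comment out lines that are exactly a -XX:+Use<Name>GC flag.
  sed -E 's/^(-XX:\+Use[A-Za-z0-9]+GC)$/#\1/' "$config_file.bak" > "$config_file"
  # Append the requested collector flag.
  printf '%s\n' "$gc_flag" >> "$config_file"
}
```

For example: `set_gc_flag /etc/cassandra/jvm.options -XX:+UseG1GC`, followed by a restart such as `sudo systemctl restart cassandra` if the service runs under systemd with that unit name. Collector-specific tuning flags are left untouched and need manual review.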