---
id: 29ec17ec-87a0-46ce-99e4-ffc8dabb3bdc
---

# Cassandra Tombstone Dump Incident
---

A Cassandra tombstone dump incident occurs when a table accumulates too many tombstones (the markers Cassandra writes in place of deleted data). Reads have to scan past these markers, so a large number of them slows queries, increases heap pressure, and causes reads to be aborted once `tombstone_failure_threshold` is exceeded. The incident needs prompt attention from an engineer because it can degrade the stability and availability of the whole system. Possible causes include an application that generates too many tombstones (for example through frequent deletes), compaction that is not purging expired tombstones, or a misconfigured JVM garbage collector that copes poorly with the extra heap load.

### Parameters
```shell
# Environment Variables
export KEYSPACE="PLACEHOLDER"
export TABLE="PLACEHOLDER"
export PARAMETER="PLACEHOLDER"
export GC_NAME="PLACEHOLDER"
export GC_OPTIONS="PLACEHOLDER"
export PATH_TO_CASSANDRA_HOME_DIRECTORY="PLACEHOLDER"
```

## Debug

### Check Cassandra's status
```shell
nodetool status
```

### Check for any errors in the Cassandra system log
```shell
sudo tail -f /var/log/cassandra/system.log | grep ERROR
```

### Check if any tombstone threshold has been exceeded
```shell
# Threshold warnings and aborted reads are written to the system log
sudo grep -i "tombstone" /var/log/cassandra/system.log | tail -n 20
```
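To judge how far over the thresholds a table is, it can help to pull the largest tombstone count out of the log. The following is a minimal sketch, assuming the log lines contain a number directly followed by the word "tombstone" (the exact wording differs between Cassandra versions); the function name `max_tombstones_in_log` is hypothetical.

```shell
# Print the largest tombstone count found in log lines read from stdin.
# Assumption: counts appear as "<N> tombstone..." somewhere in the line.
max_tombstones_in_log() {
  grep -oiE '[0-9]+ tombstone' \
    | awk '{ if ($1 > max) max = $1 } END { print max + 0 }'
}
```

For example, `sudo cat /var/log/cassandra/system.log | max_tombstones_in_log` prints a single number, or `0` when no tombstone messages are present. Compare it with `tombstone_warn_threshold` and `tombstone_failure_threshold` in `cassandra.yaml`.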

### Check the number of tombstones scanned per read (slice)
```shell
# cfstats is a deprecated alias of tablestats
nodetool tablestats ${KEYSPACE}.${TABLE} | grep -i "tombstones per slice"
```
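The reported maximum can be compared against a limit automatically. This is a sketch only, assuming the `nodetool tablestats` output contains a line labelled "Maximum tombstones per slice"; the function name `tombstone_slice_status` and the default limit of 1000 (the default `tombstone_warn_threshold`) are choices made for this example.

```shell
# Read `nodetool tablestats` output on stdin and print "OVER <n>" or "OK <n>"
# depending on whether the maximum tombstones per slice exceeds the limit.
tombstone_slice_status() {
  local limit="${1:-1000}"
  awk -F': *' -v limit="$limit" '
    /Maximum tombstones per slice/ {
      status = ($2 + 0 > limit) ? "OVER" : "OK"
      print status, $2 + 0
    }'
}
```

A possible invocation is `nodetool tablestats ${KEYSPACE}.${TABLE} | tombstone_slice_status 1000`.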

### Check the size of the table's SSTable data files on disk
```shell
# Table directories carry a UUID suffix, hence the trailing wildcard
sudo find /var/lib/cassandra/data/${KEYSPACE}/${TABLE}-* -name "*-Data.db" -exec ls -lh {} + | awk '{print $5, $9}'
```
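A single total is often easier to track over time than a file listing. The sketch below sums the data file sizes under a directory; the function name `sstable_bytes` is hypothetical, and in practice the function may need to run as a user that can read the Cassandra data directory.

```shell
# Print the total size in bytes of all *-Data.db files under directory $1.
sstable_bytes() {
  find "$1" -name '*-Data.db' -type f -exec wc -c {} + 2>/dev/null \
    | awk '$2 != "total" { sum += $1 } END { print sum + 0 }'
}
```

Calling it before and after a compaction, for example `sstable_bytes /var/lib/cassandra/data/${KEYSPACE}`, gives a rough view of how much space was reclaimed.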

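### Check the garbage collector logs for Full GC events
```shell
# GC logs rarely contain the word ERROR; long or Full GC pauses are the signal to look for
sudo grep -hiE "Full GC|Pause Full" /var/log/cassandra/gc.log* | tail -n 20
```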

### Check Cassandra's configuration file for any misconfigurations
```shell
grep -n "${PARAMETER}" /etc/cassandra/cassandra.yaml
```
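When only the value of one setting is needed, a small helper can extract it. This is a minimal sketch that handles simple top-level `key: value` lines only, not nested YAML; the function name `yaml_value` is hypothetical.

```shell
# Print the value of a top-level "key: value" setting from a YAML file.
# $1 = key, $2 = file. Trailing comments and whitespace are stripped.
yaml_value() {
  awk -F': *' -v key="$1" '
    $1 == key {
      sub(/[[:space:]]*#.*$/, "", $2)
      sub(/[[:space:]]+$/, "", $2)
      print $2
    }' "$2"
}
```

For instance, `yaml_value tombstone_warn_threshold /etc/cassandra/cassandra.yaml` prints the configured warning threshold if the setting is present and uncommented.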

### Check if any nodes in the Cassandra cluster are down
```shell
# Anchor the match so that only the status column is tested
nodetool status | grep '^DN'
```
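For alerting or scripting, a count is more convenient than a list. The sketch below assumes the usual `nodetool status` layout in which the first column holds a two-letter code (`U` or `D` for up or down, followed by the state letter); the function name `count_down_nodes` is hypothetical.

```shell
# Count nodes whose status column starts with D (down) in
# `nodetool status` output read from stdin.
count_down_nodes() {
  awk '$1 ~ /^D[NLJM]$/ { n++ } END { print n + 0 }'
}
```

Usage could look like `nodetool status | count_down_nodes`, which prints `0` when every node is up.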

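### Misconfigured garbage collector

Tombstones themselves are purged by compaction once `gc_grace_seconds` has elapsed, not by the JVM garbage collector. A misconfigured collector can still make the incident worse: tombstone-heavy reads allocate many short-lived objects, and the resulting long GC pauses hurt performance and stability. The script below reports the configured collector and the most recent Full GC event.

```shell
#!/bin/bash

# Cassandra home directory (see Parameters)
CASSANDRA_HOME="${PATH_TO_CASSANDRA_HOME_DIRECTORY}"

# Collect the active (uncommented) JVM settings; newer releases keep them in jvm*.options
jvm_settings=$(cat "$CASSANDRA_HOME"/conf/cassandra-env.sh "$CASSANDRA_HOME"/conf/jvm*.options 2>/dev/null | grep -v '^[[:space:]]*#')

# Check which garbage collector is enabled
gc_type=$(echo "$jvm_settings" | grep -oE -- '-XX:\+Use[A-Za-z0-9]+GC' | head -n 1)
if [[ "$gc_type" == "-XX:+UseG1GC" ]]; then
  echo "The garbage collector is set to G1GC, which is recommended for Cassandra."
elif [[ -n "$gc_type" ]]; then
  echo "The garbage collector is set to $gc_type, which may not be optimal for Cassandra."
else
  echo "No explicit garbage collector flag was found; the JVM default is in use."
fi

# Locate the GC log from the -Xloggc flag
gc_log=$(echo "$jvm_settings" | grep -oE -- '-Xloggc:[^ "]+' | head -n 1 | cut -d: -f2-)
# Fall back to the default path if the flag is missing or contains an unexpanded variable
[[ -f "$gc_log" ]] || gc_log="/var/log/cassandra/gc.log"

# Check the garbage collector log for Full GC events
if [[ -f "$gc_log" ]]; then
  last_gc=$(grep "Full GC" "$gc_log" | tail -n 1)
  if [[ -n "$last_gc" ]]; then
    echo "The garbage collector logged a Full GC event: $last_gc"
  else
    echo "No Full GC events were logged by the garbage collector."
  fi
else
  echo "The garbage collector log file could not be found."
fi
```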

## Repair

### Set the path to the Cassandra configuration file
```shell
CASSANDRA_CONF="PLACEHOLDER"
```

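### Set the name of the garbage collector to use
```shell
# Abort with a message if GC_NAME was not exported in Parameters
: "${GC_NAME:?GC_NAME must be set, for example G1GC}"
```

### Set the options for the garbage collector
```shell
# GC_OPTIONS may be empty; default it so later commands do not fail under 'set -u'
GC_OPTIONS="${GC_OPTIONS:-}"
```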

### Back up the original configuration file
```shell
cp -p "$CASSANDRA_CONF" "$CASSANDRA_CONF.orig"
```

### Modify the garbage collector settings in the configuration file
```shell
# '|' is used as the delimiter so that options containing '/' do not break the expression
sed -i "s|-XX:+UseG1GC|-XX:+Use${GC_NAME} ${GC_OPTIONS}|g" "$CASSANDRA_CONF"
```
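Before editing the file in place, it may be worth previewing the substitution. The following sketch prints a diff of what would change without touching the file; the function name `preview_gc_change` is hypothetical, and it assumes the options contain no `|` or `&` characters, which would be interpreted by `sed`.

```shell
# Show the lines the GC substitution would change, leaving the file untouched.
# $1 = config file, $2 = collector name (e.g. G1GC), $3 = extra options (optional)
preview_gc_change() {
  local conf="$1"
  local replacement="-XX:+Use$2${3:+ $3}"
  # diff exits non-zero when differences exist, which is expected here
  sed "s|-XX:+UseG1GC|${replacement}|g" "$conf" | diff "$conf" - || true
}
```

Running `preview_gc_change "$CASSANDRA_CONF" "$GC_NAME" "$GC_OPTIONS"` prints nothing when the file contains no `-XX:+UseG1GC` flag, which signals that the in-place edit would have no effect.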

### Restart the Cassandra service to apply the changes
```shell
# Restart one node at a time and wait for it to rejoin before moving on
sudo systemctl restart cassandra
```
