Cassandra instantaneous in place node replacement

At some point everyone using Cassandra has to replace a node, whether because the cluster needs to scale and some nodes are too small, because a node has failed, or because your virtualisation provider is going to remove it.

Whatever the reason, the result is the same: a new node has to be streamed in to hold exactly the data the old one had. Why wait for a whole streaming process, with the network and CPU overhead it requires, when we could just copy the data onto the new node and have it join the ring in place of the old one?

That’s what we at MyDrive have been doing for a while, and we want to share the exact process we follow with the community in case it helps someone.


The main idea behind this process is to have the replacement node up and running as quickly as possible by cutting out the slowest part of a normal replacement: streaming data.

The key points of the process are:

  1. Data will be copied from the old node to the new one using an external volume instead of transmitting it through the network.
  2. The new node will be given the exact same configuration as the replaced one. Therefore, the replacement node will be responsible for the same tokens as the replaced one, and will also have the same Host-ID, so, when it joins the ring, the other nodes won’t even notice the difference!
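Since the whole approach depends on the replacement node keeping the old node's Host ID, it can help to record that ID before you start so you can compare it afterwards. The following is a minimal sketch, assuming that `nodetool info` prints a line of the form `ID : <uuid>`; check the output on your Cassandra version. The helper name `host_id_from_info` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the Host ID from `nodetool info` output on stdin.
# Assumes a line such as "ID                     : <uuid>"; adjust the pattern
# if your Cassandra version formats it differently.
host_id_from_info() {
    awk -F':' '$1 ~ /^ID[ \t]*$/ { gsub(/[ \t\r]/, "", $2); print $2; exit }'
}
```

On the old node, before draining it, you would run something like `nodetool info | host_id_from_info` and keep the result to compare against `nodetool status` once the replacement is up.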

All our infrastructure is in AWS, so we used EBS volumes to back up and restore the Cassandra data. You may use whichever data transfer method suits your infrastructure better.
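For readers on AWS, moving an EBS volume between instances might look roughly like the sketch below, which uses the AWS CLI. Every ID, the availability zone, the size and the device name are placeholders, and the function names and the `DRY_RUN` switch are assumptions made for illustration rather than part of our tooling. Note that an EBS volume must be in the same availability zone as the instance it is attached to, and that a freshly created volume needs a filesystem before it can be mounted.

```shell
#!/bin/sh
# Sketch only: all arguments are placeholders you must supply.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a volume in the same availability zone as the nodes.
create_transfer_volume() {   # args: <availability-zone> <size-GiB>
    run aws ec2 create-volume --availability-zone "$1" --size "$2"
}

# Attach the volume to an instance under the given device name.
attach_transfer_volume() {   # args: <volume-id> <instance-id> <device>
    run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "$3"
}

# Detach the volume once it has been unmounted on the instance.
detach_transfer_volume() {   # args: <volume-id>
    run aws ec2 detach-volume --volume-id "$1"
}
```

With the default dry-run setting, `attach_transfer_volume vol-123 i-456 /dev/sdf` only prints the command it would run, which is a cheap way to review the sequence before touching real instances.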


  1. Set up the new node, paying special attention to the following configuration parameters:
    1. listen_address
    2. rpc_address
    3. seeds
  2. Create the external volume you’re going to use to transfer the data from the old node to the new one.
  3. Rsync the data and commitlog directories to the new node via the external volume (note the trailing slashes on the paths: without one on the source, rsync nests the source directory inside the target, e.g. /mnt/backup/data/data):
    1. Mount the external volume on the old node at /mnt/backup
    2. Copy the Cassandra data directory onto the volume: rsync -av --progress --delete /var/lib/cassandra/data/ /mnt/backup/data/
    3. Copy the Cassandra commitlog directory onto the volume: rsync -av --progress --delete /var/lib/cassandra/commitlog/ /mnt/backup/commitlog/
    4. Unmount and detach the volume, then attach and mount it on the replacement node.
    5. Copy the Cassandra data directory: rsync -av --progress --delete /mnt/backup/data/ /var/lib/cassandra/data/
    6. Copy the Cassandra commitlog: rsync -av --progress --delete /mnt/backup/commitlog/ /var/lib/cassandra/commitlog/
  4. Drain the old node: nodetool drain
  5. Stop Cassandra on the old node: sudo service cassandra stop
    1. Make sure it doesn’t accidentally come back (e.g. if you’re running chef, supervisor or any other tool that may restart it automatically). This is EXTREMELY important: if the replacement node tries to join the ring while the old one is alive, the new host will be assigned a new Host ID and the ring will be rebalanced as if we were adding a new node instead of replacing one.
  6. Do a final rsync to catch any last changes (repeat all of step 3).
  7. Ensure the Cassandra data and commitlog folders are owned by the cassandra user (rsync copies the owner’s UID along with the data, and that UID may not be the right one on the new machine).
  8. Start the new node: sudo service cassandra start
  9. Check that everything is working properly:
    1. In the replacement’s logs you should see a message like the following, indicating that the new node is replacing the old one: WARN <time> Not updating host ID <host ID> for /<replaced node IP address> because it's mine
    2. In the replacement’s logs you should also see one message like the following per token, indicating that the new node is becoming the primary owner of the replaced node’s tokens: INFO <time> Nodes /<old IP address> and /<new IP address> have the same token <a token>. Ignoring /<old IP address>
    3. In the other nodes’ logs you should see a message like the following, indicating that the other nodes acknowledge the change: Host ID collision for <Host ID> between /<replaced IP address> and /<replacement IP address>; /<replacement IP address> is the new owner
    4. nodetool status should show the new node’s IP owning the replaced node’s Host ID, and the old IP should no longer appear.
    5. Everything should look normal.
  10. Update the other nodes’ seeds lists if the replaced node was a seed.
  11. You can now safely destroy your old machine.
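The copy steps (3, 6 and 7) can be sketched as a small script. This is an illustration under stated assumptions rather than the exact script we run: the function names and the `DRY_RUN` switch are hypothetical, the paths are the ones used in the steps, and the `cassandra` group name is an assumption that holds for typical package installs. Both paths passed to rsync end in a slash so that the directory contents are mirrored instead of the source directory being nested inside the target.

```shell
#!/bin/sh
# Sketch of the copy steps. With DRY_RUN=1 (the default here) commands are
# printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
CASSANDRA_DIR="${CASSANDRA_DIR:-/var/lib/cassandra}"
BACKUP_DIR="${BACKUP_DIR:-/mnt/backup}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mirror one directory into another; exactly one trailing slash on each path.
sync_dir() {   # args: <source-dir> <target-dir>
    run rsync -av --progress --delete "${1%/}/" "${2%/}/"
}

# On the old node: copy data and commitlog onto the mounted volume.
backup_to_volume() {
    sync_dir "$CASSANDRA_DIR/data" "$BACKUP_DIR/data"
    sync_dir "$CASSANDRA_DIR/commitlog" "$BACKUP_DIR/commitlog"
}

# On the replacement node: copy from the volume, then fix ownership.
# The group name "cassandra" is an assumption; adjust it to your install.
restore_from_volume() {
    sync_dir "$BACKUP_DIR/data" "$CASSANDRA_DIR/data"
    sync_dir "$BACKUP_DIR/commitlog" "$CASSANDRA_DIR/commitlog"
    run chown -R cassandra:cassandra "$CASSANDRA_DIR/data" "$CASSANDRA_DIR/commitlog"
}
```

Running `backup_to_volume` on the old node and `restore_from_volume` on the replacement twice, once while the old node is still serving and once after it has been drained and stopped, keeps the second pass short because rsync only transfers what changed.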

And voilà! By following these steps carefully you will be able to replace nodes and have them running quickly, avoiding any token movement or streaming.
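If you replace nodes often, the log checks from step 9 can be scripted. The sketch below greps the replacement node's log for the two messages quoted in that step. The helper name `check_replacement_log` is hypothetical, and the log location is an assumption (/var/log/cassandra/system.log is common for package installs, but yours may differ).

```shell
#!/bin/sh
# Hypothetical helper: scan a Cassandra log file for the messages from step 9.
# The patterns are fragments of the messages quoted in the article.
check_replacement_log() {   # arg: <path-to-log-file>
    log="$1"
    if grep -q "because it's mine" "$log"; then
        echo "host ID reused: yes"
    else
        echo "host ID reused: no"
    fi
    # One "have the same token" line is expected per token.
    echo "token takeover lines: $(grep -c 'have the same token' "$log")"
}
```

For example, `check_replacement_log /var/log/cassandra/system.log` on the replacement node; with vnodes, the number of token takeover lines should match the number of tokens the old node owned.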

2 thoughts on “Cassandra instantaneous in place node replacement”

  1. Carlos, thanks for sharing your knowledge in this really good article!
    There is a slightly older version of this article published on the MyDrive Eng blog, with some differences in how token-related stuff is handled. Which of the two methods is better for a C* 2.1.x cluster with vnodes, and why was the method modified? To me the method described here looks better, but maybe it has some drawbacks compared to the other one?

    Also, as far as I understand, there is one important thing to mention: the old and new nodes must be located in the same rack [in C* terms] if a rack-aware snitch is used, otherwise the replica placement order will be violated.


    • Hi Kryrill!

      Glad this post was useful to you!! It describes a refined version of the procedure published on MyDrive Solutions’ blog. This one is simpler and achieves the same result, but both are accurate.

      About rack placement, you’re totally right. I just didn’t add it here or in MyDrive’s post because in both situations we were using Ec2Snitch, and as long as the instances are in the same AZ they fall in the same rack.

