
Bitnami Cassandra for Microsoft Azure Multi-Tier Solutions

Description

Apache Cassandra is an open source distributed database management system designed to handle large amounts of data across many servers, providing high availability with no single point of failure.

What are the differences between a Bitnami Single-Tier Solution and Multi-Tier Solution?

Single-tier architecture implies that all the required components of an application run on a single server. If your environment is growing and becoming more complex, a single layer architecture will not meet your scalability requirements. Single-Tier Solutions are great for departmental applications, smaller production environments, new users, or those applications that don't support multi-tier architectures.

The typical architecture of a Bitnami Single-Tier Solution looks like this:

Single-tier architecture

Multi-tier architecture involves more than one server and infrastructure resource. For example, the Front End-Database topology separates the application server from the database server. This allows you to extend workloads in the cloud and tailor your application to meet specific scalability and reliability goals. Multi-Tier Solutions provide more sophisticated deployment topologies for improved scalability and reliability for larger production or mission critical environments.

TIP: Not sure if you have chosen the right solution? Check out the Bitnami Multi-Tier solutions features and benefits to learn more about the benefits of Multi-Tier.

This Bitnami Multi-Tier Solution stores and replicates data across a configurable number of Cassandra nodes with full replication support. This topology is illustrated below:

Multi-tier architecture

First steps with the Bitnami Cassandra Stack

Welcome to your new Bitnami application running on Microsoft Azure Multi-Tier Solutions! Here are a few questions (and answers!) you might need when first starting with your application.

What credentials do I need?

You need two sets of credentials:

  • The application credentials that allow you to log in to your new Bitnami application. These credentials consist of a username and password.
  • The server credentials that allow you to log in to your Microsoft Azure Multi-Tier Solutions server using an SSH client and execute commands on the server using the command line. These credentials consist of an SSH username and key.

What is the administrator username set for me to log in to the application for the first time?

Username: cassandra

What SSH username should I use for secure shell access to my application?

SSH username: bitnami

How to start or stop the services?

NOTE: The steps below require you to execute the commands on the remote server. Please check our FAQ for instructions on how to connect to your server through SSH.

Each Bitnami server includes a control script that lets you easily stop, start and restart all the services installed on the current individual server.

Obtain the status of a service with the service bitnami status command:

$ sudo service bitnami status

Use the service bitnami command to start, stop or restart all the services in a similar manner:

  • Start all the services.

    $ sudo service bitnami start
    
  • Stop all the services.

    $ sudo service bitnami stop
    
  • Restart all the services.

    $ sudo service bitnami restart
    
TIP: To start, restart or stop each server of the cluster individually, check the FAQ section about how to start or stop servers in a Multi-Tier Solution.
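Since the control script only manages the server you are logged in to, restarting the whole cluster means repeating the command on every node. A minimal sketch, assuming a NODE_IPS list that you fill in with your own nodes' private IP addresses (the addresses below are illustrative); the DRY_RUN guard prints the commands instead of executing them:

```shell
# Dry-run sketch: restart the Bitnami services on every node over SSH.
# NODE_IPS holds illustrative private IPs -- replace them with yours.
# Set DRY_RUN=false to actually run the commands from the primary node.
NODE_IPS="10.0.6.66 10.0.4.215 10.0.7.42"
DRY_RUN=true

for ip in $NODE_IPS; do
  cmd="ssh bitnami@$ip sudo service bitnami restart"
  if [ "$DRY_RUN" = true ]; then
    echo "$cmd"   # print what would run
  else
    $cmd          # restart that node's services
  fi
done
```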

What is the default configuration?

Cassandra stores its user accounts and their access privileges in the system_auth keyspace. The default configuration consists of one privileged account with the username cassandra. It has the following privileges:

  • Has all privileges including remote access to the database.
  • Can create new users and assign them different roles.

To check the list of users and their privileges, execute the following command in the database:

cassandra@cqlsh> LIST USERS;

 name      | super
-----------+-------
 cassandra |  True

Cassandra version

In order to see which Cassandra version your system is running, execute the following command:

cqlsh --version

Cassandra configuration file

The Cassandra configuration file is located at /opt/bitnami/cassandra/conf/cassandra.yaml.

The Cassandra official documentation has more details about how to configure the Cassandra database.

Cassandra ports

The default ports for Cassandra are:

  • Client port: 9042.
  • Inter-node communication port: 7000.
  • JMX port: 7199.

Cassandra Process Identification Number

The Cassandra .pid file allows other programs to find out the PID (Process Identification Number) of a running script. Find it at /opt/bitnami/cassandra/tmp/cassandra.pid.
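You can use the .pid file to check whether the recorded process is still alive. A small sketch, demonstrated here against this shell's own PID so it is self-contained; on the server, point PID_FILE at /opt/bitnami/cassandra/tmp/cassandra.pid instead:

```shell
# Check whether the process recorded in a .pid file is still running.
# For illustration we write this shell's own PID; on the server, use
# PID_FILE=/opt/bitnami/cassandra/tmp/cassandra.pid instead.
PID_FILE=$(mktemp)
echo $$ > "$PID_FILE"

if [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null; then
  STATUS="running (PID $(cat "$PID_FILE"))"
else
  STATUS="not running"
fi
echo "Cassandra is $STATUS"
rm -f "$PID_FILE"
```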

Cassandra log file

The Cassandra .log file contains a record with all the events that occur while the database is running in the server. Find it at /opt/bitnami/cassandra/logs/cassandra.log.

Cassandra debug log file

The Cassandra debug.log file contains status and error messages useful for debugging the server. Find it at /opt/bitnami/cassandra/logs/debug.log.
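To spot problems quickly, you can filter the log files for WARN and ERROR entries. A sketch against an inline sample log so that it is self-contained; on the server, run the same grep against /opt/bitnami/cassandra/logs/cassandra.log:

```shell
# Count WARN/ERROR lines in a Cassandra log. The sample below is inline
# so the snippet is self-contained; on a real server run instead:
#   grep -cE 'WARN|ERROR' /opt/bitnami/cassandra/logs/cassandra.log
SAMPLE_LOG=$(mktemp)
cat > "$SAMPLE_LOG" <<'EOF'
INFO  [main] 2024-01-01 10:00:00,000 Startup complete
WARN  [main] 2024-01-01 10:00:05,000 Heap size is below the recommended minimum
ERROR [ReadStage-1] 2024-01-01 10:01:00,000 Unexpected exception during read
EOF
ISSUES=$(grep -cE 'WARN|ERROR' "$SAMPLE_LOG")
echo "Found $ISSUES warning/error lines"
rm -f "$SAMPLE_LOG"
```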

How is the cluster configured?

The Bitnami Multi-Tier Solution for Cassandra uses multiple VMs in a ring topology.

It has a masterless architecture, which means that any node can accept any request (read, write, or delete) and route it to the correct node, even if the data is not stored on that node.

The cluster uses GossipingPropertyFileSnitch as its snitch, which tells Cassandra what its network topology looks like.
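The snitch is set in cassandra.yaml. The relevant fragment looks roughly like this:

```yaml
# /opt/bitnami/cassandra/conf/cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch
```

With this snitch, each node advertises the datacenter and rack defined in its cassandra-rackdc.properties file (for example, dc=datacenter1 and rack=rack1; these values are illustrative and should match your topology).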

The Cassandra server is configured to listen for connections from any IP address (0.0.0.0).

How to check cluster replication status?

You can obtain information about the cluster, such as the state and load of each node and its host ID. Connect to one of the nodes of your cluster and run the following command:

$ nodetool status

This is an example of the output from running the nodetool status command:

datacenter: datacenter1
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.6.66   335.61 KiB  256          39.6%            73d7741f-40a8-4f8f-8f49-073ce37e2c23  rack1
UN  10.0.4.215  199.95 KiB  256          41.2%            0fa77708-f5a7-4160-93ac-09944fd4c66c  rack1
UN  10.0.7.42   227.53 KiB  256          41.7%            cd6c98b5-1551-4dff-8fa6-feeb11da32ed  rack1
  • UN= Up and Normal.
  • UJ= Up and Joining.
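For scripted health checks, this output can be parsed with awk. A sketch with sample output inlined so that it is self-contained (one node is marked DN, down, for illustration); on a node, pipe `nodetool status` into the same awk program instead:

```shell
# Count cluster nodes that are NOT in the "UN" (Up/Normal) state.
# Sample nodetool output is inlined; on a node, run instead:
#   nodetool status | awk '...'
DOWN=$(awk '/^[UD][NLJM] / && $1 != "UN" { n++ } END { print n+0 }' <<'EOF'
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.6.66   335.61 KiB  256     39.6%             73d7741f-40a8-4f8f-8f49-073ce37e2c23  rack1
UN  10.0.4.215  199.95 KiB  256     41.2%             0fa77708-f5a7-4160-93ac-09944fd4c66c  rack1
DN  10.0.7.42   227.53 KiB  256     41.7%             cd6c98b5-1551-4dff-8fa6-feeb11da32ed  rack1
EOF
)
echo "Nodes not Up/Normal: $DOWN"
```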

The replication is automatic between nodes. Check the Cassandra official documentation to know more about replication.

Users can define how many replicas are needed, and Cassandra handles replica creation and management transparently. To understand how it works, consider the example below of a two-node cluster (pick a single node in the cluster on which to perform the initial configuration):

  • Node 1, creating the data:

    • Create a keyspace and verify that it has been added, then change to that keyspace:

      CREATE KEYSPACE bntest WITH REPLICATION = {
       'class': 'SimpleStrategy',
       'replication_factor' : 2
      };
      DESCRIBE KEYSPACES;
      USE bntest;
      
      NOTE: In this example we set replication_factor to 2 so that the data is replicated to 2 nodes.
    • Create a table and check that it has been created correctly:

      CREATE TABLE bntest.bntable (
         key text PRIMARY KEY,
         values set<text>);
      DESCRIBE TABLES;
      
    • Insert some sample data in the table and check it:

      INSERT INTO bntest.bntable (key, values) VALUES ('foo', {'bar'});
      SELECT key, values FROM bntest.bntable WHERE key = 'foo';
      
  • Node 2, checking the replication:

    • Execute the following commands to check that you obtain an identical output as in node 1:

      DESCRIBE KEYSPACES;
      USE bntest;
      DESCRIBE TABLES;
      SELECT key, values FROM bntest.bntable WHERE key = 'foo';
      
    • Delete the data in node 1 and verify that it was removed:

      USE bntest;
      DELETE FROM bntest.bntable WHERE key = 'foo';
      SELECT key, values FROM bntest.bntable WHERE key = 'foo';
      DROP TABLE bntest.bntable;
      DESCRIBE TABLES;
      DROP KEYSPACE bntest;
      DESCRIBE KEYSPACES;
      
    • Check that the data has been also deleted in node 2:

      USE bntest;
      SELECT key, values FROM bntest.bntable WHERE key = 'foo';
      DESCRIBE TABLES;
      DESCRIBE KEYSPACES;
      
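The walkthrough above can be wrapped in a small script that generates the CQL for any keyspace name and replication factor. A dry-run sketch (the KEYSPACE and RF values are illustrative; the generated statements are only printed, and on a node you would pipe them into cqlsh):

```shell
# Generate the keyspace/table/insert statements from the walkthrough
# for a given keyspace name and replication factor. Dry run: the CQL
# is printed rather than executed.
KEYSPACE=bntest
RF=2
CQL=$(cat <<EOF
CREATE KEYSPACE ${KEYSPACE} WITH REPLICATION = {
 'class': 'SimpleStrategy',
 'replication_factor' : ${RF}
};
CREATE TABLE ${KEYSPACE}.bntable (
   key text PRIMARY KEY,
   values set<text>);
INSERT INTO ${KEYSPACE}.bntable (key, values) VALUES ('foo', {'bar'});
EOF
)
echo "$CQL"
# To execute on a node for real:  echo "$CQL" | cqlsh -u USER -p PASSWORD
```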

How to connect to cluster nodes?

Some operations, such as changing the application password, require repeating the same actions on each cluster node. In those cases, you need to connect to every node for the changes to take effect across the whole cluster. Follow the steps below to connect to the cluster nodes in your Azure deployment:

  • Log in to the Microsoft Azure portal.
  • Navigate to the "Virtual Machines" section and find your deployment.
  • Select the primary node from the virtual machines list. Its name usually ends with the number 0:

    Select the primary node

  • In the resulting screen, click "Connect". It displays the command to connect through SSH to the selected node:

    Connect through SSH to the primary node

  • Open a new terminal window on your local system and paste the command shown above. You will be prompted to enter your password. After this, you should be connected to the primary node as shown below:

    Connect through SSH to the primary node

Once you have connected to the primary node, you can connect to the rest of the nodes by establishing an SSH connection to each node's IP address, as follows:

  • To find the private IP address of a node, select it in the list of virtual machines and click "network/default-subnet".

    Find node private IP address

  • Copy the IP address of the node you want to connect to.

    Copy node IP address

  • In the terminal window, execute the following command (within the primary node). Remember to replace the NODE_IP_ADDRESS placeholder with the correct value:

    $ ssh bitnami@NODE_IP_ADDRESS
    

    Connect to a secondary node

    NOTE: Remember to repeat the same operation to connect to each cluster node.
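Instead of two interactive hops, OpenSSH's -J (ProxyJump) option, available in OpenSSH 7.3 and later, can reach a private node through the primary node in one command. A sketch that only builds and prints the command; PRIMARY-NODE-ADDRESS is a placeholder and the node IP is illustrative:

```shell
# Build an SSH command that reaches a private node through the primary
# node in one hop, using OpenSSH's ProxyJump (-J) option.
PRIMARY_HOST="PRIMARY-NODE-ADDRESS"   # placeholder: the primary node's public address
NODE_IP="10.0.4.215"                  # illustrative private IP of a secondary node
SSH_CMD="ssh -J bitnami@$PRIMARY_HOST bitnami@$NODE_IP"
echo "$SSH_CMD"   # run this from your local machine
```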

How to add nodes to the cluster?

IMPORTANT: These steps assume that you have already installed the Microsoft Azure command-line client (Microsoft Azure CLI) on your system and you are signed in to Microsoft Azure through it. If this is not the case, please refer to the FAQ for instructions on how to install and sign in to Microsoft Azure using the Azure CLI.
NOTE: To follow the steps below, you will need the subscription ID, deployment ID and resource group ID for the deployment to which you wish to add nodes. Find out how to obtain the subscription ID and the deployment and resource group IDs.

To add more Cassandra nodes, follow these steps:

  • Set the subscription ID for your deployment in the Azure CLI with the command below. Replace the SUBSCRIPTION-ID placeholder with the correct value.

      $ az account set --subscription SUBSCRIPTION-ID
    
  • Download the deployment template associated with your deployment using the command below. Replace the DEPLOYMENT-ID and RESOURCE-GROUP-ID placeholders with the correct values.

      $ az group deployment export --name DEPLOYMENT-ID --resource-group RESOURCE-GROUP-ID > template.json
    
  • Download the parameters file associated with your deployment using the command below. Replace the DEPLOYMENT-ID and RESOURCE-GROUP-ID placeholders with the correct values.

      $ az group deployment show --name DEPLOYMENT-ID --resource-group RESOURCE-GROUP-ID --query "properties.parameters" | sed '/"type":/d' > parameters.json
    
  • Redeploy the solution with the additional node(s) using the command below. Replace the DEPLOYMENT-ID and RESOURCE-GROUP-ID placeholders with the correct values, the ADMIN-PASSWORD and APP-PASSWORD placeholders with the same database administration password and application password used when initially deploying the application, and the NUMBER-OF-NODES placeholder with the final number of nodes you wish to have in the cluster.

      $ az group deployment create --name DEPLOYMENT-ID --resource-group RESOURCE-GROUP-ID --template-file template.json --parameters @parameters.json adminPassword=ADMIN-PASSWORD appPassword=APP-PASSWORD '{"nodeCount": {"value": NUMBER-OF-NODES}}'
    

Verify that the new node(s) have been added successfully by logging in to the Azure portal and selecting the resource group and deployment to check the number of running nodes. Once you have confirmed that the new node(s) have been added successfully, log in to the primary node and verify that the new node(s) are now part of the cluster by following these instructions.
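The four az steps above can be collected into one script. A dry-run sketch that only assembles and prints the commands (all-caps values are placeholders for your own deployment; the node count of 5 is illustrative):

```shell
# Print the Azure CLI commands needed to scale the cluster to a new
# node count. All-caps values are placeholders -- fill them in and run
# the printed commands to execute for real.
SUBSCRIPTION_ID="SUBSCRIPTION-ID"
DEPLOYMENT_ID="DEPLOYMENT-ID"
RESOURCE_GROUP="RESOURCE-GROUP-ID"
NODE_COUNT=5

CMDS=$(cat <<EOF
az account set --subscription ${SUBSCRIPTION_ID}
az group deployment export --name ${DEPLOYMENT_ID} --resource-group ${RESOURCE_GROUP} > template.json
az group deployment show --name ${DEPLOYMENT_ID} --resource-group ${RESOURCE_GROUP} --query "properties.parameters" | sed '/"type":/d' > parameters.json
az group deployment create --name ${DEPLOYMENT_ID} --resource-group ${RESOURCE_GROUP} --template-file template.json --parameters @parameters.json adminPassword=ADMIN-PASSWORD appPassword=APP-PASSWORD '{"nodeCount": {"value": ${NODE_COUNT}}}'
EOF
)
echo "$CMDS"
```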

How to create a Virtual Network peering?

To connect two instances internally, you can enable a Virtual Network (VNet) peering from the Azure Portal. Depending on whether the instances were launched in the same or in different resource groups, there are two methods for establishing an internal connection: sharing a virtual network or enabling a virtual network peering.

How to connect to Cassandra from a different machine?

IMPORTANT: By default, the database port for the nodes in this solution cannot be accessed over a public IP address. As a result, you will only be able to connect to your database nodes from machines that are running in the same network. For security reasons, we do not recommend making the database port accessible over a public IP address. If you must make it accessible over a public IP address, we recommend restricting access to a trusted list of source IP addresses using firewall rules. Refer to the FAQ for information on opening ports in the server firewall.

Connecting to Cassandra from the same network

Run the following command to connect to Cassandra from a different machine on the same network. Remember that IP_SERVER, USER and PASSWORD are placeholders; replace them with the correct values:

$ cqlsh IP_SERVER -u USER -p PASSWORD

Connecting to Cassandra from a different network

If you must connect to the application from a machine that is not running in the same network as the Cassandra cluster, you can follow one of the approaches below (shown in order of preference, from most to least recommended):

NOTE: You should only access using an SSH tunnel if you wish to temporarily connect to, or use, the Cassandra console. This approach is not recommended to permanently connect your application to the Cassandra cluster, as a connectivity failure in the SSH tunnel would affect your application's functionality.
  • Option 3: Make the server publicly accessible and restrict access to a trusted list of source IP addresses using firewall rules. Refer to the FAQ for information on opening ports in the server firewall. Once you have opened the port, execute the following command to access the Cassandra server:

     $ cqlsh IP_SERVER PORT -u USER -p PASSWORD
    

Learn more about Cassandra ports in the Cassandra official documentation.
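For the temporary SSH-tunnel approach mentioned in the note above, you forward a local port to the client port (9042) on a cluster node and then point cqlsh at localhost. A sketch that only builds and prints the two commands (SERVER-IP, USER and PASSWORD are placeholders):

```shell
# Build the command for a temporary SSH tunnel that forwards local
# port 9042 to the Cassandra client port on a cluster node, plus the
# cqlsh command to use through it. SERVER-IP is a placeholder.
SERVER_IP="SERVER-IP"
TUNNEL_CMD="ssh -N -L 9042:127.0.0.1:9042 bitnami@$SERVER_IP"
CQLSH_CMD="cqlsh 127.0.0.1 9042 -u USER -p PASSWORD"
echo "$TUNNEL_CMD"   # keep this running in one terminal
echo "$CQLSH_CMD"    # then connect from another terminal
```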


How to find the database credentials?

How to connect to the Cassandra database?

You can connect to the Cassandra database from the same machine where the cluster is installed. Remember that USERNAME and PASSWORD are placeholders. Replace these values with the credentials you obtained during the server deployment process.

$ cqlsh -u USERNAME -p PASSWORD

Find out how to obtain application credentials.

How to check the Cassandra server version?

Use the commands below:

$ cqlsh -u cassandra
Password:
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.

cassandra@cqlsh> show version
[cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3]

How to change the Cassandra root password?

You can modify the Cassandra password using the following command at the shell prompt:

$ cqlsh -u cassandra -p USERPASSWORD
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.

cqlsh> ALTER USER cassandra WITH PASSWORD 'NEWPASSWORD';
cqlsh> exit

Remember to replace USERPASSWORD in the previous commands with your current password and NEWPASSWORD with the new password.

How to reset the Cassandra root password?

If you don't remember your Cassandra root password, you can follow the steps below to reset it to a new value:

  • Edit the /opt/bitnami/cassandra/conf/cassandra.yaml file and replace the following lines:

    authenticator: PasswordAuthenticator
    authorizer: CassandraAuthorizer
    

    with:

      authenticator: AllowAllAuthenticator
      authorizer: AllowAllAuthorizer
    
  • Restart your database:

      $ sudo service bitnami restart
    
  • Execute the following commands:

    $ cqlsh
    Connected to Test Cluster at 127.0.0.1:9042.
    [cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
    Use HELP for help.
    
    cqlsh> UPDATE system_auth.roles SET salted_hash = '$2a$10$1gMPBy9zSkDzKxdbU2v/gOslcMRPDcXVqmwQYBmi8MVgYvNdRZw/.' WHERE role = 'cassandra';
    cqlsh> exit
    
NOTE: '$2a$10$1gMPBy9zSkDzKxdbU2v/gOslcMRPDcXVqmwQYBmi8MVgYvNdRZw/.' is the bcrypt salted hash of the string cassandra.
  • Re-enable the authentication. Undo the changes made in the /opt/bitnami/cassandra/conf/cassandra.yaml file. Replace the following lines:

    authenticator: AllowAllAuthenticator
    authorizer: AllowAllAuthorizer
    

    with:

      authenticator: PasswordAuthenticator
      authorizer: CassandraAuthorizer
    
  • Now you can access your database using the username cassandra and password cassandra:

      $ cqlsh -u cassandra -p cassandra
    
NOTE: Don't forget to change the cassandra user account password. This is the default password and it is insecure.
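The authenticator/authorizer swap in the reset steps can be scripted with sed. A sketch run here against an inline copy of the two relevant lines so that it is self-contained; on the server, target /opt/bitnami/cassandra/conf/cassandra.yaml instead (and keep a backup first, which the -i.bak suffix provides):

```shell
# Swap PasswordAuthenticator/CassandraAuthorizer for the "AllowAll"
# variants, as in the reset procedure. Demonstrated on a temp copy;
# on the server, target /opt/bitnami/cassandra/conf/cassandra.yaml.
CONF=$(mktemp)
printf 'authenticator: PasswordAuthenticator\nauthorizer: CassandraAuthorizer\n' > "$CONF"

sed -i.bak \
  -e 's/^authenticator: PasswordAuthenticator/authenticator: AllowAllAuthenticator/' \
  -e 's/^authorizer: CassandraAuthorizer/authorizer: AllowAllAuthorizer/' \
  "$CONF"

RESULT=$(cat "$CONF")
echo "$RESULT"
rm -f "$CONF" "$CONF.bak"
```

Reverting is the same two substitutions in the opposite direction.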

How to create a new user in Cassandra?

The Cassandra superuser is able to create new users and roles. To create a new user with an assigned role, you need to use the CREATE USER statement:

CREATE USER IF NOT EXISTS user_name WITH PASSWORD 'password'
  ( NOSUPERUSER | SUPERUSER )

These are some examples of how to create new users with authentication and authorization information:

CREATE USER mariah WITH PASSWORD 'password1' SUPERUSER;
CREATE USER john WITH PASSWORD 'password2' NOSUPERUSER;

To check the list of users and their privileges, execute the following command:

cassandra@cqlsh> LIST USERS;

 name      | super
-----------+-------
 cassandra |  True
 mariah    |  True
 john      | False

This statement is equivalent to:

cassandra@cqlsh> LIST ROLES;

The Cassandra official documentation has more details about how to create users and roles.

How to create a full backup of Cassandra?

How to create a database snapshot?

To back up the database, create a snapshot using the nodetool command. These steps show you how to create a snapshot. Follow these instructions if you want to enable incremental backups in Cassandra.

NOTE: This tutorial assumes that you have created at least one keyspace in your database.
  • Check the status of the keyspace. Remember to replace KEYSPACE with the right value.

    $ nodetool status KEYSPACE
    
    datacenter: datacenter1
    ========================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
    UN  10.0.6.66   335.61 KiB  256          39.6%             73d7741f-40a8-4f8f-8f49-073ce37e2c23  rack1
    UN  10.0.4.215  199.95 KiB  256          41.2%             0fa77708-f5a7-4160-93ac-09944fd4c66c  rack1
    UN  10.0.7.42   227.53 KiB  256          41.7%             cd6c98b5-1551-4dff-8fa6-feeb11da32ed  rack1
    
  • Execute the command below to create the snapshot:

    $ nodetool snapshot KEYSPACE
    

    This operation could take some time depending on the database size. You should get an output similar to this:

    Requested creating snapshot(s) for [KEYSPACE] with snapshot name [1483626087852] and options {skipFlush=false}
    Snapshot directory: 1483626087852
    
  • This creates one snapshot per table in your keyspace. Find each snapshot at /opt/bitnami/cassandra/data/data/KEYSPACE/TABLENAME-UUID/snapshots/SNAPSHOTNAME (the UUID is randomly generated and depends on your installation).
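Since the UUID in the path varies per installation, find is the easiest way to locate snapshot directories. A sketch demonstrated against a mock copy of the data layout so that it is self-contained (the keyspace, table and UUID names are illustrative); on the server, run the same find against /opt/bitnami/cassandra/data/data:

```shell
# Locate snapshot directories without knowing the per-table UUID.
# A mock data layout is created for illustration; on the server, run:
#   find /opt/bitnami/cassandra/data/data -mindepth 4 -maxdepth 4 -type d -path '*/snapshots/*'
DATA_DIR=$(mktemp -d)
mkdir -p "$DATA_DIR/bntest/bntable-1234abcd/snapshots/1483626087852"

SNAPSHOTS=$(find "$DATA_DIR" -mindepth 4 -maxdepth 4 -type d -path '*/snapshots/*')
echo "$SNAPSHOTS"
rm -rf "$DATA_DIR"
```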

How to restore a database backup from a snapshot?

To restore the database from a snapshot, follow the steps below.

NOTE: Remember that you can find the snapshot files at /opt/bitnami/cassandra/data/data/KEYSPACE/TABLENAME-UUID/snapshots/SNAPSHOTNAME (the UUID is randomly generated and depends on your installation).
  • Drain the node (this is especially important if only a single table is restored). Run the following command:

      $ nodetool drain
    
  • Shut down the node:

      $ sudo service bitnami stop
    
  • Clear all files from the /opt/bitnami/cassandra/data/commitlog directory:

      $ sudo rm /opt/bitnami/cassandra/data/commitlog/*
    
  • Delete all .db files from the /opt/bitnami/cassandra/data/data/system/local-UUID/ directory (the UUID is randomly generated and depends on your installation):

    NOTE: Don't delete the backup and snapshots subdirectories (the -maxdepth 1 option below keeps them untouched).

      $ sudo find /opt/bitnami/cassandra/data/data/system/local-UUID/ -maxdepth 1 -name "*.db" -type f -delete
    
  • Copy the content of the last snapshot folder into this directory /opt/bitnami/cassandra/data/data/KEYSPACE/TABLENAME-UUID/:

      $ sudo cp -r /opt/bitnami/cassandra/data/data/KEYSPACE/TABLENAME-UUID/snapshots/SNAPSHOTNAME/* /opt/bitnami/cassandra/data/data/KEYSPACE/TABLENAME-UUID/
    

    Remember to replace KEYSPACE in the previous commands with the name of your keyspace and TABLENAME with the name of the table you have created. The UUID and the SNAPSHOTNAME are randomly generated. They depend on your installation.

  • Restart the node:

    $ sudo service bitnami restart
    
  • Check the status of the node:

    $ nodetool status KEYSPACE
    
  • Run nodetool repair to repair one or more tables. (To repair more than one table, repeat the steps above for each table. If no tables are listed, the tool operates on all tables.)

    $ nodetool repair KEYSPACE
    

The command should show output like this:

Repair completed successfully
Repair command #1 finished in 2 seconds

Bitnami Documentation