
Bitnami TensorFlow Serving for Google Cloud Platform

Description

TensorFlow Serving is a system for serving machine learning models. This stack ships the Inception v3 model with pre-trained data for image recognition, but it can be extended to serve other models.

What is TensorFlow?

According to the TensorFlow web site, "TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well".

For more information about TensorFlow, check the official TensorFlow web site.

What is TensorFlow Serving?

TensorFlow Serving is "a flexible, high-performance serving system for machine learning models, designed for production environments. TensorFlow Serving makes it easy to deploy new algorithms and experiments, while keeping the same server architecture and APIs. TensorFlow Serving provides out-of-the-box integration with TensorFlow models, but can be easily extended to serve other types of models and data". (credit: TensorFlow Serving web site).

For more information about TensorFlow Serving, check the official site.

What is the Inception model?

The Inception model is one of the available TensorFlow models for image recognition. For more information check this Image Recognition tutorial.

First steps with the Bitnami TensorFlow Serving Stack

Welcome to your new Bitnami application running on Google Cloud Platform! Here are a few questions (and answers!) you might need when first starting with your application.

What credentials do I need?

You need a set of credentials: the server credentials that allow you to log in to your Google Cloud Platform server using an SSH client and execute commands on the server using the command line. These credentials consist of an SSH username and key.

What SSH username should I use for secure shell access to my application?

SSH username: bitnami

How to start or stop the services?

Each Bitnami stack includes a control script that lets you easily stop, start and restart services. The script is located at /opt/bitnami/ctlscript.sh. Call it without any service name arguments to start all services:

$ sudo /opt/bitnami/ctlscript.sh start

Or use it to restart a single service, such as TensorFlow Serving only, by passing the service name as argument:

$ sudo /opt/bitnami/ctlscript.sh restart tensorflowserving

Use this script to stop all services:

$ sudo /opt/bitnami/ctlscript.sh stop

Restart all services by running the script with the restart action and no service name arguments:

$ sudo /opt/bitnami/ctlscript.sh restart

Obtain a list of available services and operations by running the script without any arguments:

$ sudo /opt/bitnami/ctlscript.sh
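
For example, to check which services are currently running, you can use the status action (available in most Bitnami control scripts):

$ sudo /opt/bitnami/ctlscript.sh status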

What utilities are included in TensorFlow Serving?

The stack bundles the tensorflow_model_server binary and the inception_client example client (both available in the bitnami user's environment), as well as the pre-compiled imagenet_train and imagenet_eval utilities located at /opt/bitnami/tensorflow-serving/bin. Bazel and the TensorFlow Python library are also included for compiling additional clients and training models (see the sections below).

What are the default ports?

A port is an endpoint of communication in an operating system that identifies a specific process or a type of service. Bitnami stacks include several services or servers that require a port.

IMPORTANT: Making this application's network ports public is a significant security risk. You are strongly advised to only allow access to those ports from trusted networks. If, for development purposes, you need to access from outside of a trusted network, please do not allow access to those ports via a public IP address. Instead, use a secure channel such as a VPN or an SSH tunnel. Follow these instructions to remotely connect safely and reliably.

Port 22 is the default port for SSH connections. Port 9000 is the default TensorFlow Serving port; for security reasons, it is not accessible over a public IP address by default (see "How to connect to TensorFlow Serving from a different machine?" below).
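
As a quick sanity check, you can list the ports listening on the server from an SSH session; a minimal sketch using the ss utility, which ships with most modern Linux distributions:

$ sudo ss -tlnp | grep -E ':(22|9000)'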

How to upload files to the server with SFTP?

NOTE: Bitnami applications can be found in /opt/bitnami/apps.
  • If you are using the Bitnami Launchpad for Google Cloud Platform, obtain your server SSH key by following these steps:

    • Browse to the Bitnami Launchpad for Google Cloud Platform dashboard and sign in if required using your Bitnami account.
    • Select the "Virtual Machines" menu item.
    • Select your cloud server from the resulting list.
    • Download the SSH key for your server in PPK or PEM format. Note the server IP address on the same page.

    Server information

  • Alternatively, generate a new SSH key pair to use with your server. Replace USERNAME in the commands below with your Google Cloud Platform username:

      $ sudo su USERNAME
      $ ssh-keygen -t rsa -f ~/.ssh/my-ssh-key -C USERNAME

    Enter the passphrase twice. The SSH key pair will be generated and saved in /home/USERNAME/.ssh/my-ssh-key and /home/USERNAME/.ssh/my-ssh-key.pub.

Although you can use any SFTP/SCP client to transfer files to your server, this guide documents FileZilla (Windows, Linux and Mac OS X), WinSCP (Windows) and Cyberduck (Mac OS X).
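
If you prefer the command line, a plain scp transfer also works; a minimal sketch, assuming your private key is saved at ~/.ssh/my-ssh-key and SERVER-IP is your server's IP address:

$ scp -i ~/.ssh/my-ssh-key /path/to/local/file bitnami@SERVER-IP:/home/bitnami/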

Using an SSH Key

Once you have your server's SSH key, choose your preferred application and follow the steps below to connect to the server using SFTP.

FileZilla
IMPORTANT: To use FileZilla, your server private key should be in PPK format.
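
If you only have the key in PEM format, you can convert it to PPK with the puttygen utility (part of the PuTTY tools); a minimal sketch, assuming the downloaded key is named bitnami-gcp.pem:

$ puttygen bitnami-gcp.pem -o bitnami-gcp.ppk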

Watch the following video to learn how to upload files to your Google Cloud server with SFTP using FileZilla. The example shows a server launched using the Bitnami Launchpad, but the video also applies to servers launched using the GCP Marketplace.

Or you can follow these steps:

  • Download and install FileZilla.
  • Launch FileZilla and use the "Edit -> Settings" command to bring up FileZilla's configuration settings.
  • Within the "Connection -> SFTP" section, use the "Add keyfile" command to select the private key file for the server. FileZilla will use this private key to log in to the server.

    FileZilla configuration

  • Use the "File -> Site Manager -> New Site" command to bring up the FileZilla Site Manager, where you can set up a connection to your server.
  • Enter your server host name and specify bitnami as the user name.
  • Select "SFTP" as the protocol and "Ask for password" as the logon type.

    FileZilla configuration

  • Use the "Connect" button to connect to the server and begin an SFTP session. You might need to accept the server key, by clicking "Yes" or "OK" to proceed.

You should now be logged into the /home/bitnami directory on the server. You can now transfer files by dragging and dropping them from the local server window to the remote server window.

If you have problems accessing your server, get extra information by using the "Edit -> Settings -> Debug" menu to activate FileZilla's debug log.

FileZilla debug log

WinSCP
IMPORTANT: To use WinSCP, your server private key should be in PPK format.

Follow these steps:

  • Download and install WinSCP.
  • Launch WinSCP and in the "Session" panel, select "SCP" as the file protocol.
  • Enter your server host name and specify bitnami as the user name.

    WinSCP configuration

  • Click the "Advanced…" button and within the "SSH -> Authentication -> Authentication parameters" section, select the private key file for the server. WinSCP will use this private key to log in to the server.

    WinSCP configuration

  • From the "Session" panel, use the "Login" button to connect to the server and begin an SCP session.

You should now be logged into the /home/bitnami directory on the server. You can now transfer files by dragging and dropping them from the local server window to the remote server window.

If you need to upload files to a location where the bitnami user doesn't have write permissions, you have two options:

  • Once you have configured WinSCP as described above, click the "Advanced…" button and within the "Environment -> Shell" panel, select sudo su - as your shell. This will allow you to upload files using the administrator account.

    WinSCP configuration

  • Upload the files to the /home/bitnami directory as usual. Then, connect via SSH and move the files to the desired location with the sudo command, as shown below:

     $ sudo mv /home/bitnami/uploaded-file /path/to/desired/location/
    
Cyberduck
IMPORTANT: To use Cyberduck, your server private key should be in PEM format.

Follow these steps:

  • Select the "Open Connection" command and specify "SFTP" as the connection protocol.

    Cyberduck configuration

  • In the connection details panel, under the "More Options" section, enable the "Use Public Key Authentication" option and specify the path to the private key file for the server.

    Cyberduck configuration

  • Use the "Connect" button to connect to the server and begin an SFTP session.

You should now be logged into the /home/bitnami directory on the server. You can now transfer files by dragging and dropping them from the local server window to the remote server window.

How to connect instances hosted in separate virtual networks or VPCs?

The Google Cloud Platform makes it possible to connect instances hosted in separate Virtual Private Clouds (VPCs), even if those instances belong to different projects or are hosted in different regions. This feature, known as VPC Network Peering, can result in better security (as services do not need to be exposed on public IP addresses) and performance (due to use of private, rather than public, networks and IP addresses).

Learn more about VPC Network Peering.
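
As a hedged illustration only (the network and project names below are placeholders), a peering can be created from the gcloud CLI with a command along these lines, run once from each side of the peering:

$ gcloud compute networks peerings create my-peering \
    --network=my-network \
    --peer-project=other-project \
    --peer-network=other-network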

How to connect to TensorFlow Serving from a different machine?

For security reasons, the TensorFlow Serving port in this solution cannot be accessed over a public IP address. To connect to TensorFlow Serving from a different machine, you must either create an SSH tunnel or open port 9000 for remote access. Refer to the FAQ for more information on creating an SSH tunnel or opening server ports.

IMPORTANT: Making this application's network ports public is a significant security risk. You are strongly advised to only allow access to those ports from trusted networks. If, for development purposes, you need to access from outside of a trusted network, please do not allow access to those ports via a public IP address. Instead, use a secure channel such as a VPN or an SSH tunnel. Follow these instructions to remotely connect safely and reliably.
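
For reference, a minimal SSH tunnel that forwards local port 9000 to the TensorFlow Serving port on the server could look like the following (the key path and SERVER-IP are placeholders):

$ ssh -N -L 9000:127.0.0.1:9000 -i ~/.ssh/my-ssh-key bitnami@SERVER-IP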

Once you have an active SSH tunnel or have opened the port for remote access, you can connect to TensorFlow Serving using the Inception client with a command like the one below. Replace SOURCE-PORT with the source port specified in the SSH tunnel configuration (or 9000 if you opened the port for remote access), and HOST with 127.0.0.1 if using an SSH tunnel, or with the server's actual IP address otherwise.

$ inception_client --server=HOST:SOURCE-PORT --image=/tmp/example.jpg

How can I run a command in the Bitnami TensorFlow Serving Stack?

Log in to the server console as the bitnami user and run the command as usual. The required environment is automatically loaded for the bitnami user.

How to change the TensorFlow Serving model configuration file?

TensorFlow Serving is ready to be used with the Inception v3 model. You may want to use a different version of the model, or a different model entirely.

You can change the configuration settings of the model by editing the /opt/bitnami/tensorflow-serving/conf/tensorflow-serving.conf file:

model_config_list: {
  config: {
    name: "inception",
    base_path: "/opt/bitnami/tensorflow-serving/data",
    model_platform: "tensorflow",
  }
}
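
For example, to serve a different model you would change name and point base_path at the directory containing that model's exported versions. A hypothetical configuration for an mnist model exported to /tmp/mnist_model (both values are placeholders; see the mnist example later in this guide) could look like this:

model_config_list: {
  config: {
    name: "mnist",
    base_path: "/tmp/mnist_model",
    model_platform: "tensorflow",
  }
}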

Once your changes are done, restart the service:

$ sudo /opt/bitnami/ctlscript.sh restart

How to test Bitnami TensorFlow Serving Stack for image recognition?

TensorFlow Serving is ready to be used with the Inception v3 model. The necessary trained model data is located at /opt/bitnami/tensorflow-serving/data.

First, download the image that you want to be recognised:

$ cd /tmp
$ curl -LO 'https://www.tensorflow.org/images/grace_hopper.jpg'

To send the query to the server, execute the following command:

$ inception_client --image=grace_hopper.jpg

You will get a response from the server in JSON format with the objects that were recognised in the image. The output will be similar to this:

Query output using a sample image

As an alternative, you can also query the server remotely. Since you will need a client to make the requests, we recommend using our Bitnami Docker TensorFlow Inception container.

Execute the following commands (replace the SERVER_IP placeholder with the real IP address of your server):

$ docker run --name tensorflow-inception -d bitnami/tensorflow-inception:latest
$ docker exec tensorflow-inception inception_client --server=SERVER_IP:9000 --image=/opt/bitnami/tensorflow-inception/tensorflow/tensorflow/contrib/ios_examples/benchmark/data/grace_hopper.jpg

For more info about our Kubernetes solution, check our guide Perform Machine-Based Image Recognition With TensorFlow On Kubernetes.

How to compile example clients other than Inception?

NOTE: The Bitnami TensorFlow Serving Stack is configured to deploy the TensorFlow Inception Serving API. The image also ships other tools, such as Bazel and the TensorFlow Python library, for training models. Training operations need significantly more CPU, RAM and disk, so it is highly recommended to review the requirements for these operations and scale your server accordingly.

As an example, this section describes how to compile and test the mnist utilities:

  • Clone the TensorFlow Serving repository and checkout the compatible versions of TensorFlow Serving and TensorFlow:

      $ cd ~
      $ git clone https://github.com/tensorflow/serving.git
      $ cd serving/
      $ git submodule update --init
      $ git checkout 0.6.0
      $ cd tensorflow
      $ git checkout v1.2.0
    
  • In the TensorFlow submodule, execute the configure step with the values that you want to use:

      $ cd ~/serving/tensorflow
      $ ./configure
    
  • Compile the client tools mnist_client and mnist_saved_model:

      $ cd ~/serving/
      $ bazel build --action_env=PYTHON_BIN_PATH=/opt/bitnami/python/bin/python \
          --compilation_mode=opt --strip=always --nobuild_runfile_links \
          //tensorflow_serving/example:mnist_client //tensorflow_serving/example:mnist_saved_model
    

    This operation may take a considerable amount of time, depending on the hardware of your machine. Please be patient.

Once the compilation is done, the utilities are ready to be used. To test both of them, execute the following commands:

  • Export the model:

      $ bazel-bin/tensorflow_serving/example/mnist_saved_model /tmp/mnist_model
    
  • Stop the already running service:

      $ sudo /opt/bitnami/ctlscript.sh stop
    
  • Start the server with valid parameters for mnist:

      $ tensorflow_model_server --port=9000 --model_name=mnist --model_base_path=/tmp/mnist_model/ &
    
  • Use the mnist client:

      $ bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=localhost:9000
    
  • Once the test is done, restart the server with the default values:

      $ sudo pkill -f "tensorflow_model_server.*model_name=mnist"
      $ sudo /opt/bitnami/ctlscript.sh start
    

For more information, please check the TensorFlow Serving tutorial.

How to launch TensorBoard?

NOTE: The Bitnami TensorFlow Serving Stack is configured to deploy the TensorFlow Inception Serving API. The image also ships other tools, such as Bazel and the TensorFlow Python library, for training models. Training operations need significantly more CPU, RAM and disk, so it is highly recommended to review the requirements for these operations and scale your server accordingly.

  • Start the TensorBoard server:

      $ tensorboard --logdir=path/to/log-directory
    

By default, TensorBoard listens on port 6006. You will need to create an SSH tunnel to access it. Refer to the FAQ if you need help with this.
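
For example, a tunnel that forwards local port 6006 to TensorBoard on the server could look like the following (the key path and SERVER-IP are placeholders); you can then browse to http://127.0.0.1:6006 from your local machine:

$ ssh -N -L 6006:127.0.0.1:6006 -i ~/.ssh/my-ssh-key bitnami@SERVER-IP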

For more information, please check the TensorBoard: Visualizing Learning guide.

Where can I find utilities to train a model?

NOTE: The Bitnami TensorFlow Serving Stack is configured to deploy the TensorFlow Inception Serving API. The image also ships other tools, such as Bazel and the TensorFlow Python library, for training models. Training operations need significantly more CPU, RAM and disk, so it is highly recommended to review the requirements for these operations and scale your server accordingly.

The imagenet_train and imagenet_eval utilities are already compiled in our stack. You can find them at /opt/bitnami/tensorflow-serving/bin.

Please read the Inception model training guide if you want to know more about this.
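
As a rough, hedged illustration of how these utilities are typically invoked (the flag names follow the Inception training guide, the paths are placeholders, and a prepared ImageNet dataset is required):

$ imagenet_train --num_gpus=1 --batch_size=32 \
    --train_dir=/tmp/imagenet_train --data_dir=/tmp/imagenet_data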

How to enable NVIDIA GPU support?

NOTE: The steps below require you to download various libraries and recompile TensorFlow Serving with GPU support. Before proceeding, ensure that the host system has the necessary disk space, CPU and RAM to handle heavy compilation workloads.

To enable NVIDIA GPU support in TensorFlow Serving, follow these steps:

  • Install the build tools and Git (if not already installed):

    $ sudo apt-get install git build-essential
    
  • Install the kernel sources for your running kernel:

    $ sudo apt-get source linux-source
    $ sudo apt-get source linux-image-$(uname -r)
    $ sudo apt-get install linux-headers-$(uname -r)
    
  • Download the CUDA Toolkit and latest patches for your platform.
  • Run the following command to install the CUDA Toolkit:

    $ chmod +x cuda_X.Y.Z_linux-run
    $ sudo ./cuda_X.Y.Z_linux-run
    

    Read and confirm your acceptance of the EULA, and answer the pre-installation questions when prompted. Make a note of the CUDA Toolkit installation directory.

    NOTE: The remaining steps in this section will assume that the CUDA Toolkit was installed to the default location of /usr/local/cuda.
    To troubleshoot issues related to your CUDA installation, refer to this helpful troubleshooting guide by Victor Antonino.
  • Repeat the previous step for any CUDA Toolkit patches that were downloaded as well.
  • Once the CUDA Toolkit is installed, sign up for the free NVIDIA Developer Program (if you are not already a member) to download the NVIDIA CUDA Deep Neural Network library (cuDNN) v6.0.

    NOTE: The cuDNN v6.0 library is available for different versions of the CUDA Toolkit. Ensure that you download the cuDNN v6.0 library that also matches the previously-installed CUDA Toolkit version.
  • Run the following commands to install the cuDNN library:

    $ tar -xzvf cudnn-X.Y-linux-x64.tgz
    $ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
    $ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
    $ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
    
  • Download and install the latest NCCL library from its GitHub repository:

    $ git clone https://github.com/NVIDIA/nccl.git
    $ cd nccl/
    $ make CUDA_HOME=/usr/local/cuda
    $ sudo make install
    $ sudo mkdir -p /usr/local/include/external/nccl_archive/src
    $ sudo ln -s /usr/local/include/nccl.h /usr/local/include/external/nccl_archive/src/nccl.h
    
  • Download TensorFlow Serving from its GitHub repository into your home directory using the command below:

    $ git clone --recurse-submodules https://github.com/tensorflow/serving ~/serving
    
  • Configure the build, making sure to say "Yes" when prompted to enable GPU processing. Leave the remaining options at their default values.

    $ cd ~/serving/tensorflow
    $ ./configure
    
  • Edit the tools/bazel.rc file in the repository root directory and make the following changes:

    • Due to a bug, change @org_tensorflow//third_party/gpus/crosstool to @local_config_cuda//crosstool:toolchain.

    • Update all instances of the PYTHON_BIN_PATH variable to use the Python binary included in the Bitnami TensorFlow Serving Stack at /opt/bitnami/python/bin/python.

    After making these changes, the edited bazel.rc file should look like this:

    build:cuda --crosstool_top=@local_config_cuda//crosstool:toolchain
    build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true
    build --force_python=py2
    build --python2_path=/opt/bitnami/python/bin/python
    build --action_env PYTHON_BIN_PATH="/opt/bitnami/python/bin/python"
    build --define PYTHON_BIN_PATH=/opt/bitnami/python/bin/python
    test --define PYTHON_BIN_PATH=/opt/bitnami/python/bin/python
    run --define PYTHON_BIN_PATH=/opt/bitnami/python/bin/python
    build --spawn_strategy=standalone --genrule_strategy=standalone
    test --spawn_strategy=standalone --genrule_strategy=standalone
    run --spawn_strategy=standalone --genrule_strategy=standalone    
    
  • Compile TensorFlow Serving with GPU support with the commands below. Depending on the server specification, this process can take an hour or longer.

    $ cd ~/serving
    $ bazel clean --expunge && export TF_NEED_CUDA=1
    $ bazel build --config=opt --config=cuda tensorflow_serving/...
    
  • Stop the TensorFlow Serving service:

    $ sudo /opt/bitnami/ctlscript.sh stop tensorflowserving
    
  • Move the existing TensorFlow Serving binary aside, then copy the newly-compiled libraries and binary into the Bitnami stack directory:

    $ sudo mv /opt/bitnami/tensorflow-serving/bin/tensorflow_model_server /opt/bitnami/tensorflow-serving/bin/tensorflow_model_server.old
    $ sudo cp ~/serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server_test_client.runfiles/local_config_cuda/cuda/cuda/lib/* /lib
    $ sudo cp ~/serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server /opt/bitnami/tensorflow-serving/bin/
    
  • Start the TensorFlow Serving service:

    $ sudo /opt/bitnami/ctlscript.sh start tensorflowserving
    

You should now be able to use TensorFlow Serving with GPU support enabled.

How to check that TensorFlow Serving is running with NVIDIA GPU support?

Confirm that the TensorFlow Serving service is running with NVIDIA GPU support using either of these methods:

  • Use the ldd utility on the TensorFlow Serving binary and confirm that the output lists the CUDA, cuDNN and NVIDIA libraries, as shown in the example below:

     $ ldd /opt/bitnami/tensorflow-serving/bin/tensorflow_model_server      
     linux-vdso.so.1 (0x00007ffdb69d1000)
     libcusolver.so.8.0 => /lib/libcusolver.so.8.0 (0x00007fb5efe90000)
     libcublas.so.8.0 => /lib/libcublas.so.8.0 (0x00007fb5ece4b000)
     libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007fb5ec454000)
     libcudnn.so.6 => /lib/libcudnn.so.6 (0x00007fb5e2ef2000)
     libcufft.so.8.0 => /lib/libcufft.so.8.0 (0x00007fb5da0a3000)
     libcurand.so.8.0 => /lib/libcurand.so.8.0 (0x00007fb5d612c000)
     libcudart.so.8.0 => /lib/libcudart.so.8.0 (0x00007fb5d5ec6000)
     librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fb5d5cbe000)
     libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb5d5aa0000)
     libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb5d589c000)
     libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fb5d5686000)
     libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb5d5384000)
     libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb5d5079000)
     libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb5d4e63000)
     libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb5d4ab7000)
     libnvidia-fatbinaryloader.so.375.26 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.375.26 (0x00007fb5d486b000)
     /lib64/ld-linux-x86-64.so.2 (0x000055e6cf710000) 
    
  • Check the server log for messages indicating that the NVIDIA GPU modules have been loaded, as shown in the example below:

     $ tail -f /var/log/messages
     kernel: [ 2780.447221] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 250      
    
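
In addition to the checks above, the nvidia-smi utility (installed with the NVIDIA drivers) reports whether the driver can see the GPU and which processes are using it:

$ nvidia-smi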

How to create a full backup of TensorFlow Serving?

Backup

The Bitnami TensorFlow Serving Stack is self-contained and the simplest option for performing a backup is to copy or compress the Bitnami stack installation directory. To do so in a safe manner, you will need to stop all servers, so this method may not be appropriate if you have people accessing the application continuously.

Follow these steps:

  • Change to the directory in which you wish to save your backup:

      $ cd /your/directory
    
  • Stop all servers:

      $ sudo /opt/bitnami/ctlscript.sh stop
    
  • Create a compressed file with the stack contents:

      $ sudo tar -pczvf application-backup.tar.gz /opt/bitnami
    
  • Restart all servers:

      $ sudo /opt/bitnami/ctlscript.sh start
    

You should now download or transfer the application-backup.tar.gz file to a safe location.
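
For example, you could download it from your local machine over SCP; a minimal sketch, assuming your private key is at ~/.ssh/my-ssh-key, SERVER-IP is your server's IP address and the backup was created in /your/directory:

$ scp -i ~/.ssh/my-ssh-key bitnami@SERVER-IP:/your/directory/application-backup.tar.gz .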

Restore

Follow these steps:

  • Change to the directory containing your backup:

      $ cd /your/directory
    
  • Stop all servers:

      $ sudo /opt/bitnami/ctlscript.sh stop
    
  • Move the current stack to a different location:

      $ sudo mv /opt/bitnami /tmp/bitnami-backup
    
  • Uncompress the backup file to the original directory:

      $ sudo tar -pxzvf application-backup.tar.gz -C /
    
  • Start all servers:

      $ sudo /opt/bitnami/ctlscript.sh start
    

