In this tutorial, we explain how to set up a personal deep learning environment on the Google Cloud Platform (GCP), and how to create and run a compute instance on a virtual machine (VM) that reuses an instructor-designed image. Please follow the steps carefully.

Important note: this setup is part of Assignment 0.

Step 1: Log into Google Cloud Account

  1. Go to the Google Cloud console (https://cloud.google.com/) and sign in with your LionMail account (yourUNI@columbia.edu).

    If you sign up with any other account, you will not be able to use the Google coupons provided by the instructors.
  2. If you are a new Google Cloud user, you can get $300 in free credits by clicking 'Get started for free'. You can explore GCP for a while with these free credits. After the add/drop period, students will receive educational coupons from the instructors to cover course-related Google Cloud expenses.

  3. Redeem your educational Google Cloud coupons (coupon distribution method TBD). Charges for using a GPU can be approximately $1/hour, so please manage your computational resources wisely. A good way to do this is to create a deep learning environment on your local computer and debug your code there, and to run it in the Google Cloud only when more powerful computational resources are needed. Note that some assignments can be executed even on non-GPU personal computers.

    If you have received a coupon/code, go to https://console.cloud.google.com/education, select your LionMail account on the top right, and redeem the coupon.

    Now, you can visit your Google Cloud dashboard.
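
To get a feel for how quickly GPU hours consume a coupon, the arithmetic can be sketched as follows. The $1/hour rate is the approximate figure quoted above; the $50 coupon value is a hypothetical placeholder, since the actual coupon amount is not stated in this tutorial.

```python
def gpu_hours_available(budget_usd: float, hourly_rate_usd: float = 1.0) -> float:
    """Return how many instance hours a budget covers at a given hourly rate."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly_rate_usd must be positive")
    return budget_usd / hourly_rate_usd


def cost_of_run(hours: float, hourly_rate_usd: float = 1.0) -> float:
    """Return the approximate charge for keeping an instance running."""
    if hours < 0:
        raise ValueError("hours must not be negative")
    return hours * hourly_rate_usd


if __name__ == "__main__":
    coupon_usd = 50.0  # hypothetical coupon value, for illustration only
    print("GPU hours covered by the coupon:", gpu_hours_available(coupon_usd))
    # An instance that is left running over a weekend (48 hours)
    # is billed for the whole period, whether or not it does any work.
    print("Cost of a forgotten weekend:", cost_of_run(48))
```

Replace the rate with the actual hourly price of your instance configuration, which depends on the CPU, memory, GPU type and zone.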


Step 2: Create your project and Google Compute Engine (GCE) instance


From the Google Cloud dashboard:
  1. Create your project

    'Select a project' -> 'NEW PROJECT'. For administrative reasons, we request that you use 'ecbm4040-yourUNI' as your project name. If you have already received a coupon from the TAs, please choose the course billing account, which is created automatically when you redeem the coupon. (This is important for getting quota increase requests approved smoothly.) After a few seconds, you should be able to see your newly created project’s homepage.



  2. Upgrade your billing account (skip if already upgraded)

    If this is your first time using GCP with your Columbia ID, you need to upgrade your account to get access to all GCP features. Go to 'Billing' -> 'Overview'. Click the 'UPGRADE' option in the "Free trial credit" section.

  3. Verify your GPU quota(s)

    Make sure to select the project that you just created, 'ecbm4040-yourUNI'.

    • Go to ‘IAM & Admin’ -> ‘Quotas’. The total number of available services listed should be around 180 or more.
      If it is around 50, you probably cannot use GPU services yet. In that case, do the following:
      • Make sure you have upgraded your account.
      • Create a new VM instance without a GPU (CPU only), make sure that it runs, and then delete it. After a while (sometimes a day or so), the number of services on the quota page should increase and include GPU services.
    • Select "GPUs (all regions)" in the filter and request an increase of the limit to 1. (Requests for more than 1 will not be approved for trial accounts.)
    • If "GPUs (all regions)" is not shown in your quotas, type and select "NVIDIA K80 GPUs" in the filter, then click "All Quotas". Check that the limit for all US locations is at least 1. If the limit is set to 0, select the corresponding quotas and edit them using "Edit Quotas". Enter your name, e-mail and phone number, and then click NEXT. Set New Limit to 1 and enter an appropriate description for the request. (Requesting more than 1 might require you to go through additional procedures with the Google Cloud sales team.)

      Wait a moment for Google to process your request. You should receive an e-mail from Google confirming that they received the request, and another e-mail once your quota request is approved. According to Google, quota requests are typically processed within one or two business days. The actual waiting period, however, can vary from minutes to a few hours to 4 days or even longer, depending on the general quota demand. Google typically takes longer to process requests at the end of the semester, so plan the time for your project experiments accordingly.


  4. Create a new GCE virtual machine (VM) instance.

    There are two options: (a) Create a VM instance based on the custom image provided by the ECBM E4040 instructors, which includes all the tools that you need (CUDA, Anaconda, Jupyter Notebook, TensorFlow, etc.). This option is highly recommended; proceed to the steps below. (b) If you are interested in exploring GCP on your own and would like to configure your own VM instance from scratch, study the detailed instructions in the GCP documentation.

    Creating your instance in your GCP project, based on a pre-created custom image

    • Go to ‘Compute Engine’ -> ‘VM instances’, click ‘create’.
    • Define your instance’s name and set the zone to ‘us-east1-d’. (Change the zone if the requested resources are not available. This problem may occur frequently, and you may need to spend some time learning which zones have capacity at which times. Focus on zones in the eastern USA.)
    • Configure your instance settings: you can choose the number and type of CPUs and GPUs and the memory size. But keep the cost in mind!

      Note: check GPU availability in various zones on this site; you may need to experiment with different zones.

    • Here are the suggested settings (which should be good enough for all E4040 assignments):
      • Region: us-east1, Zone: us-east1-d
      • CPU: 2, Memory: 7.5GB
      • GPU: 1 NVIDIA Tesla K80 (consider the T4 as well; check the cost comparison)
      • In the boot disk section, click 'change' and then select from custom images ‘ecbm4040-imageforstudents-tf22’ which is under the project ‘ecbm4040-ta’.

        This image is pre-installed, and has the following specs

        • user - ecbm4040
        • conda environment name - envTF22
        • CUDA: 10.1
        • cudnn: 7.6.5
        • Anaconda 3 (python 3.7)
        • tensorflow-gpu 2.2
        • OS: Ubuntu 18.04.4
      • Check ‘Allow HTTP traffic’ and ‘Allow HTTPS traffic’.
      • Note: you can later create additional instances with different computational power for your project; the procedure is the same.

    • Wait for several minutes. The newly created VM instance will be running after the creation.
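
The same instance can, in principle, be created from the command line with the Google Cloud SDK instead of the web console. The sketch below only builds and prints the `gcloud` commands; it does not run them. Several values are assumptions rather than part of this tutorial: the instance name `ecbm4040-vm` is hypothetical, the machine type `n1-standard-2` is chosen because it provides 2 vCPUs and 7.5 GB of memory, `--maintenance-policy TERMINATE` is included because GPU instances cannot be live-migrated, and the `http-server` and `https-server` tags are the command-line counterpart of the two traffic checkboxes.

```python
import shlex
from typing import List


def build_create_command(
    instance_name: str,
    zone: str = "us-east1-d",
    machine_type: str = "n1-standard-2",  # assumption: 2 vCPUs, 7.5 GB memory
    gpu_type: str = "nvidia-tesla-k80",
    gpu_count: int = 1,
    image: str = "ecbm4040-imageforstudents-tf22",
    image_project: str = "ecbm4040-ta",
) -> List[str]:
    """Build a `gcloud compute instances create` command for the suggested settings."""
    return [
        "gcloud", "compute", "instances", "create", instance_name,
        "--zone", zone,
        "--machine-type", machine_type,
        "--accelerator", f"type={gpu_type},count={gpu_count}",
        "--image", image,
        "--image-project", image_project,
        "--maintenance-policy", "TERMINATE",   # GPU instances must terminate on host maintenance
        "--tags", "http-server,https-server",  # counterpart of the HTTP/HTTPS checkboxes
    ]


def build_gpu_zone_query(gpu_type: str = "nvidia-tesla-k80") -> List[str]:
    """Build a command that lists the zones offering the given GPU type."""
    return ["gcloud", "compute", "accelerator-types", "list",
            "--filter", f"name={gpu_type}"]


def to_shell(command: List[str]) -> str:
    """Render an argument list as a copy-pasteable shell command line."""
    return " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    # "ecbm4040-vm" is a hypothetical instance name.
    print(to_shell(build_gpu_zone_query()))
    print(to_shell(build_create_command("ecbm4040-vm")))
```

If the create command fails because the zone has no free GPUs, pass a different `zone` taken from the output of the zone query.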

Step 3: Connect to your GCP instance

Before showing how to start an instance, we want to emphasize that you should always STOP the instance when you are not using it. You are charged per hour while the instance is running, so remember to stop it each time you finish your work. You do not have to delete the instance; stopping it is enough.
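
If the Google Cloud SDK is installed on your computer, stopping and restarting can also be done from a terminal. The following is a minimal sketch that builds the corresponding `gcloud compute instances stop` and `start` commands and prints them without executing anything; the instance name `ecbm4040-vm` is a hypothetical placeholder.

```python
from typing import List


def build_instance_command(action: str, instance_name: str,
                           zone: str = "us-east1-d") -> List[str]:
    """Build a `gcloud compute instances start|stop` command."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    return ["gcloud", "compute", "instances", action, instance_name,
            "--zone", zone]


if __name__ == "__main__":
    # "ecbm4040-vm" is a hypothetical instance name.
    print(" ".join(build_instance_command("stop", "ecbm4040-vm")))
    print(" ".join(build_instance_command("start", "ecbm4040-vm")))
```

The zone must match the zone in which the instance was created.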

There are two methods to establish a connection to your cloud GCP instance from your personal computer: one is using the Google Cloud SDK, and the other is based on GCP firewall settings.

Step 4: Check the Tools

This step will check whether the tools have been properly installed, and if they are available in your environment.

  1. CUDA tool verification: First, check whether a GPU device is available:

    ecbm4040@your-instance-name: $ nvidia-smi

    If a GPU is available, the output will show some basic information about your GPU device.

    Second, verify CUDA installation:

    ecbm4040@your-instance-name: $ nvcc -V

    If it is correctly installed, this command will return the information about CUDA version, similar to the figure below:


  2. Anaconda is used for managing python and other software versions and environments. (Note that for the local computer setup, we described the installation of Anaconda 3 in detail in another tutorial.) In the instructor's custom image, a conda environment called 'envTF22' has been set up. You need to use the instruction below to activate it. It is recommended that you use the same environment for your future assignments. If you need additional tools, they can be added by using conda install or pip install commands.

    (base) ecbm4040@your-instance-name: $ conda activate envTF22

    After the activation of the environment, you can review which packages are currently installed using the command conda list:

    (envTF22) ecbm4040@your-instance-name: $ conda list

    Note: If you need to deactivate the environment, type

    (envTF22) ecbm4040@your-instance-name: $ conda deactivate
  3. TensorFlow is an open-source library for deep learning provided by Google. The version of TensorFlow in the cloud image which is provided by the instructors is 2.2.0. That is the version that should be used to complete the assignments for E4040 in 2020.

    To check the installation of TensorFlow 2.2.0, type python, and run the following code at the Python prompt.

    (Note: Do not confuse the Python prompt >>> with the Linux command prompt $. If you want to exit Python, type exit() to get back to the Linux prompt.)

    (envTF22) ecbm4040@your-instance-name: $ python
    >>> import tensorflow as tf
    >>> print('TensorFlow Version=')
    TensorFlow Version=
    >>> tf.__version__
    '2.2.0'
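
The three checks in this step can also be collected into one script, which is convenient after restarting an instance. This is a sketch rather than an official course tool: the helper names `run_tool` and `tensorflow_status` are made up for illustration, and the script simply reports `None` when a tool is missing instead of failing. It relies on `tf.config.list_physical_devices`, which is available in TensorFlow 2.1 and later.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple


def run_tool(command: List[str]) -> Optional[str]:
    """Return the tool's output, or None if it is not on PATH or exits with an error."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout if result.returncode == 0 else None


def tensorflow_status() -> Optional[Tuple[str, List[str]]]:
    """Return (version, GPU device names), or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    return tf.__version__, [gpu.name for gpu in gpus]


if __name__ == "__main__":
    print("nvidia-smi found:", run_tool(["nvidia-smi"]) is not None)
    print("nvcc found:", run_tool(["nvcc", "-V"]) is not None)
    status = tensorflow_status()
    if status is None:
        print("TensorFlow is not importable; is envTF22 activated?")
    else:
        version, gpu_names = status
        print("TensorFlow version:", version)
        print("GPUs visible to TensorFlow:", gpu_names)
```

Run it inside the activated 'envTF22' environment. An empty GPU list means that TensorFlow was imported but cannot see a GPU device.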

Step 5: Jupyter Notebook

We next describe (a) how to start a Jupyter server in your Google Cloud VM instance, and (b) how to open/access your Jupyter notebook. There are two ways to accomplish the latter: (i) Method 1, using the console of the Google Cloud SDK running on your laptop; and (ii) Method 2, configuring a firewall from the GCP dashboard.

Jupyter tools have been installed in the 'envTF22' virtual environment in the GCP instance.

Configuring and starting the Jupyter server on GCP

  1. Configure your Jupyter Notebook on the server side

    First, generate a new configuration file:

    (envTF22) ecbm4040@your-instance-name: $ jupyter notebook --generate-config

    Open that configuration file:

    (envTF22) ecbm4040@your-instance-name: $ vi ~/.jupyter/jupyter_notebook_config.py

    Add the following lines to the file. (If you are new to Linux and do not know how to use the vi editor, see this tutorial: https://www.cs.colostate.edu/helpdocs/vi.html).

    c = get_config()
    c.NotebookApp.ip = '*'
    c.NotebookApp.open_browser = False
    c.NotebookApp.port = 9999      # or another port number
  2. Generate your Jupyter login password (press Enter at both prompts for no password).

    (envTF22) ecbm4040@your-instance-name: $ jupyter notebook password
    Enter password:
    Verify password:
    [NotebookPasswordApp] Wrote hashed password to /home/ecbm4040/.jupyter/jupyter_notebook_config.json
  3. Start the Jupyter server in your Google Cloud VM instance

    (envTF22) ecbm4040@your-instance-name: $ jupyter notebook
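
For readers who prefer not to edit the configuration file by hand in vi, the four configuration lines can be generated for any port with a few lines of Python. This is only a sketch: the function name `render_jupyter_config` and the restriction to unprivileged ports (1024 to 65535) are choices made here, not requirements stated by the course. The setting `ip = '*'` makes the server listen on all network interfaces of the VM.

```python
def render_jupyter_config(port: int = 9999, ip: str = "*") -> str:
    """Return the configuration lines for jupyter_notebook_config.py."""
    if not 1024 <= port <= 65535:
        raise ValueError("choose an unprivileged port between 1024 and 65535")
    lines = [
        "c = get_config()",
        f"c.NotebookApp.ip = {ip!r}",          # '*' listens on all interfaces
        "c.NotebookApp.open_browser = False",  # the VM has no desktop browser
        f"c.NotebookApp.port = {port}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the lines; append them to ~/.jupyter/jupyter_notebook_config.py
    # yourself after checking them.
    print(render_jupyter_config(9999), end="")
```

Whatever port you choose here must be the remote port used when you connect to the server from your laptop.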

Opening the Jupyter Notebook:

Your Jupyter server is running remotely in your GCP instance. You need to connect your local computer to that remote server in order to view, edit and run your Jupyter notebook files from a browser on your laptop (Chrome, Firefox, etc.).

Method 1 - Open Jupyter Notebook using the Google Cloud SDK

  1. Open an SDK console and use SSH to connect to the Jupyter notebook. Type in the following code to set up a connection with your remote instance. Note that in “-L 9999:localhost:9999”, the first “9999” is your local port and that you can set another port number if you want. The second “9999” is the remote port number and it should be the same as the port that the jupyter notebook server is using.

    gcloud compute ssh --ssh-flag="-L 9999:localhost:9999"  --zone "us-east1-d" "ecbm4040@your-instance-name"
  2. Open a browser on your laptop (Chrome, IE etc.)

    Go to http://localhost:9999 or https://localhost:9999 and you will be directed to your remote Jupyter server. Type in the Jupyter password that you created earlier, and you can then browse your home directory on the Linux virtual machine running in GCP.
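
The roles of the two port numbers in the tunnel can be made explicit with a small helper. The sketch below builds the same `gcloud compute ssh` command for arbitrary local and remote ports and checks whether something is already listening on the chosen local port, which is a common reason for the tunnel to fail. The function names and the instance name `ecbm4040-vm` are hypothetical, and nothing is executed.

```python
import socket
from typing import List


def build_tunnel_command(instance_name: str, local_port: int = 9999,
                         remote_port: int = 9999, zone: str = "us-east1-d",
                         user: str = "ecbm4040") -> List[str]:
    """Build a gcloud SSH command that forwards local_port to remote_port on the VM."""
    return [
        "gcloud", "compute", "ssh",
        f"--ssh-flag=-L {local_port}:localhost:{remote_port}",
        "--zone", zone,
        f"{user}@{instance_name}",
    ]


def local_port_is_free(port: int) -> bool:
    """Return True if nothing is listening on the given port of this computer."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex(("127.0.0.1", port)) != 0


if __name__ == "__main__":
    local_port = 9999
    if not local_port_is_free(local_port):
        local_port = 9998  # fall back to another local port; the remote port stays the same
    # "ecbm4040-vm" is a hypothetical instance name.
    print(build_tunnel_command("ecbm4040-vm", local_port=local_port))
```

If you change the local port, open http://localhost:<local port> in the browser; the remote port must still match the port in the Jupyter configuration.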


Method 2 - Open Jupyter Notebook by configuring a firewall from the GCP dashboard

You have finished the tool installation component of Assignment 0.

Step 6: Other useful tools in GCP

  1. Tmux is a terminal multiplexer. It allows you to run multiple programs in multiple window panes within one terminal, which makes it a popular tool for working on a remote server (such as GCP) while connected from a personal computer. To explore more applications of Tmux, read this note: http://deeplearning.lipingyang.org/2017/06/28/tmux-resources/

    How to use Tmux for working with GCP:

    • First, activate your virtual environment:
      (base) ecbm4040@your-instance-name: $ conda activate envTF22
    • Next, create a Tmux session:

      (envTF22) ecbm4040@your-instance-name: $ tmux new -s session1

      Then you will be in the session named 'session1'.

    • Now all the processes will be executing within the session. For example, you can start Jupyter notebook in the Tmux command window just as described earlier.
      (envTF22) ecbm4040@your-instance-name: $ jupyter notebook

    For more Tmux commands, refer to this link: https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/.

    The biggest advantage of Tmux is that it allows a process to keep running even when your laptop is disconnected from your instance in the cloud. If your network connection drops accidentally, or you need to close your laptop, the process will keep running in the cloud session unless you kill the whole session. We highly recommend that you train time-consuming deep learning models in a Tmux session.

  2. runipy is used to run a *.ipynb file as a script. When you are running Jupyter on a remote server or on cloud resources, there are situations in which you would like a notebook to keep running after you shut down your laptop. Tmux is helpful, but you may also need to run the notebook from the command line.
    • First, install runipy in your virtual environment (or continue within a Tmux session created from 'envTF22').
      (envTF22) ecbm4040@your-instance-name: $ pip install runipy
    • Suppose that you opened a Jupyter notebook, and used SSH to connect to it. Attach to your created Tmux session ('session1' here). Split the window panes. Switch to the new window pane, then you can use runipy to run your .ipynb file.

    For more details, see http://deeplearning.lipingyang.org/2018/03/29/run-jupyter-notebook-from-terminal-with-tmux/.
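
The two tools can be combined so that a notebook runs unattended in a detached Tmux session. The following sketch builds a `tmux new-session -d` command that starts runipy on a notebook and checks whether a session exists with `tmux has-session`. The notebook file names are hypothetical, and the sketch assumes that it is used from a shell in which 'envTF22' is already active, so that `runipy` is found on the PATH inside the new session.

```python
import shlex
import shutil
import subprocess
from typing import List


def build_detached_run(session: str, notebook: str, output: str) -> List[str]:
    """Build a command that runs a notebook with runipy in a detached tmux session."""
    runipy_command = " ".join(shlex.quote(p) for p in ["runipy", notebook, output])
    # -d starts the session detached; -s names it so that you can attach later.
    return ["tmux", "new-session", "-d", "-s", session, runipy_command]


def session_exists(session: str) -> bool:
    """Return True if tmux is installed and a session with this name is running."""
    if shutil.which("tmux") is None:
        return False
    result = subprocess.run(["tmux", "has-session", "-t", session],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


if __name__ == "__main__":
    # "train.ipynb" and "train_out.ipynb" are hypothetical file names.
    print(build_detached_run("session1", "train.ipynb", "train_out.ipynb"))
    print("session1 is running:", session_exists("session1"))
```

To look at a running session, use `tmux attach -t session1`; press Ctrl-b and then d to detach again without stopping the process.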


ECBM E4040 Neural Networks and Deep Learning, 2020.

Columbia University