UPDATE 8/30/2022 - The course no longer supports local (PC/laptop) environments. All assignments must be done in GCP/GCE, and the first assignment may be done in Google Colab.

UPDATE 9/13/2021 - This local environment setup is no longer supported. If you wish to set up a local computing environment for Fall 2021 and later, download the envTF24 requirements.txt and refer to this tutorial.

In this tutorial, we will guide you through the steps of building an Anaconda-based working environment with TensorFlow and PyTorch on your local computer. For the E4040 course we will be using TensorFlow, but PyTorch is also a very popular framework that you might encounter in the future. Please follow these steps carefully in order to do your assignments. Note that the assignments are based on particular versions of TensorFlow and Python, which may not be the latest versions (the ones downloadable by default from public websites).

System requirements

Contents

Installation Guide

The installation process is time-consuming and complex, so please use an external power supply for your computer. For reference, the official installation instructions for TensorFlow are provided here: https://www.tensorflow.org/install/. Note that versions of tools used in the official instructions may not be the same as versions of tools required for E4040 assignments, so you are advised to follow our installation steps carefully.

Step 1: Anaconda Installation

Anaconda is a widely used Python data science platform. It provides a Python package manager (conda) that lets you install, update and remove packages.

A useful link for information about Anaconda: https://docs.anaconda.com/anaconda/.

Step 2: Create Anaconda virtual environment

We are going to create a virtual environment on the Anaconda (conda) platform, and install the necessary modules, packages and tools into that virtual environment. A virtual environment is a named, isolated working copy of Python and other packages. It maintains its own files, directories and paths, so that you can work with specific versions of Python and of software libraries without affecting other projects. Virtual environments make it easy to cleanly separate different projects and to avoid conflicts between the dependencies and version requirements of different software components. The conda command is the preferred interface for managing installations and virtual environments with the Anaconda Python distribution (see the conda getting-started guide).
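Once an environment has been created and activated (for example with `conda create -n envTF22` followed by `conda activate envTF22`), a short Python check can confirm which environment is active. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation. The helper names are hypothetical, and "envTF22" is the environment name suggested later in this tutorial.

```python
import os

EXPECTED_ENV = "envTF22"  # environment name suggested in this tutorial


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


def in_expected_env(expected=EXPECTED_ENV, environ=None):
    """Return True when the active conda environment has the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    print("Active conda environment:", active_conda_env())
    print("Matches", EXPECTED_ENV + ":", in_expected_env())
```

If the second line prints False, activate the environment before installing any packages, so that they do not end up in the base installation.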

(Optional) Step 3: Install CUDA and GPU drivers

This step is optional. Skip it if your computer does not have an NVIDIA GPU.

Compute Unified Device Architecture (CUDA) is a parallel computing platform and programming model created by NVIDIA. It lets software, including deep learning frameworks, run general-purpose computations on NVIDIA graphics processing units (GPUs).

(Optional) Step 4: Install cuDNN

This step is optional. Skip it if your computer does not have an NVIDIA GPU.

cuDNN is NVIDIA's GPU-accelerated library for deep learning: https://developer.nvidia.com/cudnn.

This step requires you to create an NVIDIA account.

If you are familiar with setting the PATH environment variable, you can follow the official instructions provided after you log in to your NVIDIA account. A proper PATH setup makes it possible for your computer to automatically locate the cuDNN *.dll files.
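To see whether the PATH setup worked, one option is to scan every PATH directory for cuDNN library files. The following is a minimal sketch, not an official NVIDIA check. The helper `find_on_path` is hypothetical, and the glob pattern `cudnn*.dll` is an assumption about how the Windows library files are named.

```python
import fnmatch
import os


def find_on_path(pattern, path_value=None):
    """Return full paths of files in PATH directories whose names match a glob pattern."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    matches = []
    for directory in path_value.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            # Skip directories that cannot be read.
            continue
        for name in names:
            # Compare in lower case so the match is case-insensitive.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                matches.append(os.path.join(directory, name))
    return matches


if __name__ == "__main__":
    hits = find_on_path("cudnn*.dll")  # assumed file-name pattern
    if hits:
        for hit in hits:
            print("Found:", hit)
    else:
        print("No cuDNN DLL found in any PATH directory")
```

An empty result suggests that the directory holding the cuDNN files has not been added to PATH, or that the terminal was opened before PATH was changed.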

The alternative (manual) installation instructions are here:

Step 5: Install TensorFlow

TensorFlow is an open source deep learning framework created and maintained by the Google Brain team: https://www.tensorflow.org.

TensorFlow is a rapidly evolving deep learning framework, and new versions are released frequently. As of June 2020, TensorFlow 2.2 has been released, which incorporates new features (see the official guide). For running your assignments, you will also be asked to use a Google Cloud (GCP) instance, for which the instructions are given in a separate instruction manual. The Google Cloud image that we provide has TensorFlow 2.2, so you should use the same version of TensorFlow on your local machine (tensorflow-gpu 2.2 if your local machine has a GPU).

TensorFlow Installation

Note: Always activate your virtual environment first (our suggested name is "envTF22").

TensorFlow 2.2 installation:

Verify TensorFlow installation:
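One way to run such a check from Python is sketched below. It is an illustration rather than the course's official procedure: the helper names `module_version` and `version_matches` are hypothetical, and the required version prefix 2.2 is the one named in this tutorial.

```python
import importlib

REQUIRED_PREFIX = "2.2"  # TensorFlow version named in this tutorial


def module_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_matches(version, required_prefix=REQUIRED_PREFIX):
    """Return True for the prefix itself or a patch release of it (2.2, 2.2.1)."""
    if version is None:
        return False
    return version == required_prefix or version.startswith(required_prefix + ".")


if __name__ == "__main__":
    found = module_version("tensorflow")
    if found is None:
        print("TensorFlow cannot be imported in this environment")
    elif version_matches(found):
        print("TensorFlow", found, "matches the required", REQUIRED_PREFIX)
    else:
        print("TensorFlow", found, "does not match the required", REQUIRED_PREFIX)
```

Run the script inside the activated virtual environment. On a machine with a GPU, calling `tf.config.list_physical_devices("GPU")` after `import tensorflow as tf` lists the GPUs that TensorFlow can see; an empty list means that only the CPU will be used.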

Step 6: Jupyter Notebook (and JupyterLab)

Jupyter is a web-based Python programming environment that lets you edit code, display output results and plots, and show animations. You can even write a polished report in a Jupyter notebook, since it supports LaTeX syntax. For course assignments, we will require you to use Jupyter to do your work and demonstrate the results.

Jupyter Notebook installation (into conda/Anaconda environment):

  1. Install JupyterLab: JupyterLab provides a web-based user interface which helps with organization of Jupyter projects, including Jupyter notebooks, text editors, terminals, etc.

    Note: Make sure that you install it inside your virtual environment (suggested "envTF22"). The JupyterLab package is distributed with Anaconda 3, but to make it available inside your virtual environment, you need to type

                    conda install -c conda-forge jupyterlab 
                

    Jupyter Notebook is installed by default with JupyterLab. You can then either open Jupyter Notebook (*.ipynb) files directly, or start JupyterLab and open the notebook files from there. (The extension "ipynb" comes from the older name of Jupyter Notebook, IPython Notebook.)

  2. Open Jupyter Notebook: In your virtual environment, type: jupyter notebook or jupyter lab.

    Though JupyterLab can help with better management of Jupyter projects, it is simpler to use Jupyter Notebook directly. We leave it to you to explore JupyterLab features on your own if you wish (see this website).

    As the figure below shows, first activate your virtual environment if it is not already activated, then type jupyter notebook to start Jupyter Notebook. It will open inside your browser, where you can view your Jupyter notebooks (Chrome, IE, Safari etc.).

  3. Optional: If you wish to experiment with Jupyter notebooks, here are some tutorial links.

    http://jakevdp.github.io/blog/2017/03/03/reproducible-data-analysis-in-jupyter/

    http://ipywidgets.readthedocs.io/en/latest/examples/Lorenz%20Differential%20Equations.html
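If Jupyter is not installed inside the virtual environment, the `jupyter notebook` command may start a copy from the base installation, and the notebook kernel may then not see the packages installed in the environment. The sketch below, which can be pasted into a notebook cell, compares the interpreter's `sys.prefix` with the `CONDA_PREFIX` variable that conda sets on activation. The helper name `kernel_matches_env` is hypothetical, and the check assumes that Jupyter was started from a terminal in which the environment is activated.

```python
import os
import sys


def kernel_matches_env(prefix=None, environ=None):
    """Return True when the running interpreter lives in the activated conda environment."""
    prefix = sys.prefix if prefix is None else prefix
    environ = os.environ if environ is None else environ
    conda_prefix = environ.get("CONDA_PREFIX")
    if not conda_prefix:
        # No conda environment was active when this process was started.
        return False

    def normalize(path):
        # Resolve symlinks and normalize case so equal locations compare equal.
        return os.path.normcase(os.path.realpath(path))

    return normalize(prefix) == normalize(conda_prefix)


if __name__ == "__main__":
    print("Interpreter prefix:", sys.prefix)
    print("Runs inside the activated environment:", kernel_matches_env())
```

If the check prints False, installing JupyterLab inside the environment as described in item 1 and restarting Jupyter from the activated environment is the usual remedy.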

(Optional) Step 7: Install PyTorch

PyTorch is another open source machine learning framework for Python, based on Torch. It was developed by Facebook's artificial-intelligence research group. Compared to TensorFlow, one of PyTorch's advantages is its implicitly dynamic network design.

PyTorch will not be used in the E4040 course.

PyTorch installation: go to the official website http://pytorch.org/ and follow the install instructions, choosing the correct versions of Python and CUDA.


ECBM E4040 Neural Networks and Deep Learning

Columbia University