In this tutorial, we will guide you through the steps of building an Anaconda-based working environment with TensorFlow and PyTorch on your local computer. For the E4040 course we will be using TensorFlow, but PyTorch is also a very popular framework that you might encounter in the future. Please follow these steps carefully in order to do your assignments. Note that the assignments are based on particular versions of TensorFlow and Python, which may not be the latest versions (the ones downloaded by default from public websites).
The installation process is time-consuming and complex, so please use an external power supply for your computer. For reference, the official installation instructions for TensorFlow are provided here: https://www.tensorflow.org/install/. Note that versions of tools used in the official instructions may not be the same as versions of tools required for E4040 assignments, so you are advised to follow our installation steps carefully.
Anaconda is a widely used Python data science platform. It provides a package manager (conda) that lets you install, update and remove packages.
Type python and hit Enter. You should see the following text (or similar text mentioning Anaconda):
"(base) C:\Users\zoran>python
Python 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32".
This indicates that you are using the desired Anaconda installation of Python.
Type exit() to quit Python.
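The same check can also be done from inside Python. The sketch below is a heuristic, not an official test: it assumes that an Anaconda build either mentions "Anaconda" in its version banner (as in the text shown above) or keeps a "conda-meta" folder inside its installation directory. The helper name looks_like_conda_python is made up for this illustration.

```python
import os
import sys


def looks_like_conda_python(version_text, prefix):
    """Heuristic: does this interpreter appear to come from Anaconda/conda?

    Assumptions (not guaranteed for every release): the version banner
    mentions "Anaconda", or the installation directory contains a
    "conda-meta" folder.
    """
    banner_mentions_anaconda = "anaconda" in version_text.lower()
    has_conda_meta = os.path.isdir(os.path.join(prefix, "conda-meta"))
    return banner_mentions_anaconda or has_conda_meta


# Report on the interpreter that is running this code.
print("Interpreter:", sys.executable)
print("Version banner:", sys.version)
print("Looks like Anaconda/conda:", looks_like_conda_python(sys.version, sys.prefix))
```

If the last line prints False, the python command is probably resolving to a different Python installation.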
Open a command window (cmd). In the window, type python and hit Enter. If the Python interpreter that starts identifies itself as the Anaconda interpreter, the installation is finished. Type exit() to quit Python.
If the python command doesn't work (in which case the system cannot find this command), or the interpreter is not the Anaconda interpreter (in which case you installed Python before installing Anaconda, and Anaconda is not set as your default Python), you may try one of the following:
Type conda list in the command window to see the list of all packages that you have installed.
The most important packages for the beginning are Python (version 3.7 comes with Anaconda 3) and JupyterLab (which contains Jupyter Notebook).
A useful link for information about Anaconda: https://docs.anaconda.com/anaconda/.
We are going to create a virtual environment on the Anaconda (conda) platform, and install necessary
modules/packages/tools into that virtual environment. A virtual environment is a named,
isolated, working copy of Python and other packages, which maintains its own files,
directories and paths so that you can work with specific versions of software libraries and Python versions without
affecting other projects.
Virtual environments make it easy to cleanly separate different projects and avoid problems with tool
dependencies and tool version requirements across software components.
The conda command is the preferred interface for managing installations and virtual environments with the Anaconda Python distribution (see "getting started with conda").
To use the conda command, first open an "Anaconda Prompt (Anaconda 3)" window, which looks similar to an ordinary command window.
When Anaconda Prompt is first opened, its prompt should show something like "(base) C:\Users\userName>".
At this point, create a new directory where you will do
experimentation with deep learning models; a good choice for the directory would be C:\Users\userName\Documents\AnacondaProjects.
From the Anaconda Prompt window, navigate to C:\Users\userName\Documents\AnacondaProjects.
In the Anaconda Prompt window, type conda create -n envTF22 python=3.7 (the name "envTF22" stands for "environment based on TensorFlow 2.2"; you may choose a different name).
Type conda env list to see the list of all environments that you have created in conda; one of them should be "envTF22".
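The same list can be read from a script. The sketch below assumes that conda is on your PATH and that conda env list accepts the --json flag, printing an object with an "envs" list of environment folders. The two function names are made up for this illustration.

```python
import json
import os
import shutil
import subprocess


def env_names_from_json(json_text):
    """Return the folder name of each environment path in conda's JSON output.

    Note: the base environment appears under its installation folder name
    (for example "anaconda3"), not under the name "base".
    """
    env_paths = json.loads(json_text).get("envs", [])
    return [os.path.basename(os.path.normpath(path)) for path in env_paths]


def list_conda_env_names():
    """Run `conda env list --json`; return None if conda is unavailable or fails."""
    conda = shutil.which("conda")
    if conda is None:
        return None
    result = subprocess.run(
        [conda, "env", "list", "--json"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return env_names_from_json(result.stdout)


# Example (run inside Anaconda Prompt): print(list_conda_env_names())
```

After the conda create step, the returned list should contain "envTF22" (or the name you chose).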
Type activate envTF22 to activate the virtual environment (Linux/macOS users, type source activate envTF22; recent conda versions also accept conda activate envTF22 on all platforms). Your command prompt will then start with the name of your environment, for example "(envTF22) :~ $".
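Besides looking at the prompt, you can confirm the active environment from Python. This sketch relies on the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated. The helper name active_conda_env is made up for this illustration.

```python
import os
import sys


def active_conda_env(environ):
    """Return the name of the active conda environment, or None if none is set."""
    return environ.get("CONDA_DEFAULT_ENV") or None


# With envTF22 activated, the first line should show "envTF22" and the
# second line should point inside that environment's folder.
print("Active conda environment:", active_conda_env(os.environ))
print("Python prefix:", sys.prefix)
```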
Type conda install pandas numpy scipy pillow matplotlib scikit-learn in the command window.
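Once the installation finishes, a quick way to confirm that the packages are visible to the environment's Python is to look up each module without importing it. This is a minimal sketch; the helper name missing_packages is made up for this illustration. Note that two packages are imported under a different name than the one used for installation.

```python
import importlib.util

# Package name used with conda install -> module name used in import statements.
# pillow is imported as PIL, and scikit-learn is imported as sklearn.
IMPORT_NAMES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "pillow": "PIL",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}


def missing_packages(import_names):
    """Return the package names whose modules cannot be found by this Python."""
    return [
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    ]


print("Missing packages:", missing_packages(IMPORT_NAMES) or "none")
```

Run it with the virtual environment activated; any name it prints still needs to be installed in that environment.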
Note: Many packages can be installed inside one environment. The two most common installation tools are conda and pip, and most packages can be installed with either of them (the choice between conda and pip may not be obvious).
In this tutorial, we explicitly state which one to use (conda or pip), but if you need other tools, you will have to check the recommended method for installing them.
To list the packages installed in the active environment, type conda list and pip freeze.
For instructions on how to manage conda environments, see "Managing environments" in the conda documentation.
This step is optional. Skip it if your computer does not have an NVIDIA GPU.
Compute Unified Device Architecture (CUDA) is a parallel computing platform and programming model created by NVIDIA. It lets software such as TensorFlow run computations on graphics processing units (GPUs), which can greatly speed up deep learning workloads.
This step is optional. Skip it if your computer does not have an NVIDIA GPU.
cuDNN is NVIDIA's GPU-accelerated library for deep learning (https://developer.nvidia.com/cudnn).
This step requires you to create an NVIDIA account.
If you are familiar with setting up the PATH environment variable, you can follow the official instructions provided after you log in to your NVIDIA account. A proper PATH setup makes it possible for your computer to locate the cuDNN *.dll files automatically.
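To check whether the PATH setup worked, you can search the PATH directories for cuDNN files. The sketch below is illustrative: the helper name find_on_path is made up, and the pattern "cudnn*.dll" assumes the Windows file naming mentioned above (the exact file names depend on your cuDNN version).

```python
import glob
import os


def find_on_path(pattern, path_value):
    """Return the files matching `pattern` in the directories listed in `path_value`."""
    matches = []
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isdir(directory):
            # glob.escape protects directory names that contain characters
            # such as "[" from being treated as wildcards.
            search = os.path.join(glob.escape(directory), pattern)
            matches.extend(sorted(glob.glob(search)))
    return matches


# An empty list means no cuDNN DLL is reachable through PATH.
print(find_on_path("cudnn*.dll", os.environ.get("PATH", "")))
```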
The alternative (manual) installation instructions are here:
TensorFlow is an open source deep learning framework created and maintained by the Google Brain team (https://www.tensorflow.org).
TensorFlow is a rapidly evolving deep learning framework, and new versions are released frequently. As of June 2020, TensorFlow 2.2 has been released, which incorporates new features (see the official guide). To run your assignments, you will also be asked to use a Google Cloud (GCP) instance, for which the instructions are given in a separate manual. The Google Cloud image that we provide has TensorFlow 2.2 installed, so the same version should be used on your local machine (tensorflow-gpu 2.2 if your local machine has a GPU).
TensorFlow Installation
Note: Always activate your virtual environment (our suggestion is "envTF22").
TensorFlow 2.2 installation:
On a machine without a GPU: pip install tensorflow==2.2
On a machine with a GPU: pip install tensorflow-gpu==2.2
Verify TensorFlow installation:
Start the Python interpreter by typing python.
For TensorFlow 2.2, type the following:
>>> import tensorflow as tf
>>> a = tf.constant('Hello TensorFlow!')
>>> print(a.numpy())
# If you see the following output, then you're all set!
b'Hello TensorFlow!'
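It can also help to confirm which TensorFlow version was installed and whether TensorFlow can see a GPU. The sketch below uses tf.config.list_physical_devices, which is available in TensorFlow 2.1 and later. The helper name describe_tensorflow is made up for this illustration, and it returns None when TensorFlow cannot be imported.

```python
def describe_tensorflow():
    """Return TensorFlow's version and visible GPUs, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    return {"version": tf.__version__, "gpus": [gpu.name for gpu in gpus]}


info = describe_tensorflow()
if info is None:
    print("TensorFlow is not installed in this environment.")
else:
    print("TensorFlow version:", info["version"])
    print("GPUs visible to TensorFlow:", info["gpus"] or "none")
```

For the course setup, the version should start with 2.2. An empty GPU list is expected on machines without a GPU.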
Jupyter is a web-based Python programming environment that lets you edit code, display output results and plots, and show animations. You can even write a polished report in a Jupyter notebook, since it supports LaTeX syntax. For course assignments, we will require you to use Jupyter to do your work and demonstrate the results.
Jupyter Notebook installation (into conda/Anaconda environment):
Install JupyterLab: JupyterLab provides a web-based user interface which helps with organization of Jupyter projects, including Jupyter notebooks, text editors, terminals, etc.
Note: Make sure that you install it inside your virtual environment (suggested "envTF22"). The JupyterLab package is distributed with Anaconda 3, but to make it available inside your virtual environment, you need to type
conda install -c conda-forge jupyterlab
Jupyter Notebook is installed by default with JupyterLab. You can then either open Jupyter Notebook (*.ipynb) files directly, or start JupyterLab and open the notebook files from there (the "ipynb" extension is formed from letters of the older name of Jupyter Notebook, IPython Notebook).
Open Jupyter Notebook: In your virtual environment, type jupyter notebook, or type jupyter lab to start JupyterLab.
Though JupyterLab can help with better management of Jupyter projects, it is simpler to use Jupyter Notebook directly. We leave it to you to explore JupyterLab features on your own (see this website).
As the figure below shows, first activate your virtual environment if it is not already activated, then type jupyter notebook to start Jupyter Notebook; it will open inside your browser.
Now you can view your Jupyter notebooks inside your browser (Chrome, IE, Safari, etc.).
Optional: If you wish to experiment with jupyter notebooks, here are some tutorial links.
http://jakevdp.github.io/blog/2017/03/03/reproducible-data-analysis-in-jupyter/
http://ipywidgets.readthedocs.io/en/latest/examples/Lorenz%20Differential%20Equations.html
PyTorch is another open source machine learning framework for Python, based on Torch. It has been developed by Facebook's artificial-intelligence research group. Compared to TensorFlow, one of PyTorch's advantages is that networks are defined dynamically, as the code runs.
PyTorch will not be used in the E4040 course.
PyTorch installation: go to the official website http://pytorch.org/ and follow the install instructions, choosing the correct versions of Python and CUDA.
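If you do install PyTorch, a check similar to the TensorFlow one can confirm the installation. This is a minimal sketch that uses torch.__version__ and torch.cuda.is_available(). The helper name describe_pytorch is made up for this illustration, and it returns None when PyTorch cannot be imported.

```python
def describe_pytorch():
    """Return PyTorch's version and CUDA availability, or None if it is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return {"version": torch.__version__, "cuda_available": torch.cuda.is_available()}


info = describe_pytorch()
if info is None:
    print("PyTorch is not installed in this environment.")
else:
    print("PyTorch version:", info["version"])
    print("CUDA available:", info["cuda_available"])
```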
ECBM E4040 Neural Networks and Deep Learning, 2020.
Columbia University