8/30/2022 | 2022 Fall - At the beginning of the semester, access to course material is provided to all Columbia lionmail students via liondrive, which can be reached from the Courseworks site. The Courseworks site is visible to all Columbia students until the add/drop period. A week after the Columbia SEAS add/drop date, access to the course material is restricted to students who are fully registered in the course. From that time on, announcements will be made via the Columbia Courseworks announcement system (rather than on this page). This page will be retired at the end of September. |
8/30/2022 | This course has the A-F grading option only. It does not have P/F, R, or formal or informal audit options. Students who have not made progress on Assignment 0 within several days of the semester start date are advised to drop the class as soon as possible. Google Cloud coupons (credits) will be distributed to registered students after the add/drop date. |
8/30/2022 | To register for the course, all students first need to join the Columbia SSOL waitlist. Students will be moved from the waitlist to the registered list by mid-September, in a priority order chosen by the instructor. Students with a *cumc* account need to send an email to e4040TAs@columbia.edu with the subject "E4040 student - cumc account", informing us that they are a "cumc" student. This will require special accommodation. |
8/30/2022 | Additional course information, such as a typical syllabus, can be found here. |
9/27/2021 | 2021 Fall Final Announcements - Courseworks and liondrive have been restricted to registered students only. These webpages still contain material useful for environment setup, tools and tutorials. |
9/16/2021 | 2021 Fall - Assignment 0 has been published - see under the assignment section of Columbia Courseworks. |
9/16/2021 | 2021 Fall - The date of the first recitation is TBD. A Zoom session will be scheduled in Courseworks. Topics: Assignment 0, tools, Python, NumPy. Jupyter Notebooks for the recitations have been uploaded to the recitations folder on Google Drive. To prepare for the recitation, download and run Assignment 0 in Google Colab, try Google Cloud, and download the recitation notebooks. The recitations will be interactive, so be ready to run Jupyter Notebooks. |
Common email: e4040TAs@columbia.edu.
TA names and office hours (location and time) are announced in Courseworks.
Students will contribute to the course by:
Homeworks/assignments (40%), one exam (25%), project (35%). Quizzes and activity may be used in 2020 (TBD).
Late homeworks (assignments): A student is entitled to 4 late days without penalty. A student can divide those four days among all homeworks in any fashion needed. Examples: (i) Homework 3 is late 4 days, in which case no other homework can be late for any amount of time; (ii) Homework 1 is late 1 day and Homework 2 is late 2 days, in which case the student still has one more late day for future assignments. The unit of delay cannot be divided into less than a full day (for example, into hours). Requests for additional extensions will not be granted: once the budget of 4 days is exceeded, the student will be given 0 credit for any homework submitted late.
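The late-day arithmetic can be illustrated with a short Python sketch. It is not the course's grading code. Two points are assumptions: a partial day of lateness is rounded up to a full day (one reading of the "full day" rule), and a homework that would overdraw the budget receives 0 credit without consuming the remaining days. The function names are hypothetical.

```python
import math

LATE_DAY_BUDGET = 4  # penalty-free late days per student, as stated in the policy


def late_days_charged(hours_late):
    """Round lateness up to whole days (assumption: a partial day counts as a full day)."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(hours_late_per_homework, budget=LATE_DAY_BUDGET):
    """Walk the homeworks in order and decide which ones keep their credit.

    Returns a list of (days_charged, gets_credit) pairs and the remaining budget.
    """
    remaining = budget
    results = []
    for hours in hours_late_per_homework:
        days = late_days_charged(hours)
        if days <= remaining:
            remaining -= days
            results.append((days, True))
        else:
            # Assumption: an overdrawing homework gets 0 credit and
            # leaves the remaining budget untouched.
            results.append((days, False))
    return results, remaining


if __name__ == "__main__":
    # Example (ii) from the policy: Homework 1 is 1 day late, Homework 2 is 2 days late.
    results, remaining = apply_late_policy([24, 48])
    print(results, "late days left:", remaining)
```

Under these assumptions, example (i) corresponds to `apply_late_policy([0, 0, 96])`, which uses the whole budget, so any later late submission would lose credit.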
Computational tools are essential for learning about, designing, and experimenting with deep learning models. At the top level, deep learning developers use one of the deep learning frameworks to build and run models, and these frameworks rely on many generic or custom software libraries. This course will use a particular version of the TensorFlow framework. The underlying programming language is Python, and the course will rely on Python code/libraries and Jupyter Notebooks for developing and experimenting with code. Students will be asked to run the tools and deep learning models on the Google Cloud Platform (GCP), and in some cases in Google Colab. For their personal use, students can use their own (local/personal) computers. Versioning of the tools is complex: (i) the instructors will provide a GCP image with preinstalled tools and (ii) students using personal computers need to follow the precise tool version instructions in order to have a functional computing environment. This course relies on Conda and pip for tool versioning.
For more complex assignments, students should run deep learning code on Google Compute Engine (GCE) virtual machine (VM) instances with GPU resources. The instructors will provide coupons (codes) which enable students to use Google Cloud resources for free. This tutorial provides instructions on how to set up a GCE/GCP VM instance using the custom VM image created by the instructors.
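Once a VM instance is running, it can be useful to confirm from a notebook or a Python shell that TensorFlow actually sees the GPU. The sketch below is one way to do that, assuming a TensorFlow 2.x installation where `tf.config.list_physical_devices` is available. The helper name `count_gpus` is hypothetical and is not part of the course material.

```python
def count_gpus(tf_module):
    """Return how many GPU devices the given TensorFlow module can see."""
    return len(tf_module.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment.")
    else:
        n = count_gpus(tf)
        print(f"TensorFlow {tf.__version__} sees {n} GPU(s).")
        if n == 0:
            print("No GPU visible: check the VM configuration and drivers.")
```

Passing the module in as an argument keeps the helper easy to exercise without a GPU, for example with a stand-in object in a test.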
This section is linked to the official Python tutorial. There are many other excellent resources that can be used as well. The course uses Python 3.7 and Anaconda 3 for programming assignments (TODO check for 2022 Fall). In the shared Google Cloud image, the instructors installed Python 3.7 (TODO check for 2022 Fall). If you choose to use a personal computer, you should use exactly the same versions of all tools as described in the section on GCE setup.
As of 2022, the course no longer supports local (PC/laptop) environments. All assignments must be done in GCP/GCE, and the rest of this section is provided for reference only. For assignment submissions, students must use the methodology specified by the assignments, which is (as of 2022 Fall) a combination of Google Colab and the Google Compute Engine (GCE) setup on the Google Cloud Platform (GCP). For their own personal use, students can use their personal/local computers to run Python code, Jupyter Notebooks, the TensorFlow framework and deep learning models. This is possible on computers both with and without graphics processing units (GPUs), although execution will be faster with a GPU. The beginning assignments in this course can be run on single-processor (non-GPU) computers. Later assignments and projects require machines with GPUs and must be run on the Google Cloud GPU machines. We will provide coupons for Google Cloud (GCP) access. Clicking on this section leads to instructions on how to install tools on personal computers. The instructors will have no time to debug local tool installations - the required computing platform for assignments is the Google Cloud Platform (GCP).
TensorFlow is an open-source deep learning framework developed by Google. It is the framework of choice for this course. Note that, as of 2020, this course uses version 2.2.0 (TODO update for 2022 Fall) of TensorFlow, although the TensorFlow homepage(s) may refer to a more recent version. For 2022 assignments, students have to use the course-prescribed versions of TensorFlow and Python. Students who wish to experiment with other versions of the tools are advised to create additional (conda) virtual environments to prevent conflicts with the tool versions used for the assignments.
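A quick way to check whether an environment matches the prescribed versions is sketched below. The version constants are the ones quoted on this page (Python 3.7, TensorFlow 2.2.0), both of which are marked TODO for 2022 Fall, so treat them as placeholders. The function names are hypothetical.

```python
import sys

# Versions quoted on this page; both carry a TODO for 2022 Fall,
# so they are placeholders rather than confirmed requirements.
REQUIRED_PYTHON = (3, 7)
REQUIRED_TENSORFLOW = "2.2.0"


def python_matches(version_info=None, required=REQUIRED_PYTHON):
    """Return True if the interpreter's major.minor pair equals the required pair."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


def tensorflow_matches(installed, required=REQUIRED_TENSORFLOW):
    """Compare version strings exactly; None means TensorFlow is not installed."""
    return installed is not None and installed == required


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


if __name__ == "__main__":
    tf_version = installed_tensorflow_version()
    print("Python", sys.version.split()[0],
          "OK" if python_matches() else "MISMATCH")
    print("TensorFlow", tf_version,
          "OK" if tensorflow_matches(tf_version) else "MISMATCH")
```

Running this inside each conda environment shows which one matches the assignment setup and which ones are experimental.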
The GCP VM instances used in this course are Linux-based. This tutorial offers hands-on instructions for basic Linux commands.
Homework assignments will use GitHub repositories and GitHub Classroom for distribution. Students need to be comfortable with GitHub, as well as with git commands or visual git clients such as GitKraken. For beginners, it is strongly recommended that you read the first 3 chapters of this website. Additional instructions for the use of assignment-related GitHub repositories will be provided during the course.
Colab provides a free cloud service based on Jupyter Notebooks. It offers free GPUs and popular deep learning libraries, subject to an execution/data-storage time limit. It is a good platform for developing and testing deep learning code, and it avoids the issues of tool installation. However, because of the time limitations, we do not recommend that you do the assignments on Colab - everything except Jupyter Notebooks would be automatically deleted every 12 hours. We will use Colab occasionally for quick sharing of Jupyter Notebooks and for running simple models, but we request that students set up their Google Cloud resources (GCE) to use as the primary computing environment. Personal (local) computing environments can be used as well, with careful management of tool versions.
Solutions to some common problems
ECBM E4040 Neural Networks and Deep Learning
Columbia University