Setting Up Your Environment

Here are some resources for more information about topics covered in this lesson, including where to find the Anaconda Individual Edition and how to use venv on Windows:

00:00 Before we get going, I want to make sure you’ve got your environment all set up for this course. There are a couple of ways that you can go about this if you’ve never used Jupyter before. From the terminal, what you could do is first create a virtual environment that will contain all of the modules that you need for the specific project. This is, in general, a good way to do things so that you’re not creating any conflicts between the modules that you need for another project and the modules that you need for this one.

00:29 You would use the virtual environment module in Python and create a folder inside your project folder which you can call venv or whichever name you want to give it.
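From a terminal, this step is typically `python3 -m venv venv`, run inside the project folder. As a minimal sketch of the same idea from Python code, the standard-library `venv` module can create the folder directly. The helper name `create_env` and the temporary demo directory are illustrative assumptions, not part of the course.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(project_dir: Path, name: str = "venv", with_pip: bool = True) -> Path:
    """Create a virtual environment folder inside project_dir and return its path."""
    env_dir = project_dir / name
    venv.create(
        env_dir,
        with_pip=with_pip,           # bootstrap pip, as the command-line tool does by default
        symlinks=(os.name != "nt"),  # mirror the command-line default on each platform
    )
    return env_dir


# Demo in a throwaway directory; with_pip=False keeps it quick and offline.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(Path(tmp), with_pip=False)
    # List what the new environment folder contains (for example, pyvenv.cfg).
    print(sorted(p.name for p in env_dir.iterdir()))
```

In a real project you would pass the project folder instead of a temporary directory and leave `with_pip` at its default so that pip is available inside the environment.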

00:42 This folder will contain a Python installation and any of the modules that we install for this course will be saved in this folder. Once you’ve created the virtual environment, you’ll go ahead and activate it with the second line and then use pip to install all of the modules that we need, which can be listed in a requirements.txt file.

01:07 And then once that completes, all you do is from the terminal type jupyter notebook, and that will fire up a Jupyter server and an instance of Jupyter in your browser.

01:19 Now, in the requirements.txt file, on each line you simply type the name of a module that you need. For this particular course we’re going to need numpy, scipy, matplotlib, pandas, and of course jupyter.

01:33 So just create a simple text file, write one of these module names on each line, and save that file in the project folder as well.
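The file can be typed by hand in any text editor. Purely as an illustration, here is a small sketch that writes the same one-name-per-line file from Python. The helper name `write_requirements` is hypothetical, and the temporary directory stands in for the project folder.

```python
import tempfile
from pathlib import Path
from typing import Sequence

# Module names listed in the lesson, one per line in the resulting file.
MODULES = ("numpy", "scipy", "matplotlib", "pandas", "jupyter")


def write_requirements(project_dir: Path, modules: Sequence[str] = MODULES) -> Path:
    """Write requirements.txt into project_dir with one module name per line."""
    path = project_dir / "requirements.txt"
    path.write_text("\n".join(modules) + "\n", encoding="utf-8")
    return path


# Demo in a throwaway directory standing in for the project folder.
with tempfile.TemporaryDirectory() as tmp:
    req = write_requirements(Path(tmp))
    print(req.read_text(encoding="utf-8"))
```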

01:43 So, that’s one way to get going with all of the modules that you need for the course. Another way is to use the Anaconda distribution. The Anaconda distribution is a Python distribution that comes pre-bundled with a whole bunch of tools and modules for data science.

02:02 There’s an open-source version that can be used by individuals, so you can head on over to anaconda.com and check out the Individual version. The Individual Edition comes pre-bundled with all of the modules that you would need for data science, like NumPy, pandas, SciPy, Matplotlib, and it also comes with Jupyter Notebook, JupyterLab, and a nice IDE called Spyder that I highly recommend.

02:29 Let me quickly go over this for you so that you have everything ready to go. I’ll start off first with the terminal version of getting everything installed and then I’ll go over Anaconda. I’m on a Mac machine, and so if you’re using Windows, the instructions that I’m going to give here are going to look a little bit different, so check out the video description on how to go about this if you’re on the Windows platform.

02:54 I’m at the terminal prompt and I’m located in my home directory in the gradebook project, and inside this folder, all I’ve got is the data files, which contain all of the CSV files, the quiz files, the homework and exam grades, and the roster CSV file.

03:13 I’m going to create a Python virtual environment so that I can install all of the modules that I need. I went ahead and I created this requirements.txt file and it contains all of the modules that I’ll need.

03:28 I also threw in ipython, which is an interactive REPL for Python. We’re going to need numpy, scipy, matplotlib, jupyter, and pandas.

03:38 What we need to do first is create a virtual environment, and so I’ll do that with the virtual environment module. I’m going to create a folder which I’ll call venv in the gradebook_project/ folder, and this will contain my Python installation, which I will later activate so that we can have a self-contained Python installation for this project. So, go ahead and run that.

04:06 It’ll take just a few seconds.

04:09 Now let’s activate this virtual environment, so we need to navigate to the bin/ directory, which contains the activate script. So now, any module that I install will be contained in this virtual environment folder that we created, which is contained now in the gradebook_project/ folder.
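On macOS and Linux, activation is typically done with `source venv/bin/activate`. On Windows the equivalent scripts live in a `Scripts` folder instead of `bin/`. The sketch below shows where the script sits on each platform and how a running Python can tell whether it is inside a virtual environment. The function names are illustrative, not from the course.

```python
import os
import sys
from pathlib import Path


def activate_script(env_dir: Path, platform: str = os.name) -> Path:
    """Return the location of the activate script for the given platform name."""
    if platform == "nt":
        # Windows: activate.bat is for cmd.exe; Activate.ps1 sits alongside it for PowerShell.
        return env_dir / "Scripts" / "activate.bat"
    # macOS and Linux: a shell script meant to be loaded with `source`.
    return env_dir / "bin" / "activate"


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print(activate_script(Path("venv")))
print("inside a virtual environment:", in_virtualenv())
```

Running the `in_virtualenv` check before and after activation is a quick way to confirm that the terminal is using the project's own Python.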

04:29 Now let’s use pip to install all of the modules that we listed in the requirements file. This will take probably just five, ten seconds, depending on your connection and CPU.

04:44 Go ahead and run that.

04:48 We’ll start downloading all of the modules and installing them in this virtual environment folder.
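The command for this step is typically `pip install -r requirements.txt`, run while the environment is active. A hedged sketch of the same call from Python follows. It runs pip through `sys.executable` so that packages land in whichever interpreter is currently running, and the helper names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def pip_install_command(requirements: Path) -> List[str]:
    """Build the pip command that installs every module listed in the file."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


def install_requirements(requirements: Path) -> None:
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(requirements), check=True)


# Show the command without running it (the install itself needs a network connection).
print(" ".join(pip_install_command(Path("requirements.txt"))))
```

Calling `install_requirements(Path("requirements.txt"))` from inside the activated environment would perform the actual download and install.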

04:57 All right, so that completed. Let’s clear that up. And now we have Jupyter installed, so let’s run that with jupyter notebook.
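Before launching Jupyter, one optional sanity check is to confirm that the installed modules can be found by the active interpreter. This is a minimal sketch, assuming the import names below match what was installed (the Jupyter Notebook application is importable as `notebook`). The helper `missing_modules` is illustrative.

```python
import importlib.util
from typing import Iterable, List

# Import names for the modules used in this course.
REQUIRED = ["numpy", "scipy", "matplotlib", "pandas", "notebook"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All course modules are available.")
```

If anything is reported as missing, the usual cause is that the virtual environment was not active when pip ran.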

05:09 Now we have a Jupyter server running, and so let’s go ahead and create a new Notebook using Python 3.

05:18 Let’s go ahead and save this

05:22 as the gradebook Notebook,

05:27 and then we’re all set to go.

05:30 If you’re not interested in setting everything up via the terminal, or if you prefer to have a complete data science Python installation, then I highly recommend the Anaconda distribution.

05:41 If you head on over to the Anaconda website, click on the Individual Edition, and if you scroll all the way down or click on the Download button, you’re taken to the different installers and depending on whether you’re on a Windows, Mac, or Linux machine, download the graphical installer—probably the easiest way to go—and then install Anaconda that way.

06:05 Once the installer completes, you’ll open up what’s called the Anaconda Navigator, which looks like this. And it comes, as I said, with a whole bunch of different packages and modules and tools, like, for example, JupyterLab, which is sort of the new version of the Jupyter Notebook.

06:25 And we’ve got, of course, Jupyter Notebook and Spyder. If you launch Jupyter Notebook that way, you’ll get again a server running and the Jupyter Notebook will open up in your browser.

06:38 So if you’re new to data science and Python in general, and maybe you have experience with other packages and other programs, then I recommend Anaconda and you’ll find that you’ve got most of the modules that you need to get going with Python and data science in general.

06:54 But once you launch the Jupyter Notebook, you’ll be all set to go, just like we had it set up with the terminal version. All right! Let’s get going and start coding.
