Working With Time Series

If you’re following along with this lesson and not using the provided Jupyter Notebook from this course’s supporting materials, you can copy-paste the following temp_c list:

Python
temp_c = [ 8.0,  7.1,  6.8,  6.4,  6.0,  5.4,  4.8,  5.0,
           9.1, 12.8, 15.3, 19.1, 21.2, 22.1, 22.4, 23.1,
          21.0, 17.9, 15.5, 14.4, 11.9, 11.0, 10.2,  9.1]

00:00 One of the main motivations for creating the pandas library was working with time-series data. To showcase some of the ways that you can work with time-series data in pandas, we’re going to create a pandas DataFrame using the hourly temperature data from a single day. To do this, let’s define a list, which we’ll call temp_c, and the data that I’m going to paste will be included in the video description.

00:27 This data contains temperature measurements taken at one-hour intervals over a 24-hour period in Celsius.

00:37 So, go ahead and run that. The DataFrame that we’re going to create is going to use a datetime index as its row index. And to do this, we’re going to use the date_range() function.

00:49 date_range() takes several keyword arguments. One of them is the start keyword argument, which can be used to specify the left bound of the date range.

01:01 Let’s go ahead and put the year 2019 and it’s going to be, say, October 27th. We’re going to be using the ISO 8601 datetime format: the year, the month, and then the day. And then for the hour, we’re going to start at 12:00 AM, which is midnight.

01:22 This format for the time is going to be the hours, the minutes, and the seconds, and all of this is in 24-hour time format. Then we’re going to want this date range to cover 24 hours, with an hourly frequency.

01:41 So this date range is going to serve as our index for the DataFrame that we’re going to construct. So why don’t we save this, say, in a variable dt.
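As a rough sketch of the call described above, the date range might be built like this. The transcript doesn't show the exact arguments, so the use of `periods=24` and the `"h"` frequency alias are assumptions. Recent pandas releases prefer the lowercase `"h"` alias and deprecate the uppercase `"H"`.

```python
import pandas as pd

# Assumed arguments: 24 hourly timestamps starting at midnight
# on October 27, 2019, written in ISO 8601 format.
dt = pd.date_range(start="2019-10-27 00:00:00", periods=24, freq="h")

# Inspect the first and last timestamps in the range.
print(dt[0])
print(dt[-1])
```

With 24 hourly periods starting at midnight, the last timestamp falls at 23:00:00 on the same day.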

01:49 Let’s go ahead and run that, and let’s take a look at the type of this date range object, and we see that what we get is a DatetimeIndex. This is one of the Index types that exists in pandas—similar, for example, to the RangeIndex that we’ve seen before. All right, so now that we have the index as a DatetimeIndex object and we also have the data, we can go ahead and create our pandas DataFrame.
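If you want to check the type yourself, something like the following should work. It rebuilds the same assumed date range so that the snippet runs on its own.

```python
import pandas as pd

# Same assumed date range as in the lesson: 24 hourly timestamps.
dt = pd.date_range(start="2019-10-27 00:00:00", periods=24, freq="h")

# date_range() returns a DatetimeIndex, which is a kind of pandas Index.
print(type(dt))
print(isinstance(dt, pd.DatetimeIndex))
print(isinstance(dt, pd.Index))
```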

02:16 We’re going to call this pandas DataFrame temp. Let’s call the constructor. The data is going to be this temperature data.

02:25 The name of the column will be, say, 'temp_c', and the data is in this list that we created. And then, lastly, the index is this DatetimeIndex that we just created.

02:37 Let’s go ahead and run that. Let’s take a look at this temperature DataFrame. We’ve got our column of temperatures in Celsius. And then again, the index is this DatetimeIndex. It starts off at 12:00 AM on October 27th, 2019, and runs all the way up to 11:00 PM on the same day.
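Putting the pieces together, here is a minimal sketch of how the temp DataFrame could be constructed. The variable names follow the lesson. The exact form of the constructor call (a dictionary passed as `data`) and the `date_range()` arguments are assumptions, since the transcript only describes them in words.

```python
import pandas as pd

# Hourly temperature readings in Celsius for a single day.
temp_c = [ 8.0,  7.1,  6.8,  6.4,  6.0,  5.4,  4.8,  5.0,
           9.1, 12.8, 15.3, 19.1, 21.2, 22.1, 22.4, 23.1,
          21.0, 17.9, 15.5, 14.4, 11.9, 11.0, 10.2,  9.1]

# Assumed arguments: one timestamp per reading, starting at midnight.
dt = pd.date_range(start="2019-10-27 00:00:00", periods=24, freq="h")

# One column named 'temp_c', with the DatetimeIndex as the row index.
temp = pd.DataFrame(data={"temp_c": temp_c}, index=dt)

print(temp.head())
```

Because the list and the index both hold 24 items, each temperature lines up with exactly one hourly timestamp.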

03:01 And that’s all there is to creating a DataFrame with time-series data and a datetime row index. In the next lesson, you’re going to see how you can conveniently apply slicing techniques to get just part of the information in a pandas DataFrame when you’re working with time-series data, and we’ll also take a look at the resampling method.
