Python Web Scraping
Learning Path ⋅ Skills: Web Scraping, HTTP Requests, Data Parsing
Web scraping is about downloading structured data from the Web, selecting some of that data, and passing along what you selected to another process. With this learning path, you’ll learn the core Python technologies and skills that you need to build your own web scraper.
Learning Path ⋅ 9 Resources
Laying the Foundation for Web Scraping
Before you jump into web scraping, it’s important to brush up on some foundational skills, like making HTTP requests and understanding HTML and CSS.
Python's urllib.request for HTTP Requests
In this tutorial, you'll be making HTTP requests with Python's built-in urllib.request. You'll try out examples and review common errors encountered, all while learning more about HTTP requests and Python in general.
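As a rough preview of what that looks like, here's a minimal sketch that uses only the standard library. The helper name fetch_text, the User-Agent string, and the sample URL are made up for illustration and don't come from the tutorial. The demo call uses a data: URL, which urllib.request resolves locally, so the snippet runs without a network connection.

```python
import urllib.request
from urllib.error import HTTPError, URLError


def fetch_text(url, timeout=10.0):
    """Return the decoded body of `url`, or None if the request fails.

    Hypothetical helper for illustration only.
    """
    request = urllib.request.Request(
        url,
        headers={"User-Agent": "learning-path-demo/0.1"},  # made-up client name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Fall back to UTF-8 when the response doesn't declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset)
    except HTTPError as error:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        print(f"Server returned HTTP {error.code} for {url}")
    except URLError as error:
        print(f"Could not reach {url}: {error.reason}")
    return None


# A data: URL is handled locally, so no real server is contacted here.
sample_body = fetch_text("data:text/plain;charset=utf-8,Hello%20scraper")
print(sample_body)
```

To try it against a real site, you'd pass an http:// or https:// URL instead. The two except branches cover the common failure modes: the server answering with an error status, and the server not being reachable at all.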
Making HTTP Requests With Python
The requests library is the de facto standard for making HTTP requests in Python. It abstracts the complexities of making requests behind a beautiful, simple API so that you can focus on interacting with services and consuming data in your application. This course shows you how to work effectively with requests, from start to finish.
HTML and CSS for Python Developers
There's no way around HTML and CSS when you want to build web apps. Even if you're not aiming to become a web developer, knowing the basics of HTML and CSS will help you understand the Web better. In this tutorial, you'll get an introduction to HTML and CSS for Python programmers.
Getting Started With Web Scraping
Now that you’ve learned some foundational skills, you’re ready to start web scraping!
Web Scraping in Python: Tools, Techniques, and Legality
Do you want to get started with web scraping using Python? Are you concerned about the potential legal implications? What are the tools required and what are some of the best practices? This week on the show we have Kimberly Fessel to discuss her excellent tutorial created for PyCon 2020 online titled "It's Officially Legal so Let's Scrape the Web."
Web Scraping With Beautiful Soup and Python
In this course, you'll walk through the main steps of the web scraping process. You'll learn how to write a script that uses Python's requests library to scrape data from a website. You'll also use Beautiful Soup to extract the specific pieces of information that you're interested in.
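To give you a feel for the extraction step, here's a small sketch. It deliberately swaps in the standard library's html.parser instead of requests and Beautiful Soup so that it runs without installing anything, which means it isn't the code from the course. The HTML snippet, the class names, and the TitleCollector class are all invented for this example.

```python
from html.parser import HTMLParser

# Made-up HTML standing in for a page you would normally download first.
SAMPLE_HTML = """
<html><body>
  <div class="job"><h2 class="title">Python Developer</h2><p class="company">Acme</p></div>
  <div class="job"><h2 class="title">Data Engineer</h2><p class="company">Globex</p></div>
</body></html>
"""


class TitleCollector(HTMLParser):
    """Collect the text inside every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h2" and "title" in classes:
            self._in_title = True
            self.titles.append("")

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False


collector = TitleCollector()
collector.feed(SAMPLE_HTML)
titles = [title.strip() for title in collector.titles]
print(titles)
```

The idea is the same whichever parser you use: identify the tag and attribute that mark the data you care about, then keep only the matching text. Beautiful Soup lets you express that selection in far fewer lines.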
A Practical Introduction to Web Scraping in Python
Learn all about web scraping in Python. You'll see how to parse data from websites and interact with HTML forms using tools such as Beautiful Soup and MechanicalSoup.
Handling Response Data
In web scraping, you end up with lots of response data. Next up, you’ll learn what to do with it.
Working With JSON Data in Python
Learn how to work with Python's built-in json module to serialize the data in your programs into JSON format. Then, you'll deserialize some JSON from an online API and convert it into Python objects.
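Here's a minimal sketch of that round trip with the json module. The dictionary and the JSON string are made-up sample data. In a real scraper, the string would typically be the body of an API response.

```python
import json

# Made-up record, as a scraper might build it.
scraped_item = {
    "title": "Python Developer",
    "tags": ["python", "web"],
    "remote": True,
    "salary": None,
}

# Serialize: Python objects become JSON text (True -> true, None -> null).
encoded = json.dumps(scraped_item, indent=2)
print(encoded)

# Deserialize: JSON text becomes dicts, lists, strings, and numbers.
decoded = json.loads('{"id": 7, "title": "Data Engineer", "skills": ["sql", "python"]}')
print(decoded["skills"][1])
```

Because json.loads() gives you ordinary dictionaries and lists, you can index into the result just like any other Python data structure.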
Reading and Writing CSV Files
This short course covers how to read and write data to CSV files using Python's built-in csv module and the pandas library. You'll learn how to handle standard and non-standard data, such as CSV files without headers or files containing delimiters in the data.
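The two non-standard cases mentioned above can be sketched with the csv module alone. The rows and the column names company and title are invented for this example, and an in-memory buffer stands in for a file on disk. The pandas approach isn't shown here.

```python
import csv
import io

# Made-up rows; the first value contains the delimiter itself.
rows = [
    ["Acme, Inc.", "Python Developer"],
    ["Globex", "Data Engineer"],
]

# Write to an in-memory buffer instead of a real file.
# newline="" lets the csv module control line endings itself.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows(rows)  # fields containing a comma get quoted automatically
csv_text = buffer.getvalue()
print(csv_text)

# The data has no header row, so the column names are supplied explicitly.
buffer.seek(0)
reader = csv.DictReader(buffer, fieldnames=["company", "title"])
records = list(reader)
print(records[0]["company"])
```

With a real file, you'd replace the buffer with open(path, newline="") and keep the rest the same.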
Automating Your Web Scraping Process
Finally, you’ll learn how to use a headless browser to automate the web scraping process.
Modern Web Automation With Python and Selenium
Your guide to learning advanced Python web automation techniques: Selenium, headless browsing, exporting scraped data to CSV, and wrapping your scraping code in a Python class.
Congratulations on completing this learning path! If you’d like to continue to develop your skills for interacting with web data, then check out the web scraping topic on Real Python.
Or maybe you’d like to explore different ways to organize and work with a variety of data. In that case, these learning paths have got you covered: