Commit

Rearrange lesson 1 projects to match slide order
andrewbaxter authored Jan 30, 2019
1 parent dd713ce commit c9932af
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions unit1/README.md
@@ -22,16 +22,16 @@ This unit covers the basics of web scraping with a special focus on data extract

## Hands-on

-#### 1. Reddit spider
-Build a spider to extract `title`, `link`, `username`, `user_url`, `score` and `time` from each submission in the front page of reddit's [/r/programming](http://reddit.com/r/programming) and [/r/python](http://reddit.com/r/python).
-
-[Check out the spider **once you're done**.](spiders/spider_5_reddit.py)
-
-#### 2. Books spider
+#### 1. Books spider
Build a spider for [books.toscrape.com](http://books.toscrape.com) that extracts `title`, `rating`, `price`, `stock` and `category` from the URLs listed in [this file](spiders/urls.txt) (it can be stored locally alongside your spider).

[Check out the spider **once you're done**.](spiders/spider_6_books.py)

+#### 2. Reddit spider
+Build a spider to extract `title`, `link`, `username`, `user_url`, `score` and `time` from each submission in the front page of reddit's [/r/programming](http://reddit.com/r/programming) and [/r/python](http://reddit.com/r/python).
+
+[Check out the spider **once you're done**.](spiders/spider_5_reddit.py)
+
## References
* [Scrapy Tutorial](https://doc.scrapy.org/en/latest/intro/tutorial.html)
* [Parsel (the extraction library behind Scrapy) documentation](https://parsel.readthedocs.io/en/latest/usage.html#getting-started)
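
The Books exercise reads its target pages from a local `urls.txt` stored next to the spider. As a minimal sketch of that step only (not the repository's reference spider), the helper below loads such a file into a list of URLs using just the standard library. The function name `load_start_urls` and the handling of blank lines and `#` comment lines are assumptions made for illustration; the format of the actual `urls.txt` is not shown in this commit.

```python
from pathlib import Path


def load_start_urls(path):
    """Return the URLs listed one per line in the file at `path`.

    Hypothetical helper: skipping blank lines and '#' comment lines
    is an assumption, not something the exercise specifies.
    """
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Ignore empty lines and lines treated here as comments.
        if line and not line.startswith("#"):
            urls.append(line)
    return urls
```

A spider could then use the returned list as its starting URLs, for example by resolving the file relative to the spider module with `Path(__file__).resolve().parent / "urls.txt"` so that it is found regardless of the working directory.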