Commit

Merge pull request #1018 from honzajavorek/honzajavorek/better-video-…
…text

fix: mention the video is outdated, but explain it's still valuable
honzajavorek authored May 22, 2024
2 parents eb6ef79 + 3eddcda commit 851f6b5
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion sources/academy/webscraping/anti_scraping/index.md
@@ -93,10 +93,16 @@ A common workflow of a website after it has detected a bot goes as follows:

One thing to keep in mind while navigating through this course is that advanced scraping methods are able to identify non-humans not only by one value (such as a single header value, or IP address), but are able to identify them through more complex things such as header combinations.

A conference talk by [Ondra Urban](https://github.com/mnmkng) will guide you through various anti-scraping measures and how to get around them.
Watch a conference talk by [Ondra Urban](https://github.com/mnmkng), which provides an overview of various anti-scraping measures and tactics for circumventing them.

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/aXil0K-M-Vs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

:::info Several years old?

Although the talk, given in 2021, features some outdated code examples, it still serves well as a general overview.

:::

## Common anti-scraping measures {#common-measures}

Because we here at Apify scrape for a living, we have discovered many popular and niche anti-scraping techniques. We've compiled them into a short and comprehensible list here to help understand the roadblocks before this course teaches you how to get around them.