
Lists syntactic patterns of HTTP user-agents used by bots/robots/crawlers/spiders (pull requests welcome). If you like it, star it ⭐⭐⭐


genderkit/crawler-user-agents

 
 


The GitHub repository crawler-user-agents contains a list of HTTP user-agents used by robots/crawlers/spiders. I regularly maintain this list based on my own logs, and I welcome additions contributed as pull requests.

The pull requests should:

  • contain a single addition
  • specify a relevant, discriminating syntactic fragment (for example "totobot" and not "Mozilla/5 totobot v20131212.alpha1")
  • contain the pattern (generic regular expression), the discovery date (year/month/day) and the official URL of the robot
  • result in a valid JSON file (don't forget the comma between items)
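
The requirements above can be checked before opening a pull request. The following is a minimal sketch, not a tool shipped with this repository: it assumes the list is stored as a JSON array of objects with the keys "pattern", "addition_date" and "url", and the function names validate_entry and validate_json_text are hypothetical.

```python
import json
import re
from datetime import datetime

# Keys taken from the example entry; treating all three as mandatory is an assumption.
REQUIRED_KEYS = ("pattern", "addition_date", "url")


def validate_entry(entry):
    """Return a list of problems found in one entry (empty if none were found)."""
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]
    problems = []
    for key in REQUIRED_KEYS:
        value = entry.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or empty field: {key}")
    if problems:
        return problems
    # The pattern must compile as a regular expression.
    try:
        re.compile(entry["pattern"])
    except re.error as exc:
        problems.append(f"invalid regular expression: {exc}")
    # The discovery date is expected as year/month/day.
    try:
        datetime.strptime(entry["addition_date"], "%Y/%m/%d")
    except ValueError:
        problems.append("addition_date is not in year/month/day form")
    # Only a loose sanity check on the URL; this is an assumption, not a rule of the list.
    if not entry["url"].startswith(("http://", "https://")):
        problems.append("url does not look like an http(s) URL")
    return problems


def validate_json_text(text):
    """Parse the whole file and validate every entry in it."""
    try:
        entries = json.loads(text)
    except json.JSONDecodeError as exc:
        # A forgotten comma between items ends up here.
        return [f"invalid JSON: {exc}"]
    if not isinstance(entries, list):
        return ["top-level JSON value is not an array"]
    problems = []
    for index, entry in enumerate(entries):
        problems.extend(f"entry {index}: {p}" for p in validate_entry(entry))
    return problems
```

An empty result means no problem was detected by these checks; it does not guarantee that the pattern is discriminating enough.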

Example:

{
  "pattern": "rogerbot",
  "addition_date": "2014/02/28",
  "url": "http://moz.com/help/pro/what-is-rogerbot-"
}
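
As an illustration of how such patterns might be used, the sketch below tests a user-agent string against a list of entries. It is an assumption about usage rather than code from this repository: the helper names compile_patterns and is_crawler are hypothetical, the "totobot" entry is made up from the fragment mentioned above, and matching is case-sensitive here because the list does not say otherwise.

```python
import re

# The first entry is the example above; the second is hypothetical and carries only a pattern.
ENTRIES = [
    {
        "pattern": "rogerbot",
        "addition_date": "2014/02/28",
        "url": "http://moz.com/help/pro/what-is-rogerbot-",
    },
    {"pattern": "totobot"},
]


def compile_patterns(entries):
    """Compile the 'pattern' field of every entry into a regular expression."""
    return [re.compile(entry["pattern"]) for entry in entries]


def is_crawler(user_agent, compiled_patterns):
    """Return True if any pattern occurs anywhere in the user-agent string."""
    return any(p.search(user_agent) for p in compiled_patterns)


PATTERNS = compile_patterns(ENTRIES)
print(is_crawler("Mozilla/5 totobot v20131212.alpha1", PATTERNS))
print(is_crawler("Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0", PATTERNS))
```

Using search rather than a full match is what lets a short fragment such as "totobot" identify a longer user-agent string.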

The list is under an MIT License. The versions prior to Nov 7, 2016 were under a CC-SA license.

--Martin
