This repository has been archived by the owner on Dec 1, 2024. It is now read-only.
crawl / scrape GitHub topics #53
shinenelson
started this conversation in
General
Replies: 1 comment
-
This could be converted to a technical issue if it is relevant.
There are 33 public repositories with the terms-of-service topic and 84 public repositories with the privacy-policy topic. These include repositories from companies like GitHub, Basecamp, and Unity-Technologies, among many others.
Since these are version-controlled repositories of plain-text files (mostly Markdown), it may be a lot easier to track changes and source the policy documents verbatim from them than to scrape the documents off a website and post-process the result.
I understand this might take some extra technical effort. I am proposing it because I was surprised that Basecamp's Terms of Service had not been annotated yet, even though their policies are in a public repository in Markdown format.
What would it take for this to get done?
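As a rough starting point, here is a minimal sketch of how the repositories under a topic could be listed. It assumes the GitHub REST search endpoint (`/search/repositories` with the `topic:` qualifier). The function names and the `User-Agent` string are my own placeholders, not anything from this project. Unauthenticated search requests are rate limited, so a real crawler would probably need a token and pagination.

```python
import json
import urllib.parse
import urllib.request

# GitHub REST API endpoint for repository search.
SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(topic, page=1, per_page=100):
    """Build the search URL for all public repositories tagged with `topic`."""
    query = urllib.parse.urlencode(
        {"q": f"topic:{topic}", "page": page, "per_page": per_page}
    )
    return f"{SEARCH_ENDPOINT}?{query}"


def extract_repos(payload):
    """Keep only the fields a crawler would need from a search response."""
    return [
        {
            "full_name": item["full_name"],
            "clone_url": item["clone_url"],
            "default_branch": item.get("default_branch"),
        }
        for item in payload.get("items", [])
    ]


def fetch_topic_repos(topic, page=1):
    """Fetch one page of results (hypothetical helper; performs a network call)."""
    request = urllib.request.Request(
        build_search_url(topic, page),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "topic-crawler-sketch",  # placeholder value
        },
    )
    with urllib.request.urlopen(request) as response:
        return extract_repos(json.load(response))
```

Calling `fetch_topic_repos("terms-of-service")` or `fetch_topic_repos("privacy-policy")` would return one page of candidates. Finding the actual policy file inside each repository, and tracking its history, would still need per-repository handling.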