You can star repositories to keep track of projects you find interesting. I scraped the top starred repositories on GitHub across different topics, using Python and BeautifulSoup. The main motivation behind this data is to analyze the top starred GitHub repositories.
I selected topics such as Data-Science, Machine-Learning, and Computer-Vision, then collected details for the 100 most starred repositories in each topic, including commits, issues, and forks.
The dataset contains information on more than 1500 repositories.
The data contains 19 main columns:
- topic: the base keyword used to fetch the repository.
- name: repository name.
- user: user name of the repository owner.
- star: number of stars given by users.
- fork: number of forks of the repository.
- watch: number of watchers of the repository.
- issue: number of issues in the repository.
- pull_requests: number of pull requests.
- projects: number of projects under that topic_tag.
- topic_tag: tags added to the repository by the user.
- discription_text: short description added by the user.
- discription_url: additional URL provided by the repository.
- commits: number of commits to the repository.
- branches: number of branches in the repository.
- packages: number of packages.
- releases: number of releases of the repository.
- contributors: number of users who have contributed to the repository.
- License: name of the license.
- url: URL of the repository.
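
A minimal sketch of how a file with these columns might be read, using only the Python standard library. The sample rows, the repository and user names in them, and the `parse_count` and `load_rows` helpers are made up for illustration. The assumption that counts may appear as abbreviated strings such as `1.2k` is mine and is not stated in this description, so adjust it to whatever the real file contains.

```python
import csv
import io

# Toy stand-in for the real CSV file; every value here is invented.
SAMPLE_CSV = """topic,name,user,star,fork,watch
Data-Science,example-repo-a,example-user-a,1.2k,300,45
javascript,example-repo-b,example-user-b,15k,2.5k,800
"""


def parse_count(text):
    """Convert '300', '1,234' or '1.2k' to an int (assumed input formats)."""
    cleaned = text.strip().lower().replace(",", "")
    if cleaned.endswith("k"):
        return int(round(float(cleaned[:-1]) * 1000))
    return int(cleaned) if cleaned else 0


def load_rows(handle, numeric_columns=("star", "fork", "watch")):
    """Read rows as dicts and convert the chosen columns to integers."""
    rows = []
    for row in csv.DictReader(handle):
        for column in numeric_columns:
            row[column] = parse_count(row[column])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["topic"], row["name"], row["star"], row["fork"], row["watch"])
```

To read a real file, the `io.StringIO` object would be replaced by an opened file handle, and `numeric_columns` extended with the other count columns listed above.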
Current repository topics: Data-Science, Machine-Learning, Open-CV, Computer-Vision, GAN, variational-encoder, Android-studio, flutter, JAVA, awesome, javascript, c++
Stay tuned for more topics.
📘 Kaggle Kernel click here
- The machine-learning and deeplearning tags are each used more than 200 times.
- The javascript topic has the most starred repositories (4M+ stars in total).
- Although machine-learning is the most used tag, machine-learning repositories are not starred as much as javascript repositories.
- Android-studio, opencv, sensor, and variational-encoder are the topics with very low accumulated star totals.
- Data-science and computer-vision are hot topics these days, but data-science repositories are not starred as much as those of other topics.
- Repository stars and forks follow a similar pattern.
- javascript repositories are both the most starred and the most forked.
- Star and watch are the most strongly correlated pair (0.9).
- Star and fork are less correlated than the other two pairs.
- (star & watch) > (fork & watch) > (star & fork)
- The Raspberry-pi topic definitely has fewer stars and forks, but it wins in commits.
- javascript wins again.
- The Data-Science and c++ topics move up here; they are not in 2nd and 3rd place in the star and fork rankings.
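
The comparisons in these notes (totals per topic, and the pairwise correlation of star, fork, and watch) could be reproduced along the following lines. This is a sketch on invented toy numbers, not the dataset itself. `total_by_topic` and `pearson` are hypothetical helper names, and the rows are assumed to already hold integer counts.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Invented toy rows; real rows would come from the scraped dataset.
rows = [
    {"topic": "javascript", "star": 900, "fork": 300, "watch": 90},
    {"topic": "javascript", "star": 700, "fork": 100, "watch": 75},
    {"topic": "Data-Science", "star": 400, "fork": 200, "watch": 35},
    {"topic": "c++", "star": 200, "fork": 150, "watch": 25},
]


def total_by_topic(rows, column):
    """Sum one numeric column per topic, largest total first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["topic"]] += row[column]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


# Rank topics by accumulated stars.
for topic, total in total_by_topic(rows, "star"):
    print(topic, total)

# Compare each pair of columns, as in the star/fork/watch notes.
for left, right in combinations(("star", "fork", "watch"), 2):
    r = pearson([row[left] for row in rows], [row[right] for row in rows])
    print(left, right, round(r, 2))
```

Passing `"fork"` or `"commits"` to `total_by_topic` would give the corresponding rankings, provided those columns are present in the rows.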