This project is a web crawler. Given a keyword (e.g. SAP), it searches 104 human resource bank for open positions and the companies hiring for them, then stores the positions and the companies in two separate .csv files.
- python web_crawler_final.py
- python company_count.py
- web_crawler_final.py
The program loads the HTML of the search results and parses it with the Beautiful Soup package. It extracts the data for each SAP position, then visits each hiring company's web page to collect the company's data.
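As a rough illustration of the parsing step, here is a minimal sketch that uses Beautiful Soup on an inline HTML string. The markup, the class names (`job`, `job-title`, `company`) and the function `extract_positions` are all hypothetical; the real 104 pages have their own structure, and the actual script may work differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real site uses its own tags and classes.
SAMPLE_HTML = """
<div class="job">
  <a class="job-title" href="/job/1">SAP Consultant</a>
  <a class="company" href="/company/a">Company A</a>
</div>
<div class="job">
  <a class="job-title" href="/job/2">SAP Developer</a>
  <a class="company" href="/company/b">Company B</a>
</div>
"""


def extract_positions(html):
    """Return one dict per job block found in the given HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    positions = []
    for job in soup.find_all("div", class_="job"):
        title = job.find("a", class_="job-title")
        company = job.find("a", class_="company")
        positions.append({
            "title": title.get_text(strip=True),
            "company": company.get_text(strip=True),
            # The company link is what a crawler would follow to get company data.
            "company_url": company["href"],
        })
    return positions


positions = extract_positions(SAMPLE_HTML)
for position in positions:
    print(position["title"], "|", position["company"], "|", position["company_url"])
```

In a real run the HTML would come from an HTTP response rather than a string, and each `company_url` would be fetched and parsed in the same way.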
- company_count.py
This step is optional. It ranks the companies by the number of open positions listed in the .csv file.
Note: save the .csv file in CSV UTF-8 format, otherwise the script will not work.
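The ranking can be sketched with the standard library alone. This is only an assumption about how such a count might be done, not the contents of company_count.py; the column name `company`, the function `rank_companies` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def rank_companies(csv_file, column="company"):
    """Count rows per company and return (company, count) pairs, most openings first."""
    reader = csv.DictReader(csv_file)
    counts = Counter(row[column] for row in reader)
    return counts.most_common()


# Hypothetical sample data standing in for the positions .csv file.
SAMPLE_CSV = (
    "title,company\n"
    "SAP Consultant,Company A\n"
    "SAP Developer,Company B\n"
    "SAP Analyst,Company A\n"
)

ranking = rank_companies(io.StringIO(SAMPLE_CSV))
for company, count in ranking:
    print(company, count)
```

To read a real file, pass `open(path, newline="", encoding="utf-8-sig")` instead of the `StringIO` object. The `utf-8-sig` codec skips the byte order mark that spreadsheet programs typically write when saving as CSV UTF-8.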
Ethan Wang