Centralized data analysis platform for the Cornell Innovation and Entrepreneurship Lab. This repository contains scripts for data collection, data cleaning, and data analysis.
- Python 3.9
- pip
- virtualenv
- Cornell Email
- Clone the repository
git clone
- Create a virtual environment
virtualenv venv
- Activate the virtual environment
source venv/bin/activate
- Change into the server directory
cd server
- Install the dependencies
pip install -r requirements.txt
- Create a .env file in the server directory
touch .env
- Add the following environment variables to the .env file
export CORNELL_NETID="your_cornell_netid"
export CORNELL_PASSWORD="your_cornell_password"
export CAPITAL_IQ_USERNAME="your_capital_iq_username"
export CAPITAL_IQ_PASSWORD="your_capital_iq_password"
- Source the .env file
source .env
- Run the server
python app.py
- Open a new terminal window and change into the client directory
cd cornell-data
- Install the dependencies
npm install
- Run the client
npm start
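If the server fails at startup, one thing worth ruling out is that the variables from the .env step never reached the Python process. The following is a minimal sketch of such a check, not part of this repository: the helper name `missing_env_vars` is hypothetical, and only the four variable names are taken from the steps above.

```python
import os

# Variable names are taken from the .env step in this README.
REQUIRED_VARS = [
    "CORNELL_NETID",
    "CORNELL_PASSWORD",
    "CAPITAL_IQ_USERNAME",
    "CAPITAL_IQ_PASSWORD",
]


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `environ` defaults to the
    current process environment and can be replaced with a plain dict.
    """
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Calling `missing_env_vars()` in the same terminal where `source .env` was run should return an empty list; any names it does return are the ones that were not exported.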
The platform can be used to collect company data in the following ways:
- Collect data for a list of companies from the Capital IQ, Mergent Intellect, or Guidestar websites, individually.
cd scraping
python index.py --source
- Collect data for a list of companies from the Capital IQ, Mergent Intellect, or Guidestar websites, in bulk.
python index.py
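This README does not show how `index.py` interprets `--source`, so the following is only an illustrative sketch of how an optional source selector could be parsed with `argparse`. The source identifiers, the helper names, and the rule that omitting the flag selects every source are all assumptions, not the script's documented behavior.

```python
import argparse

# Hypothetical identifiers; the values index.py really accepts may differ.
SOURCES = ["capital_iq", "mergent_intellect", "guidestar"]


def build_parser():
    """Build a parser with a single optional --source argument."""
    parser = argparse.ArgumentParser(
        description="Collect company data (illustrative sketch)."
    )
    parser.add_argument(
        "--source",
        choices=SOURCES,
        default=None,
        help="Collect from one source only; omit to use every source.",
    )
    return parser


def sources_to_run(argv):
    """Return the list of sources selected by the given argument list."""
    args = build_parser().parse_args(argv)
    # Assumption: no --source means all sources are used.
    return [args.source] if args.source else list(SOURCES)
```

Under these assumptions, `sources_to_run(["--source", "guidestar"])` selects only Guidestar, while `sources_to_run([])` selects all three sources.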