- pandas
- numpy
- tensorflow
- scikit-learn (imported as sklearn)
- statsmodels
- xgboost
- matplotlib
- Jupyter
- Python 3
- The project is an IPython notebook and can be run directly in Jupyter or Colab: just run the Project.ipynb file.
- To run it from the command line, first convert the notebook to a Python script and then run the script, using the commands below:
jupyter nbconvert --to python Project.ipynb
python3 Project.py
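
Before running the notebook or the converted script, it can help to confirm that the directories and files this README names are in place. The following is a minimal sketch, not part of the submitted project: the function name `missing_items` is hypothetical, and it assumes the check is run from (or pointed at) the directory that holds Project.ipynb.

```python
from pathlib import Path

# Names taken from this README; the helper itself is a hypothetical addition.
REQUIRED_DIRS = ("Historical Data", "Processed Data")
REQUIRED_FILES = ("Project.ipynb", "layoffs.csv")


def missing_items(project_root):
    """Return the required directories and files absent from project_root."""
    root = Path(project_root)
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    problems = missing_items(".")
    if problems:
        print("Missing from project directory:", ", ".join(problems))
    else:
        print("Layout looks complete.")
```

An empty list means all four items were found side by side in the same directory.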
- The 'Historical Data' directory, the 'Processed Data' directory, 'Project.ipynb', and 'layoffs.csv' must all be in the same directory (the project directory; named 'Layoffs' in my case).
- The required datasets are the 'Historical Data' directory (historical data for all companies, as CSV files) and 'layoffs.csv', both in the 'Project' directory. Both are included in the submitted zip file.
- A directory named 'Processed Data' must exist in the 'Project' directory to store the preprocessed data. It is also included in the submission; the CSV files inside it are replaced every time the code is run for a particular company.
- In the 2nd cell of the notebook, assign the company name as a string to company_name. The default is 'Cisco'.
- 'Cisco', 'IBM', and 'Salesforce' are some of the available companies. The full list of companies is in the 3rd cell.
- In the 2nd cell of the notebook, set the boolean variable close_price_as_predictive_column to True to use close price as the predictive (output) column, or to False to use direction change. The default is False.
- The 4th cell runs Granger causality tests between layoffs and close price and displays the results.
- The 6th cell tests an XGBoost regression model and displays the RMSE and R2 score. For this regression model the output column is always close price, irrespective of close_price_as_predictive_column.
- The 7th cell prints the preprocessed data for the selected company.
- The 11th cell trains the LSTM model.
- The 13th cell uses the trained LSTM model to predict the output column for the test data and displays the mean squared error.
- The 14th (last) cell prints the classification report (precision, recall, and accuracy) if the output column is 'direction_change'. If the output column is close price, it plots the actual and predicted stock prices for the test data.
- Project.pdf is also submitted; it contains screenshots of all cells and their outputs, run with the default values.