A machine learning approach to investment portfolio composition. The program analyzes the fundamentals of the listed companies on the S&P1500 in order to emit monthly buy signals.


pieroliviermarquis/SP1500stockPicker

 
 


SP1500stockPicker

This project uses a random forest algorithm to emit monthly buy signals from the fundamentals of the companies listed on the S&P 1500.
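As a rough illustration of the idea, here is a minimal sketch of a random forest that turns fundamentals into buy signals. It is not the notebook's code: the feature names (`pe_ratio`, `debt_to_equity`, `roe`), the synthetic data, the label definition and the 0.6 threshold are all assumptions made for this example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for fundamentals: one row per (company, month).
# Column names and distributions are hypothetical.
rng = np.random.RandomState(0)
n = 600
features = pd.DataFrame({
    "pe_ratio": rng.normal(15, 5, n),
    "debt_to_equity": rng.normal(1.0, 0.3, n),
    "roe": rng.normal(0.10, 0.05, n),
})

# Hypothetical label: 1 if the stock beat the benchmark the following month.
# It is loosely tied to ROE here so the model has something to learn.
beat_benchmark = (features["roe"] + rng.normal(0, 0.03, n) > 0.10).astype(int)

# Train on the first 500 rows and score the remaining 100.
train, test = slice(0, 500), slice(500, n)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(features.iloc[train], beat_benchmark.iloc[train])

# Probability of class 1 for each held-out row; a buy signal is
# emitted when that probability exceeds an (assumed) threshold.
proba = model.predict_proba(features.iloc[test])[:, 1]
buy_signal = proba > 0.6
print(int(buy_signal.sum()), "buy signals out of", len(buy_signal))
```

With real data, the split would normally be chronological so that the model is never trained on months that come after the ones it is evaluated on.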

Objective

The objective of this project is to beat the performance of the Standard & Poor's 500 (S&P 500), the most commonly used benchmark in the finance industry.
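One simple way to check whether a strategy beats the benchmark is to compound the monthly returns of both and compare the totals. The sketch below shows that arithmetic only; the return figures are made up for illustration and the function name is hypothetical.

```python
def cumulative_return(monthly_returns):
    """Compound a sequence of simple monthly returns into one total return."""
    total = 1.0
    for r in monthly_returns:
        total *= 1.0 + r
    return total - 1.0


# Hypothetical monthly returns, for illustration only.
portfolio = [0.02, -0.01, 0.03]
benchmark = [0.01, 0.00, 0.02]

# A positive excess means the portfolio outperformed over the period.
excess = cumulative_return(portfolio) - cumulative_return(benchmark)
print(f"excess return over benchmark: {excess:.4f}")
```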

Getting Started

All the required code is in the SP1500stockPicker.ipynb file. You can run it with Jupyter Notebook, which is included in the Anaconda distribution available at https://www.anaconda.com/distribution/#download-section. The code runs on Python 3.7.

The data used in the program was obtained from Bloomberg and Compustat. Both services are proprietary, so the data cannot be published online. The program uses the following files:

  • ratios_1990_2019.csv (File containing all the financial ratios of the US companies, obtained from Compustat)
  • yield_1962_2019.csv (File containing all the monthly returns of the US companies, obtained from Bloomberg)
  • SP1500constituents.csv (File containing all the past and present constituents of the S&P 1500 index, obtained from Compustat)
  • ^GSPC.csv (File containing all the closing prices of the S&P 500, obtained from Yahoo Finance)
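Since the real files cannot be shared, the sketch below uses small in-memory tables as stand-ins to show how ratios, returns and index constituents might be combined with pandas. Every column name (`ticker`, `month`, `pe_ratio`, `monthly_return`) is an assumption; the actual CSV layouts may differ.

```python
import pandas as pd

# In-memory stand-ins for the proprietary CSV files.
# All column names and values are hypothetical.
ratios = pd.DataFrame({
    "ticker": ["AAA", "AAA", "BBB", "CCC"],
    "month": ["2019-01", "2019-02", "2019-01", "2019-01"],
    "pe_ratio": [12.0, 13.5, 20.1, 8.7],
})
returns = pd.DataFrame({
    "ticker": ["AAA", "AAA", "BBB", "CCC"],
    "month": ["2019-01", "2019-02", "2019-01", "2019-01"],
    "monthly_return": [0.02, -0.01, 0.04, 0.00],
})
constituents = pd.DataFrame({"ticker": ["AAA", "BBB"]})

# Attach each month's return to the matching row of ratios.
merged = ratios.merge(returns, on=["ticker", "month"], how="inner")

# Keep only companies that appear in the constituents list.
in_index = merged[merged["ticker"].isin(constituents["ticker"])]
in_index = in_index.reset_index(drop=True)
print(in_index)
```

This simplification treats index membership as fixed. Because the constituents file covers past and present members, a fuller version would presumably also match on the dates during which each company belonged to the index.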

Requirements

  • python==3.7

  • scikit-learn==0.21.2

  • pandas==0.24.2

  • matplotlib==3.1.1

  • numpy==1.16.4

Running

The critical functions of the program (database merging and machine learning) use multiprocessing to accelerate the different tasks. The entire notebook takes about 2.5 hours to run on a 24-thread computer with 128 GB of RAM. Performance will therefore vary depending on your hardware.
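The general pattern of splitting work into chunks and handing them to a pool of worker processes can be sketched as follows. This is an illustration with Python's standard `multiprocessing.Pool`, not the notebook's implementation; the function names and the sum-of-squares workload are placeholders.

```python
from multiprocessing import Pool


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal pieces."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Stand-in for an expensive per-chunk task (hypothetical workload)."""
    return sum(x * x for x in chunk)


def run_parallel(items, workers):
    """Apply process_chunk to each chunk, using a process pool if workers > 1."""
    chunks = split_into_chunks(items, max(1, workers))
    if workers <= 1:
        # Sequential fallback, useful for debugging.
        return [process_chunk(c) for c in chunks]
    with Pool(processes=workers) as pool:
        return pool.map(process_chunk, chunks)
```

A call such as `run_parallel(list(range(10)), workers=4)` returns one partial result per chunk. When run as a script on Windows or macOS, the call should sit under an `if __name__ == "__main__":` guard so that worker processes do not re-execute it on import.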

In Jupyter, hit "Run All Cells" to execute the entire script. The notebook includes intermediate steps after the data cleaning and feature selection stages, so you can skip unnecessary computation when changing components of the code.
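One common way to implement this kind of intermediate step is to save a stage's result to disk and reload it on later runs. The sketch below shows that pattern with `pickle`; the helper name `load_or_compute`, the file name and the placeholder cleaning function are assumptions, and the notebook may store its intermediate results differently.

```python
import os
import pickle
import tempfile


def load_or_compute(path, compute):
    """Return the object cached at `path`, computing and saving it if missing."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records how many times the slow step actually ran


def expensive_cleaning():
    """Stand-in for a slow data-cleaning step (hypothetical)."""
    calls.append(1)
    return {"rows_kept": 3}


cache_path = os.path.join(tempfile.mkdtemp(), "cleaned.pkl")
first = load_or_compute(cache_path, expensive_cleaning)   # computes and saves
second = load_or_compute(cache_path, expensive_cleaning)  # loads from disk
print(first == second, len(calls))
```

Deleting the cached file forces the step to run again, which is needed whenever the code that produces it changes.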

Built With

Contributing

Feel free to reach out to me by email at [email protected] for suggestions or questions about the repo.

Authors

  • Thomas Rochefort-Beaudoin - Initial work

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE.md file for details.

Acknowledgments

  • Robert Normand, Teacher at Polytechnique Montréal and researcher at CIRANO Montréal, for his insights and help.
