This project contains a Jupyter Notebook for calculating various metrics from horse racing data.
My goal is to use the data available in horse racing to find trends. I'm looking for other individuals like me who are just learning Python and are learning how to navigate GitHub.
In the real world, I'm a racehorse breeder, owner, and trainer. No, I am not rich. I'm a struggling artist, just like a musician, an actor, or a philosophy major. You can buy a horse to race fairly inexpensively, but the upkeep of a horse is pretty expensive. Many people own horses as pets; I just race mine.
That said, I'm recovering from a leg injury and have used my downtime to learn Python and GitHub. I love data and I love horse racing, and I'd like to combine the two.
Few other sports have accumulated as much data as horse racing. Thoroughbred horse racing began 300 years ago, and the sport has been accumulating data ever since. My first goal is to use Python to put this data into a format that can be manipulated and examined. I believe this process is called data mining in the field of data science.
Equibase is the website I go to for my horse racing data. They have provided me with a file that has all the data for every racetrack in 2023. The data is in XML format. I have used Jupyter Notebook to try to develop a program that reads this data and puts it into a spreadsheet or "dataframe" format. I have only processed one file so far, as an experiment.
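To give a rough idea of what "reading XML into rows" can look like, here is a minimal sketch using only Python's standard library. It is not the code from my notebook, and the tag names (`Chart`, `Race`, `Entry`, `Horse`, `Finish`) and the sample values are made up for illustration; the real Equibase files use their own layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data. The real Equibase files use different tag names.
SAMPLE_XML = """
<Chart>
  <Race number="1" track="XYZ">
    <Entry><Horse>Alpha</Horse><Finish>1</Finish></Entry>
    <Entry><Horse>Bravo</Horse><Finish>2</Finish></Entry>
  </Race>
  <Race number="2" track="XYZ">
    <Entry><Horse>Charlie</Horse><Finish>1</Finish></Entry>
  </Race>
</Chart>
"""


def xml_to_rows(xml_text):
    """Flatten every <Entry> into one dict, copying in its race's attributes."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for race in root.iter("Race"):
        for entry in race.iter("Entry"):
            rows.append({
                "track": race.get("track"),
                "race_number": int(race.get("number")),
                "horse": entry.findtext("Horse"),
                "finish": int(entry.findtext("Finish")),
            })
    return rows


rows = xml_to_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

Each dict is one row of the eventual table. If pandas is installed, passing the list to `pandas.DataFrame(rows)` should give the "dataframe" form described above.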
- Clone the repository:
git clone https://github.com/jackrataty/Horse-Racing-Calculator.git