Data Analysis and fitting a Linear Regression Model on a football dataset using Python

Static Jupyter Notebook : link
Interactive Binder : link

In this project, we perform some simple data analysis on a dataset of international football results and fit a Linear Regression Model using scikit-learn.

Requirements and Dependencies

Dataset : Kaggle link Github link

The dataset contains records of more than 42,000 international football games. The available information includes the participating teams, the goals scored by each team, the date, the venue, etc.

Dependencies:
numpy : 1.20.2
plotly : 4.14.3
matplotlib : 3.4.2
pandas : 1.2.3
chart_studio : 1.1.0
scipy : 1.6.3
scikit-learn : 0.24.2
sklearn : 0.0
threadpoolctl : 2.1.0
joblib : 1.0.1

NOTE : If you are using the Binder link, it will automatically recreate the environment and install these dependencies.
Also, the last two cells (which use the watermark library) might not work in Binder. They are not part of the Data Analysis and are only there to list the dependencies used in the project. However, if you do wish to run them, add

%pip install watermark

before the cells to install the library.

The Project

On the basis of the available data, we try to answer the following questions:

In which tournament was the highest number of games played?
Which are the top 10 teams with the most wins?
Which are the best teams by winning percentage (among teams that have played at least 100 games)?
What is the total number of goals scored, and which are the top teams by goals scored per game?
Is there any correlation between the winning percentage and goals scored per game?
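
The questions above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`home_team`, `away_team`, `home_score`, `away_score`, `tournament`) are assumed, the sample games are made up, `team_stats` is a hypothetical helper, and draws are counted as non-wins when computing the winning percentage.

```python
import pandas as pd

# Tiny made-up sample. The column names are an assumption about results.csv.
games = pd.DataFrame({
    "home_team":  ["A", "B", "A", "C", "B", "C"],
    "away_team":  ["B", "C", "C", "A", "A", "B"],
    "home_score": [2, 1, 3, 0, 1, 2],
    "away_score": [1, 1, 0, 2, 0, 2],
    "tournament": ["Friendly", "Friendly", "Cup", "Cup", "Friendly", "Friendly"],
})


def team_stats(games, min_games=1):
    """Per-team games, wins, goals, winning % and goals per game (hypothetical helper)."""
    # Build one row per (team, game) by stacking the home and away perspectives.
    home = games.rename(columns={"home_team": "team",
                                 "home_score": "scored",
                                 "away_score": "conceded"})
    away = games.rename(columns={"away_team": "team",
                                 "away_score": "scored",
                                 "home_score": "conceded"})
    cols = ["team", "scored", "conceded"]
    per_team = pd.concat([home[cols], away[cols]], ignore_index=True)

    # Assumption: a draw counts as a non-win.
    per_team["won"] = per_team["scored"] > per_team["conceded"]

    stats = per_team.groupby("team").agg(
        games=("won", "size"),
        wins=("won", "sum"),
        goals=("scored", "sum"),
    )
    stats["win_pct"] = 100 * stats["wins"] / stats["games"]
    stats["goals_per_game"] = stats["goals"] / stats["games"]

    # Keep only teams that have played at least min_games games.
    return stats[stats["games"] >= min_games]


# Tournament with the highest number of games.
most_played = games["tournament"].value_counts().idxmax()

stats = team_stats(games, min_games=1)
top_wins = stats.sort_values("wins", ascending=False).head(10)
top_win_pct = stats.sort_values("win_pct", ascending=False).head(10)
top_goals_per_game = stats.sort_values("goals_per_game", ascending=False).head(10)

# Pearson correlation between winning % and goals per game.
correlation = stats["win_pct"].corr(stats["goals_per_game"])
```

With the real dataset, one would presumably load `results.csv` with `pd.read_csv` instead of building the sample frame, and pass `min_games=100` to match the 100-game threshold mentioned above.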

Finally, we fit a Simple Linear Regression Model on the winning percentage and goals scored per game, and evaluate the accuracy of the model's predictions.
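
The regression step could look roughly like the sketch below. It is only an illustration under stated assumptions: the data is synthetic rather than taken from the dataset, goals per game is assumed to be the predictor and winning percentage the target, and the R² score on a held-out split is used as one possible measure of how well the predictions fit. The notebook may use a different setup or metric.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic per-team data, made up for illustration only.
rng = np.random.default_rng(0)
goals_per_game = rng.uniform(0.5, 2.5, size=60)
win_pct = 25 * goals_per_game + rng.normal(0, 3, size=60)

# scikit-learn expects a 2D feature matrix, so reshape the single predictor.
X = goals_per_game.reshape(-1, 1)
y = win_pct

# Hold out a quarter of the teams to evaluate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

# R² of the predictions on the held-out teams (1.0 would be a perfect fit).
r2 = r2_score(y_test, predictions)
slope = model.coef_[0]
intercept = model.intercept_
```

Evaluating on a held-out split rather than on the training data gives a less optimistic picture of how well the fitted line generalises to teams the model has not seen.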


pillaikartik10/python-football-data-analysis

Folders and files

Latest commit

History

Repository files navigation

Data Analysis and fitting a Linear Regression Model on a football dataset using Python

Requirements and Dependencies

The Project

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages