Next AIrsenal release [develop branch] #645

Merged
merged 52 commits into from
Mar 22, 2024
52 commits
926a131
type hints
rchan26 Dec 30, 2022
5084368
get appropriate results_dict to fit neutral model
rchan26 Dec 31, 2022
17d8491
look at feature/neutral-dixon-coles branch of bpl
rchan26 Dec 31, 2022
b65453f
update functions that use team model
rchan26 Dec 31, 2022
45b85e1
fix Dict type-hint
rchan26 Dec 31, 2022
d356d98
fix Dict type-hint when nested
rchan26 Dec 31, 2022
bd00417
update tests and end_date retrieval bug
rchan26 Dec 31, 2022
6dcbc2f
change to check team_model type when computing outcome prob
rchan26 Dec 31, 2022
a45c5ff
type-hints for utils
rchan26 Dec 31, 2022
0f18788
finish type-hints for utils
rchan26 Jan 1, 2023
2158d22
type-hints for prediction utils
rchan26 Jan 1, 2023
8aa118b
allow for other player model class in type-hints
rchan26 Jan 1, 2023
d0c6e0c
fix choices option in run_pipeline cli
rchan26 Jan 3, 2023
1d9fb90
add time_diff to training data
radka-j Jun 15, 2023
3c674c2
time_diff is number of days from max results date
radka-j Jun 15, 2023
6b06f18
add ratings as optional parameter to add_new_teams_to_model
radka-j Jun 15, 2023
d2b29d1
Merge branch 'develop' into add_time_diff_to_data
radka-j Jun 15, 2023
91bbb14
pass time_decay parameter to model.fit() method
radka-j Jun 15, 2023
1f018d7
lots of merge conflicts
rchan26 Jul 13, 2023
a0f1945
Merge branch 'update-team-model' into add_time_diff_to_data
rchan26 Jul 13, 2023
3b1b496
apply lint2
rchan26 Jul 13, 2023
7170d7b
Merge pull request #591 from alan-turing-institute/add_time_diff_to_data
rchan26 Jul 13, 2023
2dff338
remove 'model' arg and just use 'model_class' in create_and_fit_team_…
rchan26 Jul 13, 2023
d725173
bug fixes for fitting neutral model
rchan26 Jul 13, 2023
3429ad1
Merge branch 'develop' into update-team-model
rchan26 Aug 7, 2023
5331ee7
update tests
rchan26 Aug 7, 2023
37a4d9d
Merge branch 'develop' into update-team-model
rchan26 Aug 7, 2023
e119e5b
notebook using new model fitting update
rchan26 Aug 7, 2023
163b08c
update CLI to pass in model and epsilon properly
rchan26 Aug 7, 2023
f9f45b0
do not pass in team_model and model arg to fixture_probabilities
rchan26 Aug 7, 2023
df1fd6f
fit requested model if not fit
rchan26 Aug 7, 2023
567644a
Merge branch 'develop' into update-team-model
jack89roberts Aug 12, 2023
766cbb8
Give --max_transfers option to airsenal_run_pipeline.
iansealy Sep 28, 2023
415099d
Limit allowable values for --max_transfers.
iansealy Sep 28, 2023
705947e
Add missing trailing comma.
iansealy Oct 6, 2023
77e711d
Give --max_hit and --allow_unused options to airsenal_run_pipeline.
iansealy Oct 6, 2023
992317b
Fix long line.
iansealy Oct 6, 2023
1453b6d
Merge pull request #637 from iansealy/max-transfers
jack89roberts Oct 6, 2023
2af6645
add function to parse team model from str and pass in modelinstance
rchan26 Oct 20, 2023
9ad9c5b
Merge branch 'main' into update-team-model
rchan26 Oct 20, 2023
3d00d7f
import models from bpl rather than airsenal.bpl_interface
rchan26 Oct 20, 2023
5c95e6b
call the model instances
rchan26 Oct 20, 2023
f9ecbd1
change 'xdc' to 'extended'
rchan26 Oct 24, 2023
baf62d9
Merge branch 'develop' into update-team-model
rchan26 Oct 24, 2023
3a9faf0
change wording to 'using the default epsilon'
rchan26 Oct 31, 2023
00c8b17
change back to use ratings by default
rchan26 Oct 31, 2023
8d404d4
Merge pull request #561 from alan-turing-institute/update-team-model
jack89roberts Nov 10, 2023
411de75
:bug: print team ID after it has been set in fetcher
jack89roberts Nov 12, 2023
bf98aee
update dependencies
jack89roberts Feb 24, 2024
7cce49c
update conda env & actions for python 3.12
jack89roberts Feb 24, 2024
3acd3a1
Update .pre-commit-config.yaml
jack89roberts Feb 24, 2024
fc56846
remove a comma
jack89roberts Feb 24, 2024
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
Expand Up @@ -16,7 +16,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.9", "3.10", "3.11"]
python-version: ["3.10", "3.11", "3.12"]

steps:
- uses: actions/checkout@v3
Expand Down
8 changes: 4 additions & 4 deletions .pre-commit-config.yaml
Expand Up @@ -2,21 +2,21 @@
# See https://pre-commit.com/hooks.html for more hooks
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
rev: v4.5.0
hooks:
- id: check-yaml
- id: check-toml
- id: end-of-file-fixer
- id: trailing-whitespace
- repo: https://github.com/pycqa/isort
rev: 5.12.0
rev: 5.13.2
hooks:
- id: isort
- repo: https://github.com/ambv/black
rev: 23.7.0
rev: 24.2.0
hooks:
- id: black
- repo: https://github.com/pycqa/flake8
rev: 6.1.0
rev: 7.0.0
hooks:
- id: flake8
1 change: 1 addition & 0 deletions airsenal/framework/api_utils.py
@@ -1,6 +1,7 @@
"""
Functions used by the AIrsenal API
"""

from flask import jsonify
from sqlalchemy.orm import scoped_session

Expand Down
2 changes: 1 addition & 1 deletion airsenal/framework/aws_utils.py
Expand Up @@ -45,7 +45,7 @@ def get_league_standings_string():
output_string += f"Standings for league {league_name} :"
for i, entry in enumerate(standings):
output_string += (
f"{i + 1,}: "
f"{i + 1}: "
f"{entry['name']}, "
f"managed by {entry['manager']}, "
f"with {entry['points']} points, "
Expand Down
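
A note on the one-character fix above: inside an f-string replacement field, a trailing comma turns the expression into a one-element tuple, so the old line would presumably have rendered each rank as `(1,)` rather than `1`. A minimal sketch of the difference; the two helper names are hypothetical and not part of AIrsenal:

```python
def label_with_comma(i: int) -> str:
    # The expression "i + 1," is parsed as the tuple (i + 1,),
    # so the tuple's repr ends up in the string.
    return f"{i + 1,}: "


def label_fixed(i: int) -> str:
    # Without the comma the expression is just the integer i + 1.
    return f"{i + 1}: "


if __name__ == "__main__":
    for i in range(2):
        print(repr(label_with_comma(i)), "vs", repr(label_fixed(i)))
```

Because a trailing comma is still valid syntax here, a linter would not necessarily flag it, which may be why the problem only showed up in the formatted output.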
196 changes: 160 additions & 36 deletions airsenal/framework/bpl_interface.py
Expand Up @@ -2,11 +2,15 @@
Interface to the NumPyro team model in bpl-next:
https://github.com/anguswilliams91/bpl-next
"""

from typing import Dict, List, Optional, Union

import numpy as np
import pandas as pd
from bpl import ExtendedDixonColesMatchPredictor
from bpl import ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor
from sqlalchemy.orm.session import Session

from airsenal.framework.schema import FifaTeamRating, Result, session
from airsenal.framework.schema import FifaTeamRating, Fixture, Result, session
from airsenal.framework.season import CURRENT_SEASON, get_teams_for_season
from airsenal.framework.utils import (
get_fixture_teams,
Expand All @@ -17,9 +21,11 @@
np.random.seed(42)


def get_result_dict(season, gameweek, dbsession):
def get_result_dict(
season: str, gameweek: int, dbsession: Session
) -> Dict[str, np.array]:
"""
query the match table and put results into pandas dataframe,
Query the match table and put results into pandas dataframe,
to train the team-level model.
"""
results = [
Expand All @@ -33,17 +39,34 @@ def get_result_dict(season, gameweek, dbsession):
next_gameweek=gameweek,
)
]
# compute the time difference for each fixture in results
# to the first fixture of the next gameweek
result_dates = np.array(
[pd.Timestamp(r.fixture.date).replace(tzinfo=None) for r in results]
)
end_date = pd.to_datetime(
[f.date for f in get_fixtures_for_gameweek(gameweek, season, dbsession)]
).min()
end_date = end_date.replace(tzinfo=None)
time_diff = (end_date - result_dates) / pd.Timedelta(days=365)
return {
"home_team": np.array([r.fixture.home_team for r in results]),
"away_team": np.array([r.fixture.away_team for r in results]),
"home_goals": np.array([r.home_score for r in results]),
"away_goals": np.array([r.away_score for r in results]),
"neutral_venue": np.zeros(len(results)),
"time_diff": time_diff,
"game_weights": np.ones(len(results)),
}
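
The `time_diff` calculation added to `get_result_dict` can be tried out without a database. The sketch below is an approximation under stated assumptions: it uses the standard library's `datetime` instead of pandas, the function name `time_diff_in_years` is hypothetical, and the dates are made up. As in the diff, time zone information is dropped before subtracting, and the gap is divided by 365 days, so the values are in approximate years.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def time_diff_in_years(
    result_dates: List[datetime], next_fixture_dates: List[datetime]
) -> List[float]:
    """Gap from each past result to the earliest upcoming fixture, in 365-day years."""
    # Drop tzinfo first so naive and aware datetimes can be compared and subtracted.
    end_date = min(d.replace(tzinfo=None) for d in next_fixture_dates)
    return [
        (end_date - d.replace(tzinfo=None)) / timedelta(days=365)
        for d in result_dates
    ]


if __name__ == "__main__":
    results = [datetime(2023, 1, 1), datetime(2022, 1, 1)]
    fixtures = [
        datetime(2024, 1, 2, tzinfo=timezone.utc),
        datetime(2024, 1, 1, tzinfo=timezone.utc),
    ]
    print(time_diff_in_years(results, fixtures))
```

Older results get larger values, which is the quantity a time-decay weighting would act on when it is passed through to `model.fit()`.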


def get_ratings_dict(season, teams, dbsession):
"""Create a dataframe containing the fifa team ratings."""

def get_ratings_dict(
season: str, teams: List[str], dbsession: Session
) -> Dict[str, np.array]:
"""
Create a dataframe containing the fifa team ratings.
"""
ratings = dbsession.query(FifaTeamRating).filter_by(season=season).all()
if len(ratings) == 0:
raise ValueError(f"No FIFA ratings found for season {season}")
Expand All @@ -63,70 +86,163 @@ def get_ratings_dict(season, teams, dbsession):
return ratings_dict


def get_training_data(season, gameweek, dbsession, ratings=True):
def get_training_data(
season: str,
gameweek: int,
dbsession: Session,
ratings: bool = True,
):
"""Get training data for team model, optionally including FIFA ratings
as covariates if ratings is True. Data returned is for all matches up
to specified gameweek and season.
as covariates if ratings is True. If time_decay is None, do not include
exponential time decay in model.
Data returned is for all matches up to specified gameweek and season.
"""
training_data = get_result_dict(season, gameweek, dbsession)
if ratings:
teams = set(training_data["home_team"]) | set(training_data["away_team"])
training_data["team_covariates"] = get_ratings_dict(season, teams, dbsession)
training_data["team_covariates"] = get_ratings_dict(
season=season, teams=teams, dbsession=dbsession
)
return training_data


def create_and_fit_team_model(
training_data, model_class=ExtendedDixonColesMatchPredictor
):
training_data: dict,
model: Union[
ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor
] = ExtendedDixonColesMatchPredictor(),
**fit_args,
) -> Union[ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor]:
"""
Get the team-level stan model, which can give probabilities of
each potential scoreline in a given fixture.
"""
return model_class().fit(training_data)
if not fit_args:
fit_args = {}
if "epsilon" in fit_args:
print(f"Fitting {type(model)} model with epsilon = {fit_args['epsilon']}")
else:
print(
f"Fitting {type(model)} model but no epsilon passed, "
"so using the default epsilon = 0"
)

return model.fit(training_data=training_data, **fit_args)

def add_new_teams_to_model(team_model, season, dbsession):

def add_new_teams_to_model(
team_model: Union[
ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor
],
season: str,
dbsession: Session,
ratings: bool = True,
) -> Union[ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor]:
"""
Add teams that we don't have previous results for (e.g. promoted teams) to the model
using their FIFA ratings as covariates.
"""
teams = get_teams_for_season(season, dbsession=dbsession)
teams = get_teams_for_season(season=season, dbsession=dbsession)
for t in teams:
if t not in team_model.teams:
print(f"Adding {t} to team model with covariates")
ratings = get_ratings_dict(season, [t], dbsession)
team_model.add_new_team(t, team_covariates=ratings[t])
if ratings:
print(f"Adding {t} to team model with covariates")
team_ratings = get_ratings_dict(season, [t], dbsession)
team_model.add_new_team(t, team_covariates=team_ratings[t])
else:
print(f"Adding {t} to team model without covariates")
team_model.add_new_team(t)
return team_model


def get_fitted_team_model(
season, gameweek, dbsession, team_model_class=ExtendedDixonColesMatchPredictor
):
season: str,
gameweek: int,
dbsession: Session,
ratings: bool = True,
model: Union[
ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor
] = ExtendedDixonColesMatchPredictor(),
**fit_args,
) -> Union[ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor]:
"""
get the fitted team model using the past results and the FIFA rankings
Get the fitted team model using the past results and the FIFA rankings.
"""
print(f"Fitting team model ({type(team_model_class())})...")
training_data = get_training_data(season, gameweek, dbsession)
team_model = create_and_fit_team_model(training_data, team_model_class)
return add_new_teams_to_model(team_model, season, dbsession)
print(f"Fitting team model ({type(model)})...")
training_data = get_training_data(
season=season,
gameweek=gameweek,
dbsession=dbsession,
ratings=ratings,
)
team_model = create_and_fit_team_model(
training_data=training_data, model=model, **fit_args
)
return add_new_teams_to_model(
team_model=team_model, season=season, dbsession=dbsession, ratings=ratings
)


def fixture_probabilities(
gameweek, season=CURRENT_SEASON, team_model=None, dbsession=session
):
gameweek: int,
season: str = CURRENT_SEASON,
model: Optional[
Union[ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor]
] = None,
dbsession: Session = session,
ratings: bool = True,
**fit_args,
) -> pd.DataFrame:
"""
Returns probabilities for all fixtures in a given gameweek and season, as a data
frame with a row for each fixture and columns being home_team,
away_team, home_win_probability, draw_probability, away_win_probability.

If no model is passed, it will fit an ExtendedDixonColesMatchPredictor model
by default.
"""
if team_model is None:
team_model = get_fitted_team_model(season, gameweek, dbsession)

# fit team model if none is passed or if it is not fitted yet
# (model.teams will be None if so)
if model is None:
# fit extended model by default
model = get_fitted_team_model(
season=season,
gameweek=gameweek,
dbsession=dbsession,
ratings=ratings,
model=ExtendedDixonColesMatchPredictor(),
**fit_args,
)
elif model.teams is None:
# model is not fit yet, so will need to fit
model = get_fitted_team_model(
season=season,
gameweek=gameweek,
dbsession=dbsession,
ratings=ratings,
model=model,
**fit_args,
)

# obtain fixtures
fixtures = get_fixture_teams(
get_fixtures_for_gameweek(gameweek, season=season, dbsession=dbsession)
get_fixtures_for_gameweek(gameweek=gameweek, season=season, dbsession=dbsession)
)
home_teams, away_teams = zip(*fixtures)
probabilities = team_model.predict_outcome_proba(home_teams, away_teams)

# obtain match probabilities
if isinstance(model, ExtendedDixonColesMatchPredictor):
probabilities = model.predict_outcome_proba(home_teams, away_teams)
elif isinstance(model, NeutralDixonColesMatchPredictor):
probabilities = model.predict_outcome_proba(
home_teams, away_teams, neutral_venue=np.zeros(len(home_teams))
)
else:
raise NotImplementedError(
"model must be either of type "
"'ExtendedDixonColesMatchPredictor' or "
"'NeutralDixonColesMatchPredictor'"
)
return pd.DataFrame(
{
"home_team": home_teams,
Expand All @@ -138,9 +254,17 @@ def fixture_probabilities(
)


def get_goal_probabilities_for_fixtures(fixtures, team_model, max_goals=10):
"""Get the probability that each team in a fixture scores any number of goals up
to max_goals."""
def get_goal_probabilities_for_fixtures(
fixtures: List[Fixture],
team_model: Union[
ExtendedDixonColesMatchPredictor, NeutralDixonColesMatchPredictor
],
max_goals: int = 10,
) -> Dict[int, Dict[str, Dict[int, float]]]:
"""
Get the probability that each team in a fixture scores any number of goals up
to max_goals.
"""
goals = np.arange(0, max_goals + 1)
probs = {}
for f in fixtures:
Expand Down
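
The control flow added to `fixture_probabilities` (fit a default model if none is passed, fit the given model if its `teams` attribute is still `None`, then call the prediction method with or without `neutral_venue` depending on the model type) can be sketched in isolation. Everything below is a hypothetical stand-in: `HomeAwayModel`, `NeutralModel` and `predict_fixtures` are not bpl or AIrsenal names, and the `fit`/`predict` signatures are simplified for illustration.

```python
from typing import List, Optional, Sequence, Union


class HomeAwayModel:
    """Stand-in for a predictor that always assumes home advantage."""

    def __init__(self) -> None:
        self.teams: Optional[List[str]] = None  # None until fit() is called

    def fit(self, teams: Sequence[str]) -> "HomeAwayModel":
        self.teams = list(teams)
        return self

    def predict(self, home: Sequence[str], away: Sequence[str]) -> str:
        return f"home/away: {len(home)} fixtures"


class NeutralModel:
    """Stand-in for a predictor that needs a per-fixture neutral_venue flag."""

    def __init__(self) -> None:
        self.teams: Optional[List[str]] = None

    def fit(self, teams: Sequence[str]) -> "NeutralModel":
        self.teams = list(teams)
        return self

    def predict(
        self, home: Sequence[str], away: Sequence[str], neutral_venue: Sequence[int]
    ) -> str:
        return f"neutral-aware: {len(home)} fixtures, neutral_venue={list(neutral_venue)}"


def predict_fixtures(
    model: Optional[Union[HomeAwayModel, NeutralModel]],
    home: Sequence[str],
    away: Sequence[str],
    known_teams: Sequence[str],
) -> str:
    if model is None:
        # no model passed: fit the default one
        model = HomeAwayModel().fit(known_teams)
    elif model.teams is None:
        # model passed but not fitted yet
        model = model.fit(known_teams)

    if isinstance(model, HomeAwayModel):
        return model.predict(home, away)
    if isinstance(model, NeutralModel):
        # no league fixture is treated as neutral, so pass all zeros
        return model.predict(home, away, neutral_venue=[0] * len(home))
    raise NotImplementedError("model must be a HomeAwayModel or a NeutralModel")


if __name__ == "__main__":
    print(predict_fixtures(None, ["ARS"], ["CHE"], ["ARS", "CHE"]))
    print(predict_fixtures(NeutralModel(), ["ARS"], ["CHE"], ["ARS", "CHE"]))
```

The sketch creates its default model inside the function body. The diff instead uses `ExtendedDixonColesMatchPredictor()` as a default argument value in `create_and_fit_team_model` and `get_fitted_team_model`, which Python evaluates once at definition time, so repeated calls relying on that default would share one instance.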
1 change: 1 addition & 0 deletions airsenal/framework/data_fetcher.py
Expand Up @@ -2,6 +2,7 @@
Classes to query the FPL API to retrieve current FPL data,
and to query football-data.org to retrieve match and fixture data.
"""

import getpass
import json
import time
Expand Down
1 change: 1 addition & 0 deletions airsenal/framework/env.py
@@ -1,6 +1,7 @@
"""
Database can be either an sqlite file or a postgres server
"""

import os
from pathlib import Path

Expand Down
1 change: 0 additions & 1 deletion airsenal/framework/mappings.py
Expand Up @@ -2,7 +2,6 @@
map between different conventions used in different data sources.
"""


positions = {1: "GK", 2: "DEF", 3: "MID", 4: "FWD"}

alternative_team_names = {
Expand Down
1 change: 1 addition & 0 deletions airsenal/framework/optimization_transfers.py
Expand Up @@ -2,6 +2,7 @@
Functions for optimising transfers across multiple gameweeks, including the possibility
of using chips.
"""

import random
from operator import itemgetter

Expand Down
1 change: 1 addition & 0 deletions airsenal/framework/optimization_utils.py
@@ -1,6 +1,7 @@
"""
functions to optimize the transfers for N weeks ahead
"""

import warnings
from copy import deepcopy
from datetime import datetime
Expand Down
2 changes: 1 addition & 1 deletion airsenal/framework/player.py
Expand Up @@ -3,8 +3,8 @@
"""

from airsenal.framework.schema import Player
from airsenal.framework.season import CURRENT_SEASON
from airsenal.framework.utils import (
CURRENT_SEASON,
NEXT_GAMEWEEK,
get_player,
get_predicted_points_for_player,
Expand Down