self.serialize() => to_dict everywhere + json
cumulative (integrated) transformation. OK
ratio_E = 0.5
N_train = N_Estim + N_Valid + N_test (=H)
- ~~N_Estim = ratio_E * (N_train - H)~~
- ~~improve plotting ... shaded area around prediction intervals (need prediction intervals first)~~
- http://robjhyndman.com/hyndsight/tscvexample/
- book : https://www.otexts.org/fpp
- https://www.otexts.org/fpp/2/5
https://www.otexts.org/fpp/2/6
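The split arithmetic noted earlier (ratio_E, N_Estim, N_Valid, N_test = H) can be sketched as below. This is only an illustration that takes the struck-through formula at face value; `split_sizes` and its parameter names are hypothetical and not part of any existing API.

```python
def split_sizes(n_train: int, horizon: int, ratio_e: float = 0.5):
    """Return (n_estim, n_valid, n_test) such that they sum to n_train.

    Assumption: the last `horizon` points are held out for testing and
    the remaining points are shared between estimation and validation.
    """
    if not 0 < horizon < n_train:
        raise ValueError("horizon must be in (0, n_train)")
    n_test = horizon
    # N_Estim = ratio_E * (N_train - H), truncated to an integer
    n_estim = int(ratio_e * (n_train - horizon))
    # Whatever is left goes to validation
    n_valid = n_train - n_test - n_estim
    return n_estim, n_valid, n_test


sizes = split_sizes(n_train=100, horizon=12)
```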
- Exponential smoothing
exogenous variables. OK
moving average(N). OK
moving median(N). OK
1. seasonal
2. user holidays etc (external tables?)
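As an illustration of the moving average(N) and moving median(N) models listed above, here is a minimal pandas sketch. It assumes a one-step-ahead forecast built from the previous N observations; `rolling_forecast` is a hypothetical helper, not the library's implementation.

```python
import pandas as pd


def rolling_forecast(series: pd.Series, window: int, kind: str = "mean") -> pd.Series:
    """One-step-ahead forecast from the previous `window` observations."""
    # shift(1) so the forecast at time t only uses values up to t-1
    past = series.shift(1).rolling(window=window, min_periods=window)
    if kind == "mean":
        return past.mean()
    if kind == "median":
        return past.median()
    raise ValueError(f"unknown kind: {kind!r}")


signal = pd.Series([10.0, 12.0, 11.0, 50.0, 12.0, 13.0])
ma = rolling_forecast(signal, window=3, kind="mean")
med = rolling_forecast(signal, window=3, kind="median")
```

On this toy signal, the moving median is much less affected by the spike at 50 than the moving average.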
ARX. OK
- VAR ?
- order control (look at timedelta ??).
activate/disable transformations/models/decomposition. OK
configure trends. OK
configure cycles (CycleLength = ?). OK
- cycle length should be in [5, 7, 12, 24, 30, 60]
configure ARs (p = ?). OK
processing : threads etc. OK
MComp. OK
NN5. OK
NN3. OK
Yahoo stocks. OK
- python is sloooooooooooow (cython ?)
- multiprocessing seems OK
allow user control. OK
truncate timedelta to the nearest unit. OK
- avoid saturday/sunday if not present in the dataset.
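The two date-handling notes above (truncate the time delta to the nearest unit, skip weekends when the dataset has none) could look roughly like this with pandas. Both `truncate_delta` and `future_dates` are hypothetical names, and the choice of units (day, hour, minute, second) is an assumption.

```python
import pandas as pd

# Candidate units, largest first (assumed set of units)
_UNITS = [
    pd.Timedelta(days=1),
    pd.Timedelta(hours=1),
    pd.Timedelta(minutes=1),
    pd.Timedelta(seconds=1),
]


def truncate_delta(delta: pd.Timedelta) -> pd.Timedelta:
    """Round `delta` to a whole number of the largest unit it contains."""
    for one in _UNITS:
        if delta >= one:
            return one * round(delta / one)
    return delta


def future_dates(dates: pd.DatetimeIndex, horizon: int) -> pd.DatetimeIndex:
    """Extend `dates` by `horizon` steps, skipping weekends if the data has none."""
    # Typical step between observations, rounded to the nearest unit
    step = truncate_delta(dates.to_series().diff().median())
    has_weekend = bool((dates.dayofweek >= 5).any())
    if step == pd.Timedelta(days=1) and not has_weekend:
        # Daily data without Saturday/Sunday: continue on business days only
        start = dates[-1] + pd.offsets.BDay(1)
        return pd.bdate_range(start=start, periods=horizon)
    return pd.date_range(start=dates[-1] + step, periods=horizon, freq=step)


weekdays_only = pd.bdate_range("2024-01-01", periods=10)  # ends on Friday 2024-01-12
future = future_dates(weekdays_only, horizon=3)
```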
=> http://robjhyndman.com/talks/MelbourneRUG.pdf
===> smaller model => smaller SQL code !!!
http://stackoverflow.com/questions/10302261/forecasting-time-series-data OK
https://stanford.edu/~mwaskom/software/seaborn/. Let someone else do that !!!
autoregressive benchmark cycles data-frame exogenous forecasting heroku hierarchical horizon jupyter machine-learning-library pandas restful-api scikit-learn seasonal sql sql-generation time-series trends
http://eem2017.com/program/forecast-competition
In cooperation with our technical sponsor, we will provide you with a set of different weather input factors, e.g. wind direction, with which you are to forecast the power generation of a wind power plant portfolio. You may participate individually or as a team. The data input is organised in a realistic setting.
- Prediction Interval Quality
Multiplicative Decompositions (log transform ?). 2022-05-09 OK (#178)
PyTorch: 2022-05-09 OK (#199)
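On the log-transform question in the multiplicative decomposition note: for a strictly positive signal, taking logs turns a multiplicative decomposition into an additive one. The component names below (trend $T_t$, cycle $C_t$, residual $R_t$) are assumed for illustration.

```latex
y_t = T_t \times C_t \times R_t
\quad\Longleftrightarrow\quad
\log y_t = \log T_t + \log C_t + \log R_t ,
\qquad y_t > 0 .
```

An additive decomposition fitted on $\log y_t$ therefore corresponds to a multiplicative one on $y_t$, as long as all values are positive.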
- Outliers detection. https://otexts.com/fpp2/missing-outliers.html
- In ARX Models, category 1 : data quality / wrong input. Can be removed. https://otexts.com/fpp2/regression-evaluation.html
- In ARX Models, category 2 : Natural, Simply different data. Should not be removed. https://otexts.com/fpp2/regression-evaluation.html
- X11 decomposition : The process is entirely automatic and tends to be highly robust to outliers and level shifts in the time series. https://otexts.com/fpp2/x11.html
- STL decomposition : outliers may affect the remainder component. https://otexts.com/fpp2/stl.html
- Referenced in #230
- Outliers removal : estimate the trends/cycles/AR models with an estimation dataset that does not contain the outliers. Can be capped without removal ?
- Outliers Reporting. Scatter Plots ? https://otexts.com/fpp2/scatterplots.html
- Generic view (not only for time series): https://en.wikipedia.org/wiki/Outlier
- Detection method : Tukey's fences, based on measures such as the interquartile range. range = [Q1 - k * (Q3 - Q1) , Q3 + k * (Q3 - Q1)], k > 0. Simple, non-parametric, robust.
- Flags/forecast outputs : Tukey uses k = 1.5 to flag as "outlier" and k=3 to flag as "far out".
- k value can be used as a training option to remove more-or-less outliers. Default : 3 ?
- Interquartile range (IQR = Q3 - Q1): middle 50%. https://en.wikipedia.org/wiki/Interquartile_range
- IQR is a Robust measure of scale : for N(0, sigma) , IQR = 1.349 sigma. https://en.wikipedia.org/wiki/Robust_measures_of_scale
- Add more candidates with and without outliers (ARX and ARXO)? Not sure. KISS.
- tsoutliers : The tsoutliers() function in the forecast package for R is useful for identifying anomalies in a time series. Excellent !!! http://cran.r-project.org/web/packages/tsoutliers/
- tsoutliers : This package implements a procedure based on the approach described in Chen and Liu (1993) for automatic detection of outliers in time series. Innovational outliers, additive outliers, level shifts, temporary changes and seasonal level shifts are considered
- Joint Estimation of Model Parameters and Outlier Effects in Time Series. Chung Chen & Lon-Mu Liu. Journal of the American Statistical Association Volume 88, 1993 - Issue 421
- http://cran.r-project.org/web/packages/tsoutliers/tsoutliers.pdf
- tsoutliers hicp dataset : Harmonised indices of consumer prices in the Euro area.
- tsoutliers ipi dataset : Industrial production indices in the manufacturing sector of European Monetary Union countries.
- tsoutliers bde9915 dataset : seasonal outliers ? Kaiser, R., and Maravall, A. (1999). Seasonal Outliers in Time Series. Banco de España, Servicio de Estudios. Working paper number 9915.
- forecast package (tsoutliers function, clean.R): Custom processing for outliers. Iterative MSTL + Smoothing trends + IQR + linear interpolation. https://robjhyndman.com/hyndsight/tsoutliers
- fpp2 Gold price dataset : The gold price data contains daily morning gold prices in US dollars from 1 January 1985 to 31 March 1989.
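The Tukey's fences detection method from the outlier notes above, together with the "outlier" / "far out" flags and the capping idea, can be sketched with NumPy as follows. The function names are hypothetical, quartiles use NumPy's default linear interpolation, and computing fences on the raw values (rather than on model residuals) is an assumption made to keep the example short.

```python
import numpy as np


def tukey_fences(values, k=1.5):
    """Return (low, high) = (Q1 - k * IQR, Q3 + k * IQR)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k_outlier=1.5, k_far_out=3.0):
    """Label each value as 'ok', 'outlier' (k=1.5) or 'far out' (k=3)."""
    values = np.asarray(values, dtype=float)
    lo1, hi1 = tukey_fences(values, k_outlier)
    lo3, hi3 = tukey_fences(values, k_far_out)
    flags = np.full(values.shape, "ok", dtype=object)
    flags[(values < lo1) | (values > hi1)] = "outlier"
    # 'far out' overrides 'outlier' because the k=3 fences are wider
    flags[(values < lo3) | (values > hi3)] = "far out"
    return flags


def cap_outliers(values, k=3.0):
    """Cap values at the fences instead of removing them."""
    values = np.asarray(values, dtype=float)
    low, high = tukey_fences(values, k)
    return np.clip(values, low, high)


data = [10, 11, 12, 13, 14, 15, 16, 17, 18, 27, 100]
fences = tukey_fences(data, k=1.5)
flags = flag_outliers(data)
capped = cap_outliers(data, k=3.0)
```

Here `k` plays the role of the training option mentioned in the notes: a larger `k` widens the fences and treats fewer points as outliers.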
Add the possibility to customize the performance measures over the given horizon.
One may want to optimize weekly forecasts on a daily model: ignore small changes within the same week and evaluate forecast aggregates over 7 days.
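A possible shape for such a measure is a MAPE computed on 7-day sums instead of daily values. This is a sketch under assumptions: `aggregated_mape` is a hypothetical name, blocks are consecutive and non-overlapping, an incomplete trailing block is dropped, and the actual block sums are nonzero.

```python
import numpy as np


def aggregated_mape(actual, forecast, block=7):
    """MAPE between block-wise sums of `actual` and `forecast`."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    n = (len(actual) // block) * block  # drop the incomplete trailing block
    a = actual[:n].reshape(-1, block).sum(axis=1)
    f = forecast[:n].reshape(-1, block).sum(axis=1)
    return float(np.mean(np.abs((a - f) / a)))


# Two weeks of daily data: the forecast oscillates around a flat signal
actual = [100.0] * 14
forecast = [90.0, 110.0] * 7
daily_error = aggregated_mape(actual, forecast, block=1)
weekly_error = aggregated_mape(actual, forecast, block=7)
```

In this toy case the daily error is 10% while the weekly error is about 1.4%, because the day-to-day deviations largely cancel out within each week.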
https://arxiv.org/pdf/2310.10688 A DECODER-ONLY FOUNDATION MODEL FOR TIME-SERIES FORECASTING