SeqRep

Scientific framework for representation in sequential data

Description

This package aims to simplify the workflow of evaluating machine learning models, with a primary focus on sequential data. It helps with:

  • labeling data,
  • splitting data,
  • feature extraction,
  • feature reduction (i.e. selection or transformation),
  • running the pipeline,
  • evaluation of results.

It also allows you to visualize each step.
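
To make the first few steps in the list above concrete, here is a minimal sketch in plain Python that labels a toy series, builds previous-value features, and splits the result chronologically. It does not use seqrep itself: the function names (`label_next_move`, `previous_values`, `chronological_split`) and the toy data are assumptions made only for illustration, and seqrep's own labelers, extractors, and splitters may behave differently.

```python
from typing import List, Sequence, Tuple


def label_next_move(values: Sequence[float]) -> List[int]:
    """Label each step with 1 if the next value is higher, else 0.

    The last element has no successor, so it gets no label.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(values, values[1:])]


def previous_values(values: Sequence[float], n_lags: int) -> List[List[float]]:
    """Build one feature row per step: its n_lags predecessors plus the current value."""
    return [list(values[i - n_lags : i + 1]) for i in range(n_lags, len(values))]


def chronological_split(rows: Sequence, test_fraction: float = 0.25) -> Tuple[list, list]:
    """Split without shuffling, so the test part lies strictly after the train part."""
    cut = int(len(rows) * (1 - test_fraction))
    return list(rows[:cut]), list(rows[cut:])


# Toy series (hypothetical data, not taken from seqrep)
series = [10.0, 11.0, 10.5, 12.0, 12.5, 12.0, 13.0, 14.0, 13.5, 15.0]
n_lags = 2

labels = label_next_move(series)            # one label per step except the last
features = previous_values(series, n_lags)  # rows start at index n_lags

# Align both lists: keep only steps with a full feature row and a label.
X = features[:-1]    # the last row has no next value to label
y = labels[n_lags:]  # the first n_lags steps lack enough history

X_train, X_test = chronological_split(X)
y_train, y_test = chronological_split(y)
```

Splitting by position rather than at random matters for sequential data: a shuffled split would let the model train on observations that come after the ones it is tested on.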

The framework is designed for easy customization and extension of its functionality.
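
This README does not show the extension interface. Since the usage example places seqrep extractors inside a scikit-learn `Pipeline`, a custom step presumably follows scikit-learn's `fit`/`transform` convention. The sketch below illustrates that convention in plain Python; `RowRangeExtractor` is a hypothetical class, not part of seqrep, and the base classes seqrep actually expects are an open assumption.

```python
class RowRangeExtractor:
    """Hypothetical custom step: appends (max - min) of each row as a new feature."""

    def fit(self, X, y=None):
        # Stateless step: nothing to learn. Returning self keeps calls chainable.
        return self

    def transform(self, X):
        # Return new rows and leave the input untouched.
        return [list(row) + [max(row) - min(row)] for row in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


rows = [[10.0, 11.0, 10.5], [11.0, 10.5, 12.0]]
extended = RowRangeExtractor().fit_transform(rows)
```

In a real project, such a step would more likely subclass scikit-learn's `BaseEstimator` and `TransformerMixin`, which supply `fit_transform` and parameter handling.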

Installation

python -m pip install git+https://github.com/MIR-MU/seqrep

Features

See the README in the seqrep folder.

Usage

Using the package takes three steps after the imports:

  1. Create the pipeline you want to evaluate;
  2. Create a PipelineEvaluator configured for how you want to evaluate it;
  3. Run the evaluation.

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

from seqrep.feature_engineering import PreviousValuesExtractor, TimeFeaturesExtractor
from seqrep.labeling import NextColorLabeler
from seqrep.splitting import TrainTestSplitter
from seqrep.scaling import UniversalScaler
from seqrep.evaluation import ClassificationEvaluator
from seqrep.pipeline_evaluation import PipelineEvaluator

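# Step 1: create the pipeline you want to evaluate
pipe = Pipeline([
    ("fext_prev", PreviousValuesExtractor()),
    ("fext_time", TimeFeaturesExtractor()),
    ("scale_u", UniversalScaler(scaler=MinMaxScaler())),
])

# Step 2: create the evaluator that wires labeling, splitting, the pipeline,
# the model, and the result evaluation together
pipe_eval = PipelineEvaluator(
    labeler=NextColorLabeler(),
    splitter=TrainTestSplitter(),
    pipeline=pipe,
    model=SVC(),
    evaluator=ClassificationEvaluator(),
)

# Step 3: run the evaluation
# `data` is not defined in this snippet; load your sequential dataset into it first.
result = pipe_eval.run(data=data)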

See the examples folder for more details.
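
For intuition about what the evaluation run chains together, here is a conceptual sketch in plain Python. It is not seqrep's implementation: `run_evaluation`, `MajorityClassModel`, and the helper functions are hypothetical stand-ins, chosen so that the example runs without any dependencies.

```python
class MajorityClassModel:
    """Toy stand-in for a classifier: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.majority_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


def label_next_move(data):
    """Hypothetical labeler: features are single values, label is 1 if the next value rises."""
    X = [[value] for value in data[:-1]]
    y = [1 if nxt > cur else 0 for cur, nxt in zip(data, data[1:])]
    return X, y


def holdout_last_two(X, y):
    """Hypothetical splitter: keep the last two samples for testing, in time order."""
    return (X[:-2], y[:-2]), (X[-2:], y[-2:])


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def run_evaluation(data, labeler, splitter, transform, model, evaluator):
    """Label, split, transform, fit, predict, evaluate, in that order."""
    X, y = labeler(data)
    (X_train, y_train), (X_test, y_test) = splitter(X, y)
    model.fit(transform(X_train), y_train)
    predictions = model.predict(transform(X_test))
    return evaluator(y_test, predictions)


result = run_evaluation(
    data=[1, 2, 3, 2, 4, 5, 4, 6],
    labeler=label_next_move,
    splitter=holdout_last_two,
    transform=lambda rows: rows,  # identity stand-in for a feature pipeline
    model=MajorityClassModel(),
    evaluator=accuracy,
)
```

Passing each stage in as a separate argument mirrors the `PipelineEvaluator` constructor shown above, where the labeler, splitter, pipeline, model, and evaluator can each be swapped independently.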

License

This package is licensed under the MIT license, so it is open source. Feel free to use it!

Acknowledgement

Many thanks to my supervisor Michal Stefanik for his huge support! I am also grateful to all members of the MIR-MU group. Finally, thanks go to the Faculty of Informatics of Masaryk University for supporting this project as a dean's project.
