MISSION

MISSION: Ultra Large-Scale Feature Selection using Count-Sketches

An ICML 2018 paper by Amirali Aghazadeh*, Ryan Spring*, Daniel LeJeune, Gautam Dasarathy, Anshumali Shrivastava, Richard G. Baraniuk

* These authors contributed equally and are listed alphabetically.

Code Versions

  1. Mission Logistic Regression
  2. Mission Softmax Regression
  3. Feature Hashing Softmax Regression
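
For readers unfamiliar with the data structure in the title, here is a minimal Python sketch of a generic count-sketch that stores signed updates for integer feature ids and recovers an estimate by taking the median across rows. It is an illustration only and not the code in this repository: the class name `CountSketch`, the `depth`, `width`, and `seed` parameters, and the hashing scheme built on Python's `hash` are all assumptions made for the example.

```python
import random
import statistics


class CountSketch:
    """Illustrative count-sketch over integer keys (hypothetical, not the repo's code)."""

    def __init__(self, depth=5, width=1024, seed=0):
        rng = random.Random(seed)
        self.depth = depth
        self.width = width
        # One row of counters per hash function.
        self.table = [[0.0] * width for _ in range(depth)]
        # Per-row salts so each row uses a different bucket hash and sign hash.
        self.salts = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(depth)]

    def _bucket_and_sign(self, row, key):
        bucket_salt, sign_salt = self.salts[row]
        bucket = hash((bucket_salt, key)) % self.width
        sign = 1.0 if hash((sign_salt, key)) & 1 else -1.0
        return bucket, sign

    def update(self, key, delta):
        """Add delta to the key's signed counter in every row."""
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            self.table[row][bucket] += sign * delta

    def query(self, key):
        """Estimate the key's accumulated value as the median over rows."""
        estimates = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            estimates.append(sign * self.table[row][bucket])
        return statistics.median(estimates)


sketch = CountSketch()
sketch.update(3, 2.0)
sketch.update(3, 3.0)
print(sketch.query(3))
```

Memory use depends only on `depth * width`, not on the number of distinct features, which is what makes this kind of structure attractive when the feature space is very large.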

Optimizations

  • Mission streams the dataset through memory-mapped I/O instead of loading it all into memory, which is
    necessary for tera-scale datasets.
  • AVX SIMD optimization for fast Softmax Regression
  • The code is currently optimized for the Splice-Site and DNA Metagenomics datasets.
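
The memory-mapped streaming idea in the first bullet above can be sketched in Python with the standard `mmap` module. This is a hedged illustration, not the repository's implementation: the fixed-size record layout (a little-endian int32 feature id followed by a float32 value) and the helper names `write_records` and `stream_records` are assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: int32 feature id, float32 value (8 bytes per record).
RECORD = struct.Struct("<if")


def write_records(path, records):
    """Write (feature_id, value) pairs to a binary file in the assumed layout."""
    with open(path, "wb") as f:
        for feature_id, value in records:
            f.write(RECORD.pack(feature_id, value))


def stream_records(path):
    """Yield (feature_id, value) pairs one at a time from a memory-mapped file.

    The operating system pages data in on demand, so the whole file never
    has to be read into process memory at once.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm) - RECORD.size + 1, RECORD.size):
                yield RECORD.unpack_from(mm, offset)


# Small demonstration on a temporary file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
write_records(demo_path, [(1, 0.5), (7, 2.0), (42, -1.25)])
total = sum(value for _, value in stream_records(demo_path))
print(total)
os.remove(demo_path)
```

Because records are decoded straight from the mapping, the same loop works whether the file is a few bytes or far larger than RAM; only the pages currently being read need to be resident.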

Datasets

  1. KDD 2012
  2. RCV1
  3. Webspam - Trigram
  4. DNA Metagenomics
  5. Criteo 1TB
  6. Splice-Site 3.2TB