Datasets & Models

FELES provides a set of ready-to-use datasets and models for bootstrapping the implementation and comparison of federated learning (FL) algorithms.

The datasets and models are taken from well-known sources and provided by TensorFlow Datasets.

The available datasets are:

| name | task | reference |
| --- | --- | --- |
| mnist | image classification | MNIST |
| fashion_mnist | image classification | Fashion MNIST |
| cifar10 | image classification | CIFAR10 |
| cifar100 | image classification | CIFAR100 |
| imdb_reviews | text classification, sentiment | IMDB Reviews |
| boston_housing | regression | Boston Housing |
| emnist | image classification | EMNIST |
| sentiment140 | text classification, sentiment | Sentiment140 |
| shakespeare | text generation (char level) | Shakespeare |
| wisdm | activity recognition | WISDM |
| oxford_iiit_pet:3.*.* | image segmentation | Oxford Pets |
| tff_cifar100 | image classification | TFF_CIFAR100 |
| tff_emnist | image classification | TFF_EMNIST |
| tff_shakespeare | text generation | TFF_SHAKESPEARE |

MNIST

  • name: mnist
  • description: the MNIST dataset of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
  • url: http://yann.lecun.com/exdb/mnist/
  • source: TensorFlow Datasets
  • IID: yes
  • task: image classification
  • visualization: Know Your Data
  • model: neural network from TensorFlow
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 128)               100480    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
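The parameter counts in the summary above follow from the usual fully connected layer arithmetic: one weight per input-unit pair plus one bias per unit. A minimal sketch in plain Python, assuming 28×28 single-channel inputs (as the Flatten output of 784 suggests); `dense_params` is a helper name introduced here for illustration and is not part of FELES:

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer sizes read off the MNIST summary: 784 -> 128 -> 10.
# Flatten and Dropout layers hold no parameters, so they are skipped.
hidden = dense_params(28 * 28, 128)
output = dense_params(128, 10)
total = hidden + output

print(hidden, output, total)
```

The same arithmetic applies to the Fashion MNIST and EMNIST models, which differ only in the absence of Dropout or in the size of the output layer.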

Fashion MNIST

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_1 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 128)               100480    
_________________________________________________________________
dense_3 (Dense)              (None, 10)                1290      
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________

CIFAR10

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928     
_________________________________________________________________
flatten_2 (Flatten)          (None, 1024)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 64)                65600     
_________________________________________________________________
dense_5 (Dense)              (None, 10)                650       
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
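The output shapes and parameter counts of the convolutional stack can be reproduced by hand. The sketch below assumes 32×32 RGB inputs, 3×3 kernels with stride 1 and no padding, and 2×2 max pooling; these settings are inferred from the shapes in the summary rather than stated in it, and the helper names are illustrative only:

```python
def conv2d_params(kernel: int, in_channels: int, filters: int) -> int:
    """One kernel x kernel x in_channels weight block plus a bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def valid_conv(size: int, kernel: int) -> int:
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


size, channels = 32, 3  # assumed input: 32x32 RGB images
conv_sizes, conv_counts = [], []
# (filters, followed_by_pooling) for the three Conv2D layers in the summary
for filters, pooled in [(32, True), (64, True), (64, False)]:
    conv_counts.append(conv2d_params(3, channels, filters))
    size = valid_conv(size, 3)
    conv_sizes.append(size)
    if pooled:
        size //= 2  # 2x2 max pooling halves each spatial dimension (floor)
    channels = filters

flat = size * size * channels          # Flatten output
dense_hidden = flat * 64 + 64          # Dense(64)
dense_out = 64 * 10 + 10               # Dense(10)
total = sum(conv_counts) + dense_hidden + dense_out

print(conv_sizes, conv_counts, flat, total)
```

Replacing the final layer with 100 units gives the CIFAR100 variant, whose last Dense layer has 64 × 100 + 100 = 6,500 parameters.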

CIFAR100

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_6 (Conv2D)           (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 15, 15, 32)       0         
 2D)                                                             
                                                                 
 conv2d_7 (Conv2D)           (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 6, 6, 64)         0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 4, 4, 64)          36928     
                                                                 
 flatten_5 (Flatten)         (None, 1024)              0         
                                                                 
 dense_20 (Dense)            (None, 64)                65600     
                                                                 
 dense_21 (Dense)            (None, 100)               6500      
                                                                 
=================================================================
Total params: 128,420
Trainable params: 128,420
Non-trainable params: 0
_________________________________________________________________

IMDB Reviews

  • name: imdb_reviews
  • description: Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
  • url: http://ai.stanford.edu/%7Eamaas/data/sentiment/
  • source: TensorFlow Datasets
  • IID: yes
  • task: text classification, sentiment
  • model: neural network from Builtin
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_6 (Dense)              (None, 50)                500050    
_________________________________________________________________
dropout_1 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_7 (Dense)              (None, 50)                2550      
_________________________________________________________________
dropout_2 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_8 (Dense)              (None, 50)                2550      
_________________________________________________________________
dense_9 (Dense)              (None, 1)                 51        
=================================================================
Total params: 505,201
Trainable params: 505,201
Non-trainable params: 0
_________________________________________________________________
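The 500,050 parameters of the first Dense layer imply a 10,000-dimensional input, since (d + 1) × 50 = 500,050 gives d = 10,000. One common way to obtain such an input from text is a multi-hot encoding over a 10,000-word vocabulary. The sketch below assumes that encoding, which the summary itself does not confirm; `multi_hot` is a hypothetical helper, not a FELES function:

```python
def infer_input_dim(n_params: int, n_units: int) -> int:
    """Invert n_params = (d + 1) * n_units for the input dimension d."""
    return n_params // n_units - 1


def multi_hot(word_indices, dim):
    """Assumed preprocessing: set position i to 1.0 if word i occurs."""
    vec = [0.0] * dim
    for i in word_indices:
        if 0 <= i < dim:  # indices outside the vocabulary are ignored
            vec[i] = 1.0
    return vec


dim = infer_input_dim(500_050, 50)
# A toy review given as word indices; 12_345 falls outside the vocabulary.
example = multi_hot([3, 17, 17, 9_999, 12_345], dim)

print(dim, len(example), sum(example))
```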

Boston Housing

  • name: boston_housing
  • description: this dataset is taken from the StatLib library which is maintained at Carnegie Mellon University. Samples contain 13 attributes of houses at different locations around the Boston suburbs in the late 1970s. Targets are the median values of the houses at a location (in k$).
  • url: http://lib.stat.cmu.edu/datasets/boston
  • source: TensorFlow Datasets
  • IID: yes
  • task: regression
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_15 (Dense)            (None, 64)                896       
                                                                 
 dense_16 (Dense)            (None, 64)                4160      
                                                                 
 dense_17 (Dense)            (None, 1)                 65        
                                                                 
=================================================================
Total params: 5,121
Trainable params: 5,121
Non-trainable params: 0
_________________________________________________________________

EMNIST

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_3 (Flatten)         (None, 784)               0         
                                                                 
 dense_6 (Dense)             (None, 128)               100480    
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_7 (Dense)             (None, 62)                7998      
                                                                 
=================================================================
Total params: 108,478
Trainable params: 108,478
Non-trainable params: 0
_________________________________________________________________

Sentiment140

  • name: sentiment140
  • description: Sentiment140 allows you to discover the sentiment of a brand, product, or topic on Twitter. The data is a CSV file with emoticons removed. It has 6 fields: (0) the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive); (1) the id of the tweet (e.g. 2087); (2) the date of the tweet (e.g. Sat May 16 23:58:44 UTC 2009); (3) the query (e.g. lyx), or NO_QUERY if there is no query; (4) the user that tweeted (e.g. robotickilldozr); (5) the text of the tweet (e.g. Lyx is cool).
  • url: http://help.sentiment140.com/home
  • source: Stanford Datasets
  • IID: yes
  • task: text classification, sentiment
  • model: neural network from Builtin
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_24 (Dense)            (None, 50)                500050    
                                                                 
 dropout_4 (Dropout)         (None, 50)                0         
                                                                 
 dense_25 (Dense)            (None, 50)                2550      
                                                                 
 dropout_5 (Dropout)         (None, 50)                0         
                                                                 
 dense_26 (Dense)            (None, 50)                2550      
                                                                 
 dense_27 (Dense)            (None, 1)                 51        
                                                                 
=================================================================
Total params: 505,201
Trainable params: 505,201
Non-trainable params: 0
_________________________________________________________________

Shakespeare

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, None, 65)]        0         
                                                                 
 lstm (LSTM)                 [(None, None, 128),       99328     
                              (None, 128),                       
                              (None, 128)]                       
                                                                 
 lstm_1 (LSTM)               [(None, None, 128),       131584    
                              (None, 128),                       
                              (None, 128)]                       
                                                                 
 dense (Dense)               (None, None, 65)          8385      
                                                                 
=================================================================
Total params: 239,297
Trainable params: 239,297
Non-trainable params: 0
_________________________________________________________________
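The LSTM counts above can be checked with the standard formula: each of the four gates has an input kernel, a recurrent kernel, and a bias. The sketch assumes a 65-character vocabulary fed as one-hot vectors (matching the input shape in the summary) and LSTM layers with bias terms; `lstm_params` is an illustrative helper:

```python
def lstm_params(n_inputs: int, n_units: int) -> int:
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (n_inputs + n_units + 1) * n_units


vocab, units = 65, 128  # sizes read off the summary
lstm_0 = lstm_params(vocab, units)   # first LSTM sees the 65-dim input
lstm_1 = lstm_params(units, units)   # second LSTM sees the 128-dim sequence
dense = units * vocab + vocab        # per-timestep projection back to 65 chars
total = lstm_0 + lstm_1 + dense

print(lstm_0, lstm_1, dense, total)
```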

WISDM

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_17 (Conv2D)          (None, 79, 2, 16)         80        
                                                                 
 dropout_6 (Dropout)         (None, 79, 2, 16)         0         
                                                                 
 conv2d_18 (Conv2D)          (None, 78, 1, 32)         2080      
                                                                 
 dropout_7 (Dropout)         (None, 78, 1, 32)         0         
                                                                 
 flatten_7 (Flatten)         (None, 2496)              0         
                                                                 
 dense_28 (Dense)            (None, 64)                159808    
                                                                 
 dropout_8 (Dropout)         (None, 64)                0         
                                                                 
 dense_29 (Dense)            (None, 6)                 390       
                                                                 
=================================================================
Total params: 162,358
Trainable params: 162,358
Non-trainable params: 0
_________________________________________________________________

Oxford Pets

  • name: oxford_iiit_pet:3.*.*
  • description: The Oxford-IIIT pet dataset is a 37 category pet image dataset with roughly 200 images for each class. The images have large variations in scale, pose and lighting. All images have an associated ground truth annotation of breed.
  • url: http://www.robots.ox.ac.uk/~vgg/data/pets/
  • source: TensorFlow Datasets
  • IID: yes
  • task: image segmentation
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 vgg16 (Functional)          (None, 4, 4, 512)         14714688  
                                                                 
 up_sampling2d (UpSampling2D  (None, 8, 8, 512)        0         
 )                                                               
                                                                 
 conv2d_11 (Conv2D)          (None, 8, 8, 256)         1179904   
                                                                 
 re_lu (ReLU)                (None, 8, 8, 256)         0         
                                                                 
 up_sampling2d_1 (UpSampling  (None, 16, 16, 256)      0         
 2D)                                                             
                                                                 
 conv2d_12 (Conv2D)          (None, 16, 16, 128)       295040    
                                                                 
 re_lu_1 (ReLU)              (None, 16, 16, 128)       0         
                                                                 
 up_sampling2d_2 (UpSampling  (None, 32, 32, 128)      0         
 2D)                                                             
                                                                 
 conv2d_13 (Conv2D)          (None, 32, 32, 64)        73792     
                                                                 
 re_lu_2 (ReLU)              (None, 32, 32, 64)        0         
                                                                 
 up_sampling2d_3 (UpSampling  (None, 64, 64, 64)       0         
 2D)                                                             
                                                                 
 conv2d_14 (Conv2D)          (None, 64, 64, 32)        18464     
                                                                 
 re_lu_3 (ReLU)              (None, 64, 64, 32)        0         
                                                                 
 up_sampling2d_4 (UpSampling  (None, 128, 128, 32)     0         
 2D)                                                             
                                                                 
 conv2d_15 (Conv2D)          (None, 128, 128, 16)      4624      
                                                                 
 re_lu_4 (ReLU)              (None, 128, 128, 16)      0         
                                                                 
 conv2d_16 (Conv2D)          (None, 128, 128, 21)      357       
                                                                 
=================================================================
Total params: 16,286,869
Trainable params: 1,572,181
Non-trainable params: 14,714,688
_________________________________________________________________
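The split between trainable and non-trainable parameters is consistent with a frozen VGG16 backbone and a trainable decoder. The sketch below recomputes the split under the assumptions that the decoder convolutions use 3×3 kernels with padding that preserves the spatial size, that the last layer is a 1×1 convolution, and that each upsampling step doubles height and width; all of these are inferred from the summary, not stated in it:

```python
def conv2d_params(kernel: int, in_channels: int, filters: int) -> int:
    """kernel x kernel x in_channels weights per filter, plus one bias each."""
    return kernel * kernel * in_channels * filters + filters


VGG16_PARAMS = 14_714_688  # taken from the summary; assumed frozen

# (kernel, in_channels, filters) for conv2d_11 .. conv2d_16
decoder = [
    (3, 512, 256),
    (3, 256, 128),
    (3, 128, 64),
    (3, 64, 32),
    (3, 32, 16),
    (1, 16, 21),  # 1x1 convolution producing 21 output channels
]
trainable = sum(conv2d_params(*layer) for layer in decoder)
total = trainable + VGG16_PARAMS

# Five x2 upsampling steps take the 4x4 feature map back to 128x128.
size = 4
for _ in range(5):
    size *= 2

print(trainable, total, size)
```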

TFF_CIFAR100

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_6 (Conv2D)           (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 15, 15, 32)       0         
 2D)                                                             
                                                                 
 conv2d_7 (Conv2D)           (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 6, 6, 64)         0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 4, 4, 64)          36928     
                                                                 
 flatten_5 (Flatten)         (None, 1024)              0         
                                                                 
 dense_20 (Dense)            (None, 64)                65600     
                                                                 
 dense_21 (Dense)            (None, 100)               6500      
                                                                 
=================================================================
Total params: 128,420
Trainable params: 128,420
Non-trainable params: 0
_________________________________________________________________

TFF_EMNIST

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_3 (Flatten)         (None, 784)               0         
                                                                 
 dense_6 (Dense)             (None, 128)               100480    
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_7 (Dense)             (None, 62)                7998      
                                                                 
=================================================================
Total params: 108,478
Trainable params: 108,478
Non-trainable params: 0
_________________________________________________________________

TFF_SHAKESPEARE

  • name: tff_shakespeare
  • description: a federated version of the Shakespeare dataset. The dataset consists of 715 users (characters of Shakespeare plays), where each example corresponds to a contiguous set of lines spoken by the character in a given play. The dataset is composed of 16,068 train examples and 2,356 test examples.
  • url: https://github.com/TalwalkarLab/leaf
  • source: TensorFlow Datasets
  • IID: no
  • task: text generation
  • model: neural network from TensorFlow
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding (Embedding)       multiple                  22016     
                                                                 
 gru (GRU)                   multiple                  394752    
                                                                 
 dense (Dense)               multiple                  22102     
                                                                 
=================================================================
Total params: 438,870
Trainable params: 438,870
Non-trainable params: 0
_________________________________________________________________
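The three counts above are consistent with a vocabulary of 86 symbols, a 256-dimensional embedding, and a 256-unit GRU. These sizes are inferred from the parameter counts, not stated in the summary, and the GRU formula assumes the Keras default variant with two bias vectors per gate (`reset_after=True`). A minimal check in plain Python:

```python
def embedding_params(vocab: int, dim: int) -> int:
    """One dim-sized vector per vocabulary entry."""
    return vocab * dim


def gru_params(n_inputs: int, n_units: int) -> int:
    """Three gates, each with input weights, recurrent weights and two biases."""
    return 3 * (n_inputs * n_units + n_units * n_units + 2 * n_units)


vocab, embed_dim, units = 86, 256, 256  # inferred sizes (assumption)
embedding = embedding_params(vocab, embed_dim)
gru = gru_params(embed_dim, units)
dense = units * vocab + vocab  # projection back onto the vocabulary
total = embedding + gru + dense

print(embedding, gru, dense, total)
```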