Results of experiments with entropy regularization, using a toy CNN classifier for pixelated MNIST images.
Entropy regularization for a multi-class image classification problem can be written as

$$\mathcal{L} = \mathcal{L}_{\mathrm{CE}} - \lambda\, H(\hat{p}), \qquad H(\hat{p}) = -\sum_{i=1}^{C} \hat{p}_i \log \hat{p}_i,$$

where $\mathcal{L}_{\mathrm{CE}}$ is the standard cross-entropy loss, $\hat{p}_i$ is the model's predicted probability for class $i$, $C$ is the number of classes, and $\lambda \ge 0$ controls the strength of the entropy term (with this sign convention, higher entropy is rewarded).
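As a concrete illustration, here is a minimal PyTorch sketch of such a loss. The function name `entropy_regularized_loss`, the coefficient `lam`, and its default value are assumptions for illustration, not taken from this repository's code.

```python
import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits, targets, lam=0.1):
    """Cross-entropy loss with an entropy bonus on the predicted distribution."""
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=-1)
    # H(p) = -sum_i p_i * log(p_i), averaged over the batch;
    # the clamp avoids log(0) for saturated probabilities.
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=-1).mean()
    # lam > 0 rewards high-entropy (less overconfident) predictions.
    return ce - lam * entropy
```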
The context for this work is using such classifiers as a source of rewards for reinforcement learning. The original application was fine-tuning CLIP models so that they produce less noise and their semantic-entropy reward trajectories are smoother. The expectation is that this would lead to denser rewards, reduced semantic bias on random images (fewer misclassifications, i.e. improved specificity), and possibly reduced class preference in CLIP's outputs.
This repository presents the results of training an MNIST classifier with this entropy regularization in tandem with an augmented training dataset: the added samples are random images that should ideally be classified with equal probability across all labels.
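How the random images are generated and penalized is detailed in the code; the sketch below is one plausible version, assuming uniform pixel noise for the random images and a cross-entropy penalty against the uniform label distribution. The names `random_image_batch` and `uniform_target_loss` are hypothetical.

```python
import torch
import torch.nn.functional as F

def random_image_batch(batch_size, shape=(1, 28, 28)):
    """Uniform-noise images standing in for the random augmentation samples."""
    return torch.rand(batch_size, *shape)

def uniform_target_loss(logits):
    """Cross-entropy against the uniform distribution over classes.

    Minimized exactly when the model assigns equal probability to every
    label, which is the desired behavior on random (meaningless) images.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    num_classes = logits.size(-1)
    uniform = torch.full_like(log_probs, 1.0 / num_classes)
    return -(uniform * log_probs).sum(dim=-1).mean()
```

In a training loop, a batch of these noise images would contribute `uniform_target_loss(model(random_image_batch(n)))` alongside the usual loss on real MNIST batches.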
The results can be visualized here.
Click here to see the model summary and comparisons.
For architecture details of flatnet, please see the code.