Membership inference attacks can be carried out against neural networks to infer the presence of data records in the training dataset, breaching user privacy. Black-box membership inference attacks can be executed simply by analyzing metrics like entropy, standard deviation, and maximum posterior probability of output prediction vectors, as these metrics have different distributions for members and non-members of the training dataset.
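As a rough illustration of the metrics named above, the following sketch computes entropy, standard deviation, and maximum posterior probability for a single prediction vector. The function name `prediction_metrics` and the two example vectors are hypothetical and chosen only for illustration; they are not taken from the notebooks.

```python
import math
from statistics import pstdev


def prediction_metrics(probs):
    """Return entropy, standard deviation, and max posterior of one prediction vector."""
    # Shannon entropy in nats; zero-probability entries contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return {
        "entropy": entropy,
        "std": pstdev(probs),  # population standard deviation over the classes
        "max_prob": max(probs),
    }


# Hypothetical vectors: a confident output (often typical of training members)
# and a less confident one (often typical of non-members).
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.55, 0.25, 0.15, 0.05]

m_conf = prediction_metrics(confident)
m_unc = prediction_metrics(uncertain)
print(m_conf)
print(m_unc)
```

An attacker who sees only model outputs could threshold any of these values. The confident vector has lower entropy and a higher standard deviation and maximum probability than the uncertain one, which is the kind of gap the attack exploits.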
The Metric Mapping algorithm aims to eliminate the differences between the metric distributions of members and non-members by manipulating the output prediction vectors while keeping the predicted label the same. This ensures that utility is not compromised in order to preserve privacy.
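To make the idea of a label-preserving manipulation concrete, here is a minimal sketch. It is not the Metric Mapping algorithm itself. It swaps in a simple stand-in technique, temperature rescaling with a bisection search, to move a vector's entropy to a chosen target without changing which class has the highest probability. The names `rescale` and `match_entropy`, the target value, and the search bounds are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rescale(probs, temperature):
    """Raise each probability to the power 1/temperature and renormalize.

    The power is monotone for temperature > 0, so the argmax is unchanged.
    """
    logs = [math.log(max(p, 1e-12)) / temperature for p in probs]
    top = max(logs)
    exps = [math.exp(v - top) for v in logs]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def match_entropy(probs, target, lo=0.05, hi=50.0, steps=60):
    """Bisect on temperature until the rescaled vector's entropy is near target.

    Assumes the target lies between the entropies reached at the lo and hi bounds.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        if entropy(rescale(probs, mid)) < target:
            lo = mid  # too peaked: raise the temperature
        else:
            hi = mid  # too flat: lower the temperature
    return rescale(probs, (lo + hi) / 2)


# Hypothetical member-like vector and a hypothetical target entropy.
member_vec = [0.97, 0.01, 0.01, 0.01]
mapped = match_entropy(member_vec, target=1.0)
print(mapped, entropy(mapped))
```

The mapped vector still predicts class 0, so accuracy is unaffected, but its entropy no longer reflects the model's original confidence. How the target is actually chosen, and which metrics are matched, is defined by the notebooks rather than by this sketch.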
The Jupyter notebooks contain experiments that measure how effectively metric mapping mitigates black-box membership inference attacks on different datasets in the presence of class imbalance and overfitting. The algorithm is also compared with DP-SGD, DP-Logits, and Dropout in terms of preserving privacy and maintaining utility.