Clean Test Accuracy and Adversarial Training via Average Attack

30 Sept 2024

Authors:

(1) Seokil Ham, KAIST;

(2) Jungwuk Park, KAIST;

(3) Dong-Jun Han, Purdue University;

(4) Jaekyun Moon, KAIST.

Abstract and 1. Introduction

2. Related Works

3. Proposed NEO-KD Algorithm and 3.1 Problem Setup: Adversarial Training in Multi-Exit Networks

3.2 Algorithm Description

4. Experiments and 4.1 Experimental Setup

4.2. Main Experimental Results

4.3. Ablation Studies and Discussions

5. Conclusion, Acknowledgement and References

A. Experiment Details

B. Clean Test Accuracy and C. Adversarial Training via Average Attack

D. Hyperparameter Tuning

E. Discussions on Performance Degradation at Later Exits

F. Comparison with Recent Defense Methods for Single-Exit Networks

G. Comparison with SKD and ARD and H. Implementations of Stronger Attacker Algorithms

B Clean Test Accuracy

Table A1 reports the clean test accuracy of the model trained with adversarial training via the max-average attack. We observe that NEO-KD generally achieves clean test accuracy comparable to Adv. w/o Distill [12], especially on the more complex Tiny-ImageNet dataset [17], while achieving much better adversarial test accuracy, as reported in the main manuscript.
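
For concreteness, the sketch below shows how per-exit clean test accuracy can be measured for a multi-exit classifier. It assumes the model returns a list of per-exit logits; the helper name clean_accuracy_per_exit and this calling convention are illustrative assumptions, not the paper's actual evaluation code.

```python
import torch

@torch.no_grad()
def clean_accuracy_per_exit(model, loader, device="cuda"):
    """Clean test accuracy at each exit of a multi-exit network.
    Assumes model(x) returns a list of logit tensors, one per exit."""
    model.eval()
    correct, total = None, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        logits_per_exit = model(x)              # list of [B, num_classes]
        if correct is None:
            correct = [0] * len(logits_per_exit)
        for i, logits in enumerate(logits_per_exit):
            correct[i] += (logits.argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return [c / total for c in correct]        # accuracy per exit
```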

C Adversarial Training via Average Attack

In the main manuscript, we presented experimental results using the model trained with the max-average attack. Here, we also adversarially train the model via the average attack [12] and measure adversarial test accuracy on the CIFAR-100 dataset. Table A2 compares the adversarial test accuracies of NEO-KD and the other baselines against the max-average attack and the average attack. The overall results are consistent with those in the main manuscript obtained with adversarial training via the max-average attack, further confirming the advantage of NEO-KD.
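
As an illustration, the following is a minimal PGD-style sketch of an average attack in the spirit of [12]: the perturbation is updated to maximize the cross-entropy loss averaged over all exits. The assumption that the model returns a list of per-exit logits, the function name average_attack, and the hyperparameters eps, alpha, and steps are placeholders, not the exact settings used in the experiments.

```python
import torch
import torch.nn.functional as F

def average_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """PGD-style average attack on a multi-exit network (sketch).
    Maximizes the mean cross-entropy loss over all exits; assumes
    model(x) returns a list of logit tensors, one per exit."""
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
    x_adv = torch.clamp(x_adv, 0.0, 1.0)

    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits_per_exit = model(x_adv)          # list of [B, num_classes]
        loss = torch.stack(
            [F.cross_entropy(logits, y) for logits in logits_per_exit]
        ).mean()                                # average loss over exits
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project to eps-ball
        x_adv = torch.clamp(x_adv, 0.0, 1.0)

    return x_adv.detach()
```

During adversarial training, these crafted examples would replace (or augment) the clean batch before the usual training step; the max-average variant differs only in how the attack objective is chosen across exits.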

This paper is available on arxiv under CC 4.0 license.