Hi there, I am Jiacheng Zhang, a Ph.D. candidate in the Trustworthy Machine Learning and Reasoning (TMLR) Group in the Faculty of Engineering and Information Technology at the University of Melbourne, advised by Prof. Feng Liu and Prof. Ben Rubinstein. Before that, I obtained my Honours degree from the University of Sydney, where I was fortunate to learn from Prof. Tongliang Liu.
I am passionate about advancing the field of trustworthy machine learning. My research interests lie in improving the Robustness and Safety of AI systems at all levels.
I firmly believe that before AI can achieve widespread societal adoption, its foundations must be trustworthy and safe.
Please feel free to email me about research, collaborations, or just for a casual chat.
Adversarial training (AT) trains models on adversarial examples (AEs), which are natural images modified with carefully crafted perturbations that mislead the model. These perturbations are constrained by a predefined perturbation budget ϵ, which is applied equally to every pixel within an image. However, in this paper, we discover that not all pixels contribute equally to the accuracy on AEs (i.e., robustness) and the accuracy on natural images (i.e., accuracy). Motivated by this finding, we propose Pixel-reweighted AdveRsarial Training (PART), a new framework that partially reduces ϵ for less influential pixels, guiding the model to focus more on the key regions that affect its outputs. Specifically, we first use class activation mapping (CAM) methods to identify important pixel regions; when generating AEs, we keep the full perturbation budget for these regions while lowering it for the remaining regions. Finally, we train the model on these pixel-reweighted AEs. PART achieves a notable improvement in accuracy without compromising robustness on CIFAR-10, SVHN and TinyImagenet-200, justifying the necessity of allocating distinct weights to different pixel regions in robust classification.
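The core idea above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it assumes a precomputed CAM saliency map, a hypothetical activation threshold of 0.5 for splitting "important" from "less influential" pixels, and a single FGSM-style gradient step (the actual attack in AT pipelines is typically multi-step PGD). The budget values are the common ℓ∞ defaults (8/255 full, 4/255 reduced), chosen here for illustration only.

```python
import numpy as np

def pixel_eps_map(cam, eps_high=8/255, eps_low=4/255, thresh=0.5):
    """Build a per-pixel budget: keep the full budget eps_high where the
    CAM activation is high, and a reduced budget eps_low elsewhere.
    The 0.5 threshold is an illustrative assumption, not the paper's choice."""
    return np.where(cam >= thresh, eps_high, eps_low)

def reweighted_fgsm_step(x, grad, eps_map):
    """One FGSM-style step with a pixel-wise budget instead of a uniform eps:
    each pixel moves by at most its own entry in eps_map."""
    x_adv = x + eps_map * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)  # keep the AE a valid image
```

A training loop would then generate AEs with `reweighted_fgsm_step` (or its iterative analogue) and minimize the classification loss on them, exactly as in standard AT but with the non-uniform `eps_map`.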