Xingyi Zhao

PhD Student, Utah State University • NLP & AI Security

I am a final-year Ph.D. student in Computer Science at Utah State University, advised by Dr. Shuhan Yuan. I received both my B.Eng. and M.Eng. from the School of Artificial Intelligence and Automation, Huazhong University of Science and Technology (HUST). My research focuses on LLM security and safe post-training, with interests in faithful reasoning, red-teaming, and alignment-oriented training methods (e.g., SFT and RLHF). My recent work explores how malicious behaviors emerge and persist in large language models, and I develop practical methods for malicious alignment analysis, harmful information unlearning, and safety post-training while preserving model utility.

Selected Publications
Don't Shift the Trigger: Robust Gradient Ascent for Backdoor Unlearning
Xingyi Zhao, Tian Xie, Xiaojun Qi, Depeng Xu, Shuhan Yuan
In Proceedings of the 14th International Conference on Learning Representations (ICLR), 2026
Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normalization
Xingyi Zhao, Depeng Xu, Shuhan Yuan
In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024
Generating Textual Adversaries with Minimal Perturbation
Xingyi Zhao, Lu Zhang, Depeng Xu, Shuhan Yuan
In Findings of the Association for Computational Linguistics: EMNLP 2022