Zijun Wang

I am a PhD student at the University of California, Santa Cruz (UCSC), advised by Prof. Cihang Xie. I received my BS from Zhejiang University.

My research interests mainly revolve around AI Safety.

I am always open to research discussions and collaborations : )

Email  /  Google Scholar  /  GitHub  /  CV  /  Twitter


News

[Jan. 2025] One paper accepted to TMLR 2025

[Dec. 2024] One paper accepted to KDD 2025

[Jul. 2024] One paper accepted to ECCV 2024

[Dec. 2023] Second Place in both base & large model subtracks of Red Teaming LLM@NeurIPS 2023, Trojan Detection Challenge

Publications

STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
Zijun Wang, Haoqin Tu, Yuhan Wang, Juncheng Wu, Jieru Mei, Brian R. Bartoldson, Bhavya Kailkhura, Cihang Xie
Technical Report, 2025


AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation
Zijun Wang, Haoqin Tu, Jieru Mei, Bingchen Zhao, Yisen Wang, Cihang Xie
TMLR, 2025


How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs
Haoqin Tu*, Chenhang Cui*, Zijun Wang*, Yiyang Zhou, Bingchen Zhao, Junlin Han, Wangchunshu Zhou, Huaxiu Yao, Cihang Xie (* represents equal contribution)
ECCV, 2024


Handling Feature Heterogeneity with Learnable Graph Patches
Yifei Sun, Yang Yang, Xiao Feng, Zijun Wang, Haoyang Zhong, Chunping Wang, Lei Chen
KDD, 2025


Experience

Sep. 2024 - Present, VLAA Lab, UC Santa Cruz

PhD student advised by Prof. Cihang Xie, AI Safety

Aug. 2023 - Aug. 2024, VLAA Lab, UC Santa Cruz

Visiting Research Intern advised by Prof. Cihang Xie, Adversarial Attacks on LLMs & VLLMs

Jan. 2023 - Jul. 2023, Zhejiang University

Research Assistant advised by Prof. Yang Yang, Generalized Graph Pre-training

Sep. 2020 - Jun. 2024, Zhejiang University

Undergrad, GPA: 3.92/4.0

Awards

- National Scholarship issued by Ministry of Education of the People's Republic of China

- First-class Scholarship of Zhejiang University

- Provincial Government Scholarship of Zhejiang Province


Last Update 2025.01.24. Thanks to Jon Barron.