SAFE-QAQ: End-to-End Slow-Thinking Audio-Text Fraud Detection via Reinforcement Learning
Paper • 2601.01392 • Published
AntiFraud-SFT is a supervised fine-tuned audio-text fraud detection model built on top of Qwen2-Audio for Chinese telecom fraud analysis.
This model is trained on the TeleAntiFraud-28k dataset and is designed for:
The current release is intended as a research model checkpoint for reproduction and further study.
Qwen/Qwen2-Audio-7B-InstructQwen2AudioForConditionalGenerationsafetensorsThis repository contains the model weights and tokenizer / processor files required for inference with transformers.
Example loading code:
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration
model_id = "JimmyMa99/AntiFraud-SFT"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2AudioForConditionalGeneration.from_pretrained(
model_id,
device_map="auto",
)
evaluation/ directory in the TeleAntiFraud repository.@inproceedings{ma2025teleantifraud,
title={TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection},
author={Ma, Zhiming and Wang, Peidong and Huang, Minhua and Wang, Jinpeng and Wu, Kai and Lv, Xiangzhao and Pang, Yachun and Yang, Yin and Tang, Wenjie and Kang, Yuchen},
booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
pages={5853--5862},
year={2025}
}
@article{wang2026safe,
title={SAFE-QAQ: End-to-End Slow-Thinking Audio-Text Fraud Detection via Reinforcement Learning},
author={Wang, Peidong and Ma, Zhiming and Dai, Xin and Liu, Yongkang and Feng, Shi and Yang, Xiaocui and Hu, Wenxing and Wang, Zhihao and Pan, Mingjun and Yuan, Li and others},
journal={arXiv preprint arXiv:2601.01392},
year={2026}
}