- [2025-10-21] 📢 Our work was covered by Synced (机器之心)!
- [2025-10-10] ✨ Code is now available.
- [2025-09-30] 📄 Our paper is released on arXiv.
Clone the repository:

```bash
git clone https://github.com/RyanLiu112/AttnRL.git
cd AttnRL
```

Create a new conda environment and install the dependencies:

```bash
conda create -n attnrl python=3.10
conda activate attnrl
bash scripts/install_vllm_sglang_mcore.sh
```
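After the install script completes, a quick sanity check can confirm the environment is usable. This is a minimal sketch; it assumes the script installed vLLM and a CUDA-enabled PyTorch into the `attnrl` environment:

```bash
# Verify core dependencies are importable and a GPU is visible.
python -c "import torch, vllm; print(torch.__version__, vllm.__version__, torch.cuda.is_available())"
```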
The training dataset (DeepScaleR-Preview-Dataset) is located at `data/train/deepscaler_train.parquet` and contains 40.3k mathematical reasoning problems. The evaluation datasets are in `data/eval/`, where the suffix `_${K}` in a filename indicates the number of duplicated samples per question.
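To see how the data is laid out, the parquet files can be inspected directly. This is an illustrative sketch, assuming `pandas` and `pyarrow` are available in the environment; the evaluation filename mentioned in the comment is hypothetical:

```bash
# Peek at the training set: row count and column schema.
python -c "
import pandas as pd
df = pd.read_parquet('data/train/deepscaler_train.parquet')
print(df.shape)              # expect ~40.3k rows
print(df.columns.tolist())
"

# List the evaluation sets; a file like aime24_32.parquet (hypothetical name)
# would contain each question duplicated 32 times.
ls data/eval/
```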
For training AttnRL with the DeepSeek-R1-Distill-Qwen-1.5B backbone on 8 H100 GPUs, run:

```bash
bash recipe/attnrl/run_attnrl_r1_distill_1.5b_8k.sh
```

The evaluation scripts are the same as the training scripts: add `+trainer.val_only=True` to perform evaluation only. We recommend setting `data.max_prompt_length=2048` and `data.max_response_length=32768`; see the sketch below.
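For example, an evaluation-only run might look like the following. This is a hedged sketch: it assumes the launch script forwards extra Hydra-style overrides to the trainer, as is common in verl recipes:

```bash
# Evaluation-only run with the recommended prompt/response lengths.
bash recipe/attnrl/run_attnrl_r1_distill_1.5b_8k.sh \
    +trainer.val_only=True \
    data.max_prompt_length=2048 \
    data.max_response_length=32768
```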
If you find this work helpful, please cite our paper:
```bibtex
@article{AttnRL,
    title   = {Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models},
    author  = {Liu, Runze and Wang, Jiakang and Shi, Yuling and Xie, Zhihui and An, Chenxin and Zhang, Kaiyan and Zhao, Jian and Gu, Xiaodong and Lin, Lei and Hu, Wenping and Li, Xiu and Zhang, Fuzheng and Zhou, Guorui and Gai, Kun},
    journal = {arXiv preprint arXiv:2509.26628},
    year    = {2025}
}
```

Our code is built on verl (commit) and TreeRL. The training dataset is from DeepScaleR-Preview-Dataset, and the rule-based verifier is based on Skywork-OR1 and Archer.