Description
Describe the bug
I tried to convert and quantize Qwen3-1.7B for the Qualcomm NPU and got this error:
Error Unrecognized attribute: qk_output for operator GroupQueryAttention
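One way to narrow this down, assuming an intermediate ONNX file from the failed run is available on disk, is to check which `GroupQueryAttention` nodes carry a `qk_output` attribute. The sketch below is only a diagnostic aid, not a fix. The helper names and the model path are hypothetical, and only the top-level graph is scanned (subgraphs are ignored).

```python
from types import SimpleNamespace


def find_nodes_with_attribute(nodes, op_type, attr_name):
    """Return the names of nodes of `op_type` that carry an attribute named `attr_name`."""
    hits = []
    for node in nodes:
        if node.op_type != op_type:
            continue
        if any(attr.name == attr_name for attr in node.attribute):
            hits.append(node.name)
    return hits


def scan_model(model_path):
    """Scan the top-level graph of an ONNX file (path supplied by the caller)."""
    # Imported lazily so the helper above can be tried without onnx installed.
    import onnx

    # Only the graph structure is needed, so external weight files are not loaded.
    model = onnx.load(model_path, load_external_data=False)
    return find_nodes_with_attribute(model.graph.node, "GroupQueryAttention", "qk_output")


# Stand-in nodes (not a real model) that mimic the fields read above.
demo_nodes = [
    SimpleNamespace(name="gqa_0", op_type="GroupQueryAttention",
                    attribute=[SimpleNamespace(name="num_heads"),
                               SimpleNamespace(name="qk_output")]),
    SimpleNamespace(name="gqa_1", op_type="GroupQueryAttention",
                    attribute=[SimpleNamespace(name="num_heads")]),
    SimpleNamespace(name="matmul_0", op_type="MatMul", attribute=[]),
]
demo_hits = find_nodes_with_attribute(demo_nodes, "GroupQueryAttention", "qk_output")
print(demo_hits)
```

On a real file, `scan_model("path/to/model.onnx")` (placeholder path) would return the matching node names, which can then be pasted into the report.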
To Reproduce
- Activate the environment created by the VS Code AI Toolkit: Python-NvidiaGPU-win32-x64-3.12.9
- Run the command below:
olive optimize -m Qwen/Qwen3-1.7B --provider QNNExecutionProvider --device npu --precision int4 --num_split 4 --enable_aot --qnn_env_path ..\Python-QNN-win32-x64-3.12.9\Scripts --surgeries RemoveRopeMultiCache,AttentionMaskToSequenceLengths,SimplifiedLayerNormToL2Norm --act_precision uint16 --use_qdq_format --log_level 1
Expected behavior
The model should be converted successfully.
Olive config
My Olive command is as follows:
olive optimize -m Qwen/Qwen3-1.7B --provider QNNExecutionProvider --device npu --precision int4 --num_split 4 --enable_aot --qnn_env_path ..\Python-QNN-win32-x64-3.12.9\Scripts --surgeries RemoveRopeMultiCache,AttentionMaskToSequenceLengths,SimplifiedLayerNormToL2Norm --act_precision uint16 --use_qdq_format --log_level 1
And here is my environment:
Using Python 3.12.9 environment at: .
Package Version
accelerate 1.10.1
aiohappyeyeballs 2.6.1
aiohttp 3.11.16
aiosignal 1.4.0
alembic 1.16.5
annotated-types 0.7.0
attrs 25.3.0
auto-gptq 0.8.0.dev0+cu128
certifi 2025.8.3
charset-normalizer 3.4.3
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.9.0
datasets 3.5.0
dill 0.3.8
filelock 3.18.0
flatbuffers 25.2.10
frozenlist 1.7.0
fsspec 2024.12.0
gekko 1.3.0
greenlet 3.2.4
hf-xet 1.1.9
huggingface-hub 0.34.4
humanfriendly 10.0
idna 3.10
jinja2 3.1.6
lightning-utilities 0.15.2
mako 1.3.10
markupsafe 3.0.2
ml-dtypes 0.5.3
mpmath 1.3.0
multidict 6.6.4
multiprocess 0.70.16
networkx 3.4.2
numpy 2.2.4
olive-ai 0.9.3
onnx 1.17.0
onnx-ir 0.1.5
onnxruntime-genai-cuda 0.9.0
onnxruntime-gpu 1.23.2
onnxscript 0.3.2
optimum 1.26.0
optuna 4.3.0
packaging 24.2
pandas 2.2.3
peft 0.17.1
pillow 11.2.1
propcache 0.3.2
protobuf 3.20.3
psutil 7.1.0
pyarrow 19.0.1
pydantic 2.11.3
pydantic-core 2.33.1
pyreadline3 3.5.4
python-dateutil 2.9.0.post0
pytz 2025.2
pyyaml 6.0.2
regex 2025.9.1
requests 2.32.3
rouge 1.0.1
safetensors 0.6.2
sentencepiece 0.2.1
setuptools 80.9.0
six 1.17.0
sqlalchemy 2.0.43
sympy 1.13.3
tabulate 0.9.0
threadpoolctl 3.6.0
tokenizers 0.21.4
torch 2.7.0+cu128
torchmetrics 1.7.1
torchvision 0.22.0+cu128
tqdm 4.67.1
transformers 4.51.3
typing-extensions 4.15.0
typing-inspection 0.4.1
tzdata 2025.2
urllib3 2.5.0
xxhash 3.5.0
yarl 1.20.1
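
Since the full package list is long, a short filter can pull out the entries most likely to matter for an ONNX operator error. This is a minimal sketch using only the standard library. The function names and the choice of the `olive` and `onnx` prefixes are assumptions for illustration, and the sample text is a small excerpt of the list above.

```python
def parse_package_list(text):
    """Parse 'name version' lines into a dict, skipping headers and other lines."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # e.g. the "Using Python ... environment at: ." banner
        name, version = parts
        if name == "Package" or set(name) == {"-"}:
            continue  # column header or dashed separator row
        packages[name.lower()] = version
    return packages


def select_by_prefix(packages, prefixes=("olive", "onnx")):
    """Keep only packages whose name starts with one of the given prefixes."""
    return {name: ver for name, ver in packages.items() if name.startswith(prefixes)}


# Excerpt of the environment listing from this report.
sample = """Using Python 3.12.9 environment at: .
Package Version
numpy 2.2.4
olive-ai 0.9.3
onnx 1.17.0
onnx-ir 0.1.5
onnxruntime-genai-cuda 0.9.0
onnxruntime-gpu 1.23.2
onnxscript 0.3.2
"""

relevant = select_by_prefix(parse_package_list(sample))
for name, version in relevant.items():
    print(f"{name}=={version}")
```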