Add TQ2_0 quantization support to whisper.cpp #2
+6
−2
GGML supports TQ2_0 quantization, but the whisper.cpp quantize tool and model loader did not expose it.
Changes
ggml/include/ggml.h
- `GGML_FTYPE_MOSTLY_TQ2_0 = 26` enum value

examples/common-ggml.cpp
- `"tq2_0"` string to `GGML_FTYPE_MOSTLY_TQ2_0`
- `'t'` prefix in `ggml_parse_ftype()` (alongside existing `'q'`)
- `GGML_FTYPE_MOSTLY_TQ2_0` → `GGML_TYPE_TQ2_0` mapping
- `GGML_TYPE_TQ2_0` to supported quantization types

ggml/src/ggml.c
- `GGML_FTYPE_MOSTLY_TQ2_0` case to `ggml_ftype_to_ggml_type()`

Usage

```
./build/bin/quantize model-f32.bin model-tq2_0.bin tq2_0
# or
./build/bin/quantize model-f32.bin model-tq2_0.bin 26
```
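The two invocation forms (by name and by number) and the ftype-to-type mapping can be illustrated with a small standalone sketch. This is not the actual whisper.cpp code: the `sketch_*` names and the enums are hypothetical stand-ins, and only the value `26`, the `"tq2_0"` string, and the `'q'`/`'t'` prefix check are taken from the description above. The assumption is that a name starting with `'q'` or `'t'` is looked up in a table, while anything else is treated as a numeric ftype.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-ins for the ggml enums (not the real definitions).
// Only the value 26 for TQ2_0 comes from the PR description.
enum sketch_ftype : int {
    SKETCH_FTYPE_UNKNOWN      = -1,
    SKETCH_FTYPE_MOSTLY_TQ2_0 = 26,
};

enum sketch_type : int {
    SKETCH_TYPE_UNKNOWN = 0,
    SKETCH_TYPE_TQ2_0   = 1, // placeholder value, chosen for this sketch only
};

// Name table; a real table would also hold the existing "q..." names.
static const std::map<std::string, sketch_ftype> SKETCH_FTYPE_MAP = {
    {"tq2_0", SKETCH_FTYPE_MOSTLY_TQ2_0},
};

// Parse either a type name ("tq2_0") or a numeric ftype ("26").
sketch_ftype sketch_parse_ftype(const char * str) {
    assert(str != nullptr);
    if (str[0] == 'q' || str[0] == 't') {
        // Names are resolved through the table; unknown names are rejected.
        const auto it = SKETCH_FTYPE_MAP.find(str);
        return it == SKETCH_FTYPE_MAP.end() ? SKETCH_FTYPE_UNKNOWN : it->second;
    }
    // Anything else is interpreted as a number.
    return static_cast<sketch_ftype>(std::atoi(str));
}

// Map the file-level ftype to the tensor type used for quantization.
sketch_type sketch_ftype_to_type(sketch_ftype ftype) {
    switch (ftype) {
        case SKETCH_FTYPE_MOSTLY_TQ2_0: return SKETCH_TYPE_TQ2_0;
        default:                        return SKETCH_TYPE_UNKNOWN;
    }
}
```

Under these assumptions, `sketch_parse_ftype("tq2_0")` and `sketch_parse_ftype("26")` resolve to the same ftype, which is why both command lines in the usage section are equivalent. Without the `'t'` prefix check, `"tq2_0"` would fall through to the numeric branch and be parsed as `0`.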