@tscholak
Collaborator

Summary

  • Add README.md documenting the algebraic structure of the conversion system (surgery monoid, action law, plan composition, total vs partial operations)
  • Add example surgery files for pruning a homogeneous supernet to a heterogeneous network
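
As a rough illustration of the "surgery monoid" and "action law" terms used above, here is a toy sketch. It assumes a surgery can be modeled as a flat dictionary of config overrides. That is a simplification for illustration only and not the representation used in the Apriel2 conversion code; the names `compose`, `apply`, and `IDENTITY` and the config keys are hypothetical.

```python
# Toy model only: a "surgery" is a flat dict of overrides (an assumption,
# not the actual Apriel2 surgery format).
IDENTITY = {}  # the empty surgery changes nothing


def compose(first, second):
    """Monoid operation: apply `first`, then `second` (later keys win)."""
    return {**first, **second}


def apply(config, surgery):
    """Action of a surgery on a config: override the keys the surgery names."""
    return {**config, **surgery}


# Hypothetical config and surgeries, chosen only to exercise the laws.
config = {"mixer": "attention", "hidden_size": 1024}
s1 = {"mixer": "stochastic"}
s2 = {"mixer": "gdn", "window": 4096}
s3 = {"window": 2048}

# Identity law: acting with the empty surgery leaves the config unchanged.
assert apply(config, IDENTITY) == config

# Associativity: grouping of composed surgeries does not matter.
assert compose(compose(s1, s2), s3) == compose(s1, compose(s2, s3))

# Action law: applying s1 then s2 equals applying the composed surgery once.
assert apply(apply(config, s1), s2) == apply(config, compose(s1, s2))
```

A flat, right-biased merge is used here because it is associative by construction; a recursive deep merge would need more care to satisfy the same laws.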

Test plan

  • Verified two-step pruning workflow with ServiceNow-AI/apriel2-0.5b-dev
  • Confirmed weight transfer correctness for all 4 mixer types (attention, gdn, kda, sliding_window)
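
The two-step workflow can be pictured with the toy sketch below. It assumes, as the file descriptions in this PR suggest, that step 1 designates a main mixer per layer and step 2 replaces each stochastic mixer with that designated sub-mixer and its weights. The data layout and the function names (`make_supernet`, `step1_select`, `step2_unwrap`) are hypothetical and do not reflect the real YAML schema or conversion API.

```python
# Toy model of the two-step pruning workflow; all structures are hypothetical.
MIXERS = ("attention", "gdn", "kda", "sliding_window")


def make_supernet(num_layers):
    """Homogeneous supernet: every layer carries all candidate mixers."""
    return [
        {
            "type": "stochastic",
            "main_mixer_name": "attention",
            # Strings stand in for the real weight tensors.
            "mixers": {name: f"weights[{i}].{name}" for name in MIXERS},
        }
        for i in range(num_layers)
    ]


def step1_select(supernet, choices):
    """Step 1: record which sub-mixer each layer should keep."""
    if len(choices) != len(supernet):
        raise ValueError("need exactly one mixer choice per layer")
    return [
        {**layer, "main_mixer_name": choice}
        for layer, choice in zip(supernet, choices)
    ]


def step2_unwrap(selected):
    """Step 2: replace each stochastic mixer by its designated sub-mixer."""
    return [
        {
            "type": layer["main_mixer_name"],
            "weights": layer["mixers"][layer["main_mixer_name"]],
        }
        for layer in selected
    ]


# Prune a 4-layer supernet to one mixer type per layer.
pruned = step2_unwrap(step1_select(make_supernet(4), list(MIXERS)))
assert [layer["type"] for layer in pruned] == list(MIXERS)
```

Splitting selection from unwrapping keeps each step a simple, checkable transformation: the first only edits metadata, the second only drops the unused sub-mixers.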

🤖 Generated with Claude Code

- Add README.md documenting the algebraic structure of the conversion system
  (surgery monoid, action law, plan composition, total vs partial operations)
- Add prune_supernet_step1.yaml and prune_supernet_step2.yaml examples
  demonstrating the two-step workflow for pruning a homogeneous supernet
  to a heterogeneous network with different mixer types per layer

Co-Authored-By: Claude Opus 4.5 <[email protected]>

Copilot AI left a comment

Pull request overview

This PR adds comprehensive documentation and practical examples for the Apriel2 conversion system, focusing on the algebraic structure of the surgery/conversion framework and demonstrating a two-step workflow for pruning supernets.

Changes:

  • Added a detailed README documenting the algebraic structure of the conversion system (monoid action, plan composition, total vs partial operations)
  • Added two example YAML files demonstrating supernet pruning through a two-step surgery workflow

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| fast_llm_external_models/apriel2/conversion/README.md | Documentation of the algebraic structure, including the surgery monoid, action laws, plan composition, and functoriality, with practical examples for supernet creation and pruning |
| fast_llm_external_models/apriel2/examples/prune_supernet_step1.yaml | First step of supernet pruning: converts the fixed decoder to a pattern decoder and sets main_mixer_name per block type |
| fast_llm_external_models/apriel2/examples/prune_supernet_step2.yaml | Second step of supernet pruning: unwraps stochastic mixers to non-stochastic types using weights from the designated sub-mixers |

