hf skills post with codex training llms #3224
merveenoyan
left a comment
very cool!
hf-skills-training-codex.md
Outdated
| ``` | ||
| Start a new fine-tuning experiment to improve code solving abilities on using SFT. | ||
| - Maintain a report for the experiment. | ||
| - Evaluate models the openai_humaneval benchmark |
very cool that you can tell it's a benchmark without giving dataset id directly
pcuenca
left a comment
Very cool! 🔥
| Codex runs a quick inspection on CPU (fractions of a penny) and reports: | ||
| ``` | ||
| Dataset validation for my-org/conversation-data: |
should this be open-r1/codeforces-cots?
Co-authored-by: Pedro Cuenca <[email protected]> Co-authored-by: Merve Noyan <[email protected]>
@merveenoyan @pcuenca thanks for the amazing feedback! I've replied to all the comments and would like to merge today. 🙏
merveenoyan
left a comment
Looks good to me with a single comment🙌🏼
This PR builds on the Claude Code post by using Codex in a slightly more complex way.