@burtenshaw
Collaborator

This PR builds on the Claude Code post by using Codex in a slightly more complex way.

Contributor

@merveenoyan merveenoyan left a comment

very cool!

```
Start a new fine-tuning experiment to improve code solving abilities on using SFT.
- Maintain a report for the experiment.
- Evaluate models on the openai_humaneval benchmark
Contributor

Very cool that you can tell it's a benchmark without giving the dataset ID directly.

Member

@pcuenca pcuenca left a comment

Very cool! 🔥

Codex runs a quick inspection on CPU (fractions of a penny) and reports:

```
Dataset validation for my-org/conversation-data:
Member

should this be open-r1/codeforces-cots?

burtenshaw and others added 2 commits December 10, 2025 15:34
@burtenshaw
Collaborator Author

@merveenoyan @pcuenca thanks for the amazing feedback! I've replied to all the comments and would like to merge today. 🙏

Contributor

@merveenoyan merveenoyan left a comment

Looks good to me with a single comment 🙌🏼

@burtenshaw burtenshaw merged commit b5e4f48 into main Dec 11, 2025
1 check passed
@burtenshaw burtenshaw deleted the hf-skills-codex branch December 11, 2025 12:53