-
Notifications
You must be signed in to change notification settings - Fork 1.3k
feat: opt-in C parser #10942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: opt-in C parser #10942
Conversation
Codecov Reportβ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #10942 +/- ##
==========================================
+ Coverage 90.68% 90.93% +0.24%
==========================================
Files 504 505 +1
Lines 39795 41220 +1425
Branches 3141 3257 +116
==========================================
+ Hits 36087 37482 +1395
- Misses 3042 3095 +53
+ Partials 666 643 -23 β View full report in Codecov by Sentry. π New features to boost your workflow:
|
|
You can get it using
Letβs keep it simple and behind a feature-flag. This feels like a step backward (moving from 1.2 -> 1.1), and Iβm fine explicitly calling it experimental. I donβt expect this to ever become stable, it's just an escape hatch when things are slow. We should document that itβs YAML 1.1 and may have compatibility issues. |
|
Some benchmark numbers on a real dvc.lock file:
I'd love to have a 1.2-compatible, fast parser as an option, and the ~30% speedup over |
f2b12d4 to
ddffe5c
Compare
| with modify_yaml(self.path, fs=self.repo.fs) as data: | ||
| if not data: | ||
| data.update({"schema": "2.0"}) | ||
| # order is important, meta should always be at the top |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: restore comments
I realized my comment here was a bit silly, since I doubt the few patches in this PR are enough to make it 1.2-compatible π |
ddffe5c to
c105460
Compare
β I have followed the Contributing to DVC checklist.
π If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. π
Overview
Adds a new
fast_yamlopt-in config that uses PyYAML's C parser for code paths that don't need round-trip (e.g. to preserve comments). Makes things significantly faster.Mostly LLM-generated, especially the extra stuff to make the C Parser respect YAML 1.2 π€£ . Could drop that extra stuff, since it's opt-in, switch to a different, fast parser, or just keep it?