-
Notifications
You must be signed in to change notification settings - Fork 810
Enhances multi-document YAML parsing error by triggering specific error (#15117) #18652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| } | ||
|
|
||
| // If single document or empty, proceed with normal deserialization | ||
| return new Serializer().Deserialize(fileContent) is { } deserialized ? JToken.FromObject(deserialized) : null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if the types are compatible, but could you use yamlStream.Documents.SingleOrDefault() here instead to avoid serializing twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could reuse the YamlStream document to avoid a second parse. I kept the serializer intentionally since the stream exposes AST node types, and JToken.FromObject relies on the POCO shape from Serializer.Deserialize()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if this is more than you'd like to take on, but it'd probably be cheaper overall (in terms of computation, not in terms of developer effort) to add a helper method that transforms a SharpYaml YamlDocument into a Newtonsoft JOject (assuming the two are roughly compatible, which they should be since yml is ~isomorphic to JSON).
The end result here is a JToken, and going from yml to POCOs to JObjects is already doing twice as much work as we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can give this a try. Just to confirm: switching to a YamlDocument-to-JToken conversion means we’d stop using Serializer.Deserialize(...) entirely and skip the POCO step. The new path would be: parse YAML into a YamlDocument, walk it, and build a JToken directly.
Are we okay removing the existing Serializer.Deserialize path and moving fully to this new approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve pushed the multi-document YAML detection and AST → JToken conversion changes
to this PR (commit 788636e).
This removes Serializer.Deserialize entirely and adds explicit BCP445 for multi-document YAML.
| yamlStream.Load(reader); | ||
| } | ||
|
|
||
| if (yamlStream.Documents.Count > 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why not. loadJsonContent doesn't require that the target file be an object.
I'd say that it's kinda weird that transforming:
abc: defto
abc: def
---
ghi: jklwould change the output of loading the file from an object to an array, but that's just because yaml is weird.
| { | ||
| return new Serializer().Deserialize(fileContent) is { } deserialized ? JToken.FromObject(deserialized) : null; | ||
| // Check for multi-document YAML using YamlStream | ||
| var yamlStream = new YamlStream(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created a PR to simplify this API - this should make this refactoring easier: #18713
test.bicep
Outdated
| @@ -0,0 +1,3 @@ | |||
| var config = loadYamlContent('test.yaml') | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like these files were checked in accidentally - could you remove them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, I removed the local test files.
Description
Fixes parsing errors for multi-document YAML files by using YamlStream to load the YAML and explicitly check if the number of documents is greater than one. If multiple documents are detected, it triggers a multi-document specific error (BCP442) with the message: “Multi-document YAML files are not supported. Please use a single YAML document.”
This improves upon the previous generic BCP340 error, providing a more specific and actionable message for multi-document YAML files.
The change also adds a unit test in ObjectDeserializationTests.cs to verify that multi-document YAML triggers BCP442 and that no parsing token is returned.
Relates to #15117.what
Example Usage
Checklist
Microsoft Reviewers: Open in CodeFlow