fix(splunk_hec): validate and flatten fields #24292
Open
+112
−18
Summary
The splunk_hec sink now flattens objects and arrays. Presently Vector sends nested objects and arrays that the Splunk HEC API rejects. Integers are also silently dropped. Due to this behaviour, any indexed fields that are not at the top level of the Vector log event cannot be referenced.
Flattening is preferable to silently dropping nested objects and arrays as there's no data loss. It is also particularly useful for indexing fields whose names are not known until runtime, for example Kubernetes pod labels.
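To illustrate the idea, here is a minimal sketch of flattening nested fields into top-level keys. It is not the code from this PR: the `Value` enum is a simplified stand-in for Vector's event value type, the function names are hypothetical, and the dot separator with array indices as path segments is an assumption made for the example.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a log event value (hypothetical, not Vector's own type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Integer(i64),
    Float(f64),
    Text(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

/// Joins a path prefix and a segment with '.' (assumed separator).
fn join(prefix: &str, segment: &str) -> String {
    if prefix.is_empty() {
        segment.to_string()
    } else {
        format!("{}.{}", prefix, segment)
    }
}

/// Recursively walks `value`, inserting every leaf into `out` under its full path.
/// Object keys and array indices both become path segments.
fn flatten_into(prefix: &str, value: &Value, out: &mut BTreeMap<String, Value>) {
    match value {
        Value::Object(map) => {
            for (key, child) in map {
                flatten_into(&join(prefix, key), child, out);
            }
        }
        Value::Array(items) => {
            for (index, child) in items.iter().enumerate() {
                flatten_into(&join(prefix, &index.to_string()), child, out);
            }
        }
        leaf => {
            out.insert(prefix.to_string(), leaf.clone());
        }
    }
}

/// Flattens a map of fields so that every value is a leaf at the top level.
fn flatten(fields: &BTreeMap<String, Value>) -> BTreeMap<String, Value> {
    let mut out = BTreeMap::new();
    for (key, value) in fields {
        flatten_into(key, value, &mut out);
    }
    out
}
```

With this sketch, a field `pod` holding `{ labels: { app: "web" } }` would end up as the single top-level key `pod.labels.app`. Note that empty objects and arrays produce no keys at all in this version.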
I'd also just like to preface that I am a Rust novice and this is my first PR, so apologies if I have missed the mark here.
Vector configuration
How did you test this PR?
Tested against my own Splunk environment with the above config.
Verified the valid field inputs using a local instance of Splunk. Although the Splunk documentation states only strings and string arrays are valid, I found the following types were accepted:
Integers are silently dropped from the fields.
Nested arrays or objects resulted in the event being rejected (400 response code).
I've maintained the previous Vector behaviour for all the existing working types (string, float, boolean, null), i.e. they do not get stringified. Integers, however, are now stringified so they do not get dropped.
Note: Splunk itself has no types once the event has been indexed so there is no side effect of stringifying.
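The type handling described above could be sketched roughly as follows. This is an illustration only, not the PR's implementation: the `Leaf` enum and `normalize` function are hypothetical names, and the sketch covers only values that are already leaves (no arrays or objects).

```rust
/// Hypothetical leaf value of a field after flattening.
#[derive(Debug, Clone, PartialEq)]
enum Leaf {
    Null,
    Bool(bool),
    Integer(i64),
    Float(f64),
    Text(String),
}

/// Stringifies integers so they are not dropped; every other type passes through unchanged.
fn normalize(leaf: Leaf) -> Leaf {
    match leaf {
        Leaf::Integer(n) => Leaf::Text(n.to_string()),
        other => other,
    }
}
```

Only the integer arm changes the value, which matches the intent of leaving the already-working types untouched.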
I've also added additional test cases for various types.
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changelog label to this PR.
References
Notes
@vectordotdev/vector to reach out to us regarding this PR.
pre-push hook, please see this template.
make fmt
make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
make test
git merge origin master and git push.
Cargo.lock), please run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.