-
Notifications
You must be signed in to change notification settings - Fork 1.9k
chore(performance): EventMetadata UUID generation optimizations #24345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: refactor-proto-event-metadata
Are you sure you want to change the base?
chore(performance): EventMetadata UUID generation optimizations #24345
Conversation
v7 are created using a timestamp so they can be ordered, however this comes at a performance cost. We currently don't need to order these UUID's, so for now we can use v4.
|
I have read the CLA Document and I hereby sign the CLA You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| dropped_fields: ObjectMap::new(), | ||
| datadog_origin_metadata: None, | ||
| source_event_id: Some(Uuid::now_v7()), | ||
| source_event_id: Some(Uuid::new_v4()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Align source_event_id test with v4 UUID default
Default metadata generation now uses Uuid::new_v4() to populate source_event_id (metadata.rs line 256), but the metadata_set_unique_uuid_v7_source_event_id unit test still asserts the auto-generated ID is a v7 (SortRand) UUID in lib/vector-core/src/event/log_event.rs. Running the test suite will now fail because new events return v4 UUIDs, so either revert the default or update the test/consumers to accept v4.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semi valid comment. Just need to update the test name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tracing-subscriber = { version = "0.3.20", default-features = false, features = ["fmt"] } | ||
| url = { version = "2.5.4", default-features = false, features = ["serde"] } | ||
| uuid = { version = "1.18.1", features = ["v4", "v7", "serde"] } | ||
| uuid = { version = "1.18.1", features = ["v4", "v7", "serde", "fast-rng"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove "v7" from the features list here now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait the Websocket server sink still uses v7.. it compiles because v7 is enabled in transitive dependencies eg. vrl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to add it back because it seems like it actually needs the v7 generation there for time ordered replays and we shouldn't rely on transitive dependencies to enable the feature
| tracing-subscriber = { version = "0.3.20", default-features = false, features = ["fmt"] } | ||
| url = { version = "2.5.4", default-features = false, features = ["serde"] } | ||
| uuid = { version = "1.18.1", features = ["v4", "v7", "serde"] } | ||
| uuid = { version = "1.18.1", features = ["v4", "v7", "serde", "fast-rng"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/uuid-rs/uuid/blob/7527cef15f39fe493a92bda19d28eebec2c73ebf/src/lib.rs#L109-L111
Should be part of the crate's default features IMO...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess it is isn't because it requires bringing in more dependencies, and if your not generating UUIDs on a hot path you wouldn't be caring about the internal lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH I wonder how many CPU cycles are wasted because people never included this dependency/feature lol. Pretty much any decently sized project will already depend on rand
Either way it's a very nice addition
[Websocket server sink uses it](https://github.com/vectordotdev/vector/blob/72e09673fda9d6fbf933adacea1220bdfae162a8/src/sinks/websocket_server/buffering.rs#L235) for time ordered replays in case the connection drops
Summary
This PR fixes some severe performance issues which were discovered when load testing and profiling the disk buffer v2 implementation. UUID's were added to event metadata in #21074, and while regression tests were run on that branch before merging, there weren't any regression tests which test disk buffer performance. While adding UUID's to events is inconsequential in most pipeline configurations, disk buffers were disproportionately affected by this change due to events being serialized to/deserialized from the buffer, which lead to an additional (immediately overwritten) UUID being generated per event upon deserialization (see fix in #24336). This PR enables the
fast-rngfeature inUUIDwhich speeds up generation significantly, and switches the generator to v4 which further speeds up generation by foregoing v7's timestamp ordering.All of these changes combined lead to a ~40% increase in bandwidth through a multi-threaded disk buffer load test with identical test settings.
Finally, this PR adds a regression test for disk buffers so we can test for disk buffer performance regressions in future contributions.
Flame graph of a thread before UUID optimizations:

After UUID Optimizations:

Vector configuration
How did you test this PR?
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changeloglabel to this PR.References
Related: #24336
Notes
@vectordotdev/vectorto reach out to us regarding this PR.pre-pushhook, please see this template.make fmtmake check-clippy(if there are failures it's possible some of them can be fixed withmake clippy-fix)make testgit merge origin masterandgit push.Cargo.lock), pleaserun
make build-licensesto regenerate the license inventory and commit the changes (if any). More details here.