buffer allocators #6166

onursatici · 2026-01-27T16:09:14Z

This PR introduces buffer allocators to the scan. The caller can provide its own allocator implementation so that the scan reads directly into device buffer handles.

TODO:

device buffer alignment

Example CUDA scan using this:

let session = VortexSession::default();
let allocator = Arc::new(HostToDeviceAllocator::from_session(&session)?);

let file = session
    .open_options()
    .with_allocator(allocator)
    .open_path("data.vortex")
    .await?;

file.scan()?.into_array_stream()?
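
For readers without the diff open, here is a minimal host-only sketch of what a caller-provided allocator could look like, including the alignment concern from the TODO. The names `BufferAllocator`, `AlignedBuf`, and `HostAllocator` are hypothetical and are not taken from the PR; a device allocator would hand back device memory rather than a `Vec<u8>`.

```rust
/// Hypothetical host buffer whose visible region starts at an aligned address.
struct AlignedBuf {
    storage: Vec<u8>, // over-allocated backing memory
    offset: usize,    // index of the first aligned byte within `storage`
    len: usize,       // number of usable bytes
}

impl AlignedBuf {
    fn new(len: usize, alignment: usize) -> Self {
        assert!(alignment.is_power_of_two(), "alignment must be a power of two");
        // Over-allocate so that an aligned start always fits inside `storage`.
        let storage = vec![0u8; len + alignment];
        let addr = storage.as_ptr() as usize;
        let offset = (alignment - addr % alignment) % alignment;
        Self { storage, offset, len }
    }

    fn as_slice(&self) -> &[u8] {
        &self.storage[self.offset..self.offset + self.len]
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.storage[self.offset..self.offset + self.len]
    }
}

/// Hypothetical allocator interface (an assumption, not the PR's trait).
trait BufferAllocator: Send + Sync {
    /// Returns a zeroed buffer of `len` bytes aligned to `alignment`.
    fn allocate(&self, len: usize, alignment: usize) -> AlignedBuf;
}

/// Allocator that only ever hands out host memory.
struct HostAllocator;

impl BufferAllocator for HostAllocator {
    fn allocate(&self, len: usize, alignment: usize) -> AlignedBuf {
        AlignedBuf::new(len, alignment)
    }
}
```

The sketch over-allocates by `alignment` bytes and offsets into the vector, which keeps it in safe Rust at the cost of some wasted memory.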

joseph-isaacs · 2026-01-27T16:17:46Z

vortex-io/src/write_target.rs

+use vortex_error::VortexResult;
+
+/// A destination for I/O reads that can be finalized into a [`BufferHandle`].
+pub trait WriteTarget: Send + 'static {


write destination
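
As a reading aid for the hunk above, the following sketch shows one way a write target that is "finalized into a `BufferHandle`" might be shaped. Only the trait name and its `Send + 'static` bound come from the diff; the methods, the `BufferHandle` enum, `HostTarget`, and `read_into` are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the `BufferHandle` named in the doc comment.
#[derive(Debug, PartialEq)]
enum BufferHandle {
    Host(Vec<u8>),
}

/// Same name and bound as in the diff; both methods are assumptions.
trait WriteTarget: Send + 'static {
    /// Mutable view that the I/O layer fills with the bytes it reads.
    fn as_mut_slice(&mut self) -> &mut [u8];
    /// Consumes the target once the read has completed.
    fn finish(self: Box<Self>) -> BufferHandle;
}

/// Host-memory target backed by a plain vector.
struct HostTarget(Vec<u8>);

impl WriteTarget for HostTarget {
    fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.0
    }

    fn finish(self: Box<Self>) -> BufferHandle {
        let HostTarget(bytes) = *self;
        BufferHandle::Host(bytes)
    }
}

/// Simulates a read by copying `src` into the target, then finalizing it.
/// Panics if `src` and the target differ in length.
fn read_into(mut target: Box<dyn WriteTarget>, src: &[u8]) -> BufferHandle {
    target.as_mut_slice().copy_from_slice(src);
    target.finish()
}
```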

joseph-isaacs · 2026-01-27T16:19:17Z

vortex-file/src/open.rs

@@ -326,6 +343,22 @@ mod tests {
            self.inner.read_at(offset, length, alignment)
        }


joseph-isaacs · 2026-01-27T16:25:56Z

vortex-io/src/write_target.rs

+use vortex_error::VortexResult;
+
+/// A destination for I/O reads that can be finalized into a [`BufferHandle`].
+pub trait WriteTarget: Send + 'static {


Is this a CPU-only thing?

onursatici · 2026-01-27T17:47:22Z

vortex-io/src/read.rs

    }
 }

+impl<T: VortexReadAt + Clone> VortexReadAt for AllocatingReadAt<T> {


Normally we would not have this wrapper; we would push the allocators down to the leaf ReadAt impls instead. It is here to keep the diff small and to get the device copy working when a device allocator is used.
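
The wrapper approach can be sketched roughly as follows, using simplified synchronous stand-ins. Only the `AllocatingReadAt` name comes from the diff; `ReadAt`, `Allocator`, `Buffer`, `InMemory`, and `FakeDeviceAllocator` are hypothetical, and the real `VortexReadAt` also takes an alignment argument.

```rust
use std::sync::Arc;

/// Where the bytes of a completed read live. `Device` only stands in
/// for a real device buffer in this sketch.
#[derive(Debug, PartialEq)]
enum Buffer {
    Host(Vec<u8>),
    Device(Vec<u8>),
}

/// Hypothetical simplified read trait.
trait ReadAt {
    fn read_at(&self, offset: usize, length: usize) -> Buffer;
}

/// Hypothetical allocator: decides where host bytes end up.
trait Allocator: Send + Sync {
    fn place(&self, bytes: Vec<u8>) -> Buffer;
}

/// Leaf reader that knows nothing about allocators.
struct InMemory(Vec<u8>);

impl ReadAt for InMemory {
    fn read_at(&self, offset: usize, length: usize) -> Buffer {
        Buffer::Host(self.0[offset..offset + length].to_vec())
    }
}

/// Wrapper that re-places the inner reader's host bytes via the allocator.
struct AllocatingReadAt<T> {
    inner: T,
    allocator: Arc<dyn Allocator>,
}

impl<T: ReadAt> ReadAt for AllocatingReadAt<T> {
    fn read_at(&self, offset: usize, length: usize) -> Buffer {
        match self.inner.read_at(offset, length) {
            // The extra hop: host bytes are handed to the allocator afterwards.
            Buffer::Host(bytes) => self.allocator.place(bytes),
            other => other,
        }
    }
}

/// Pretends to copy the bytes to a device.
struct FakeDeviceAllocator;

impl Allocator for FakeDeviceAllocator {
    fn place(&self, bytes: Vec<u8>) -> Buffer {
        Buffer::Device(bytes)
    }
}
```

In this sketch the leaf reader always materializes host bytes first, which is the extra step that pushing the allocator into the leaf impls would avoid.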

codspeed-hq · 2026-01-27T17:50:31Z

CodSpeed Performance Report

Merging this PR will degrade performance by 33.09%

Comparing os/gpu-scan (d665f60) with develop (176e340)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

⚡ 8 improved benchmarks
❌ 38 regressed benchmarks
✅ 1219 untouched benchmarks
⏩ 1219 skipped benchmarks¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	WallTime	`u8_FoR[10M]`	6.4 µs	9.6 µs	-33.09%
❌	Simulation	`bench_compare_primitive[(10000, 128)]`	165.9 µs	184.8 µs	-10.24%
❌	Simulation	`bench_compare_primitive[(10000, 2)]`	162 µs	180.9 µs	-10.47%
❌	Simulation	`bench_compare_primitive[(10000, 32)]`	162.9 µs	181.8 µs	-10.42%
❌	Simulation	`bench_compare_primitive[(10000, 4)]`	161.3 µs	180.3 µs	-10.52%
❌	Simulation	`bench_compare_primitive[(100000, 128)]`	903.7 µs	1,094.1 µs	-17.4%
❌	Simulation	`bench_compare_primitive[(10000, 8)]`	161.7 µs	180.6 µs	-10.49%
❌	Simulation	`bench_compare_primitive[(100000, 2)]`	899.4 µs	1,089.9 µs	-17.47%
❌	Simulation	`bench_compare_primitive[(100000, 2048)]`	1 ms	1.2 ms	-15.96%
❌	Simulation	`bench_compare_primitive[(100000, 4)]`	899.6 µs	1,090 µs	-17.47%
❌	Simulation	`bench_compare_primitive[(100000, 32)]`	901.4 µs	1,091.8 µs	-17.44%
❌	Simulation	`bench_compare_primitive[(100000, 512)]`	961.6 µs	1,152 µs	-16.53%
❌	Simulation	`bench_compare_primitive[(100000, 8)]`	900 µs	1,090.4 µs	-17.46%
❌	Simulation	`bench_compare_varbin[(10000, 32)]`	170.7 µs	190.3 µs	-10.3%
❌	Simulation	`bench_compare_varbin[(10000, 2)]`	166.3 µs	185.9 µs	-10.53%
❌	Simulation	`bench_compare_varbin[(10000, 4)]`	166.8 µs	186.4 µs	-10.52%
❌	Simulation	`bench_compare_varbin[(10000, 8)]`	167.2 µs	186.8 µs	-10.49%
❌	Simulation	`bench_compare_varbin[(100000, 128)]`	921 µs	1,112 µs	-17.18%
❌	Simulation	`bench_compare_varbin[(100000, 2)]`	904.1 µs	1,095.2 µs	-17.45%
❌	Simulation	`bench_compare_varbin[(100000, 2048)]`	1.2 ms	1.4 ms	-13.53%
...	...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

¹ 1219 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived on CodSpeed to remove them from the performance reports.

joseph-isaacs reviewed Jan 27, 2026

Base automatically changed from os/align to develop January 27, 2026 16:48

onursatici and others added 3 commits January 27, 2026 17:43

allocators

11351bf

Signed-off-by: Onur Satici <[email protected]>

no pools yet

5764ffe

Signed-off-by: Onur Satici <[email protected]>

device buffer alignment + read region

613e21c

Signed-off-by: Onur Satici <[email protected]>

onursatici force-pushed the os/gpu-scan branch from 029ee85 to 613e21c January 27, 2026 17:44

onursatici commented Jan 27, 2026

rename

d665f60

Signed-off-by: Onur Satici <[email protected]>

joseph-isaacs reviewed Jan 28, 2026

vortex-cuda/src/device_buffer.rs (outdated)

update

069d998

Signed-off-by: Joe Isaacs <[email protected]>

onursatici closed this Feb 2, 2026
