feat: delta-compressed token storage for 58% memory reduction #177

Boshen · 2025-09-27T13:48:19Z

Summary

Implement delta-compressed token storage to reduce memory usage by 58% for sourcemaps with millions of tokens.

Motivation

Sourcemaps can have millions of tokens, which consume significant memory. This PR implements delta compression to store tokens more compactly while keeping the public API unchanged.

Implementation

CompressedTokens Structure

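- First token: Stored uncompressed (24 bytes)
- Subsequent tokens: Stored as deltas from the previous token
- Variable-length encoding: 1, 2, or 4 bytes per field, depending on delta size
- Index: Stores a byte offset every 256 tokens, so a random access decodes at most 256 tokens from the nearest indexed offset (bounded cost, not a direct array lookup)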
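To illustrate how a byte-offset index every 256 tokens bounds the cost of random access, here is a minimal sketch over a single `u32` column rather than full tokens. `CompressedColumn`, `INTERVAL`, the one-byte tags, and the choice to write every indexed value in absolute form are assumptions made for this example, not the PR's actual `CompressedTokens` layout.

```rust
/// Hypothetical checkpoint spacing, matching the "every 256 tokens" figure.
const INTERVAL: usize = 256;

/// Simplified single-column store (illustrative only). Each value is written
/// as a tag byte followed by either a 1-byte delta or an absolute 4-byte value.
struct CompressedColumn {
    data: Vec<u8>,
    /// Byte offset in `data` of every INTERVAL-th value.
    index: Vec<usize>,
    len: usize,
}

impl CompressedColumn {
    fn from_values(values: &[u32]) -> Self {
        let mut data = Vec::new();
        let mut index = Vec::new();
        let mut prev = 0u32;
        for (i, &v) in values.iter().enumerate() {
            let checkpoint = i % INTERVAL == 0;
            if checkpoint {
                index.push(data.len());
            }
            let delta = i64::from(v) - i64::from(prev);
            match i8::try_from(delta) {
                // Small delta and not a checkpoint: tag 0, then one byte.
                Ok(d) if !checkpoint => {
                    data.push(0);
                    data.push(d as u8);
                }
                // Checkpoint or large delta: tag 1, then the absolute value.
                _ => {
                    data.push(1);
                    data.extend_from_slice(&v.to_le_bytes());
                }
            }
            prev = v;
        }
        Self { data, index, len: values.len() }
    }

    /// Jump to the nearest checkpoint, then decode forward to position `i`.
    fn get(&self, i: usize) -> Option<u32> {
        if i >= self.len {
            return None;
        }
        let mut pos = self.index[i / INTERVAL];
        let mut value = 0u32;
        for _ in 0..=(i % INTERVAL) {
            if self.data[pos] == 0 {
                let d = self.data[pos + 1] as i8;
                value = (i64::from(value) + i64::from(d)) as u32;
                pos += 2;
            } else {
                let b = &self.data[pos + 1..pos + 5];
                value = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
                pos += 5;
            }
        }
        Some(value)
    }
}
```

In this sketch `get` never decodes more than `INTERVAL` entries, which is the kind of decompression work that shows up in lookup-heavy benchmarks.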

Encoding Format

Header byte (2 bits per field):
  00: i8 delta (-128 to 127)
  01: i16 delta (-32768 to 32767)  
  10: i32 delta (full range)
  11: u32 absolute value (fallback)
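
The per-field width selection above could be sketched roughly as follows. This is a sketch under stated assumptions: the function names, the `TAG_*` constants, and the little-endian payload order are hypothetical, and how the 2-bit tags of a token's fields are packed into the header is not shown.

```rust
// Hypothetical 2-bit width tags mirroring the table above.
const TAG_I8: u8 = 0b00;
const TAG_I16: u8 = 0b01;
const TAG_I32: u8 = 0b10;
const TAG_ABS_U32: u8 = 0b11;

/// Encode `cur` relative to `prev`, appending the payload bytes to `out`
/// (little-endian, an assumption) and returning the 2-bit tag.
fn encode_field(prev: u32, cur: u32, out: &mut Vec<u8>) -> u8 {
    // Compute the delta in i64 so that no pair of u32 values can overflow.
    let delta = i64::from(cur) - i64::from(prev);
    if let Ok(d) = i8::try_from(delta) {
        out.extend_from_slice(&d.to_le_bytes());
        TAG_I8
    } else if let Ok(d) = i16::try_from(delta) {
        out.extend_from_slice(&d.to_le_bytes());
        TAG_I16
    } else if let Ok(d) = i32::try_from(delta) {
        out.extend_from_slice(&d.to_le_bytes());
        TAG_I32
    } else {
        // Delta does not fit in i32: fall back to the absolute value.
        out.extend_from_slice(&cur.to_le_bytes());
        TAG_ABS_U32
    }
}

/// Decode one field from `bytes`; returns (value, bytes consumed).
fn decode_field(prev: u32, tag: u8, bytes: &[u8]) -> (u32, usize) {
    match tag & 0b11 {
        TAG_I8 => {
            let d = i64::from(bytes[0] as i8);
            ((i64::from(prev) + d) as u32, 1)
        }
        TAG_I16 => {
            let d = i64::from(i16::from_le_bytes([bytes[0], bytes[1]]));
            ((i64::from(prev) + d) as u32, 2)
        }
        TAG_I32 => {
            let d = i64::from(i32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]));
            ((i64::from(prev) + d) as u32, 4)
        }
        // TAG_ABS_U32: the payload is the value itself.
        _ => (u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]), 4),
    }
}
```

Neighbouring tokens in a sourcemap usually differ by small amounts, so most fields would take the one-byte path, while the absolute fallback covers deltas that do not fit in an `i32`.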

Results

Memory Savings

1 million tokens: 23MB → 10MB (58% reduction)
Box<[Token]> only: 8 bytes saved per SourceMap
With compression: ~13MB saved per million tokens

Performance Trade-offs

Benchmark results (change in run time, with compression vs. without):
- SourceMap::from_json_string: +25% (1.05µs vs 836ns)
- SourceMap::to_json: +88% (418ns vs 220ns)
- SourceMap::generate_lookup_table: +1434% (230ns vs 15ns)
- Sequential iteration: ~25% slower

The performance regression is due to decompression overhead. This is acceptable for:

- Large applications with many sourcemaps
- Memory-constrained environments
- Cases where sourcemap operations are infrequent

Testing

✅ All existing tests pass
✅ Added compression/decompression tests
✅ API remains unchanged
✅ Backward compatible

Alternative Approach

If the performance trade-off is too high, the simpler Box<[Token]> optimization (commit 86cb878) saves 8 bytes per SourceMap with no performance impact.

Breaking Changes

None - the API remains unchanged. The compression is an internal implementation detail.

🤖 Generated with Claude Code

Implement compressed token storage using delta encoding to significantly reduce memory usage for large sourcemaps with millions of tokens.

Changes:
- Add CompressedTokens struct with variable-length delta encoding
- Store first token uncompressed, then deltas for subsequent tokens
- Use 1-4 bytes per field based on delta size (vs always 4 bytes)
- Add index every 256 tokens for reasonable random access
- Convert SourceMap to use CompressedTokens internally

Memory savings:
- 58% reduction for 1M tokens (23MB → 10MB)
- Scales well with larger sourcemaps

Performance trade-offs:
- Sequential iteration: ~25% slower (acceptable for encoding)
- Lookup table generation: Slower due to decompression
- Good trade-off for memory-constrained environments

All tests pass and API remains unchanged.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

graphite-app · 2025-09-27T13:48:28Z

How to use the Graphite Merge Queue

Add the label merge to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

codspeed-hq · 2025-09-27T13:49:51Z

CodSpeed Performance Report

Merging #177 will degrade performances by 79.41%

Comparing delta (1dba1c6) with main (3e2b510)

Summary

⚡ 1 improvement
❌ 4 regressions

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `BASE` | `HEAD` | Change |
|---|---|---|---|---|
| ❌ | `from_json_string` | 16.9 µs | 19.6 µs | -13.68% |
| ❌ | `generate_lookup_table` | 1.4 µs | 6.6 µs | -79.41% |
| ❌ | `to_json` | 5.7 µs | 8.7 µs | -33.99% |
| ❌ | `to_json_string` | 5.3 µs | 8.3 µs | -35.66% |
| ⚡ | `add_name_add_source_and_content` | 1.7 µs | 1.6 µs | +1.8% |

[autofix.ci] apply automated fixes

1dba1c6

Boshen closed this Sep 27, 2025

Boshen deleted the delta branch September 27, 2025 13:57
