Fixes C# benchmark test failures caused by table naming convention mismatches #4059
/update-llm-benchmark
/update-llm-benchmark
a7a0578 to 7c7e552
/update-llm-benchmark
499787a to aac04ab
/update-llm-benchmark
LLM Benchmark Results (ci-quickfix)
Compared against master branch baseline
Generated at: 2026-01-20T01:46:16.880Z
Failure Analysis
Benchmark Failure Analysis
Generated from:
Summary
Analysis
Analysis of SpacetimeDB Benchmark Test Failures: Rust and C#
Rust Failures
1. Root Causes
2. Recommendations
3. Priority
C# Failures
1. Root Causes
2. Recommendations
3. Priority
These actionable insights, with specific attention to documentation updates, can help mitigate the current benchmark failures and improve user experience in both the Rust and C# implementations of SpacetimeDB.
/update-llm-benchmark
I am going to have to force-merge this PR. It changes how the context for the LLMs is calculated: the computation now takes a mode (e.g. rust_doc vs docs) instead of just the language. The current CI check uses the ci-check defined on the PR. I have fixed that oversight in this PR, but since GitHub only runs the workflow from master, the fix cannot take effect until this merges.
/update-llm-benchmark
LLM Benchmark Results (ci-quickfix)
Compared against master branch baseline
Generated at: 2026-01-20T17:48:37.135Z
Failure Analysis
Benchmark Failure Analysis
Generated from:
Summary
Analysis
SpacetimeDB Benchmark Failure Analysis
Rust
1. Root Causes
A. Compile/Publish Errors
B. Timeout Issues
C. Other Failures
2. Recommendations
A. Update Visibility in Documentation
B. Initialization Documentation
C. Schema and Data Parity
3. Priority
C#
1. Root Causes
A. Timeout Issues
B. Other Failures
C. Publish Errors
2. Recommendations
A. Update Example Structures
B. Initialization and Seed Documentation
C. Publishing and Configuration Examples
3. Priority
Summary
Each language has similar root causes primarily related to visibility and initialization issues with structs and tables. The suggestions focus on adjusting the documentation to enhance clarity, provide better examples, and ensure correct struct accessibility, ultimately improving the robustness of the SpacetimeDB benchmarks and reducing common failure points.
/update-llm-benchmark
LLM Benchmark Results (ci-quickfix)
Compared against master branch baseline
Generated at: 2026-01-20T18:26:47.651Z
Failure Analysis
Benchmark Failure Analysis
Generated from:
Summary
Analysis
Analysis of SpacetimeDB Benchmark Failures
Rust Failures (41 total)
1. Root Causes:
2. Recommendations:
3. Priority:
C# Failures (17 total)
1. Root Causes:
2. Recommendations:
3. Priority:
By addressing these root issues and recommendations, we can enhance the usability of SpacetimeDB across both Rust and C#, thereby improving the success rate of benchmark tests and reducing developer frustration.
jdetter
left a comment
I reviewed all of the documentation changes - I didn't review the benchmark code + answer changes as I trust you've double-checked those.
All optional nits - let me know if you need another review 🙂
docs/docs/00200-core-concepts/00300-tables/00210-file-storage.md
/update-llm-benchmark
LLM Benchmark Results (ci-quickfix)
Compared against master branch baseline
Generated at: 2026-01-21T01:34:57.629Z
Failure Analysis
Benchmark Failure Analysis
Generated from:
Summary
Analysis
Analysis of SpacetimeDB Benchmark Failures
This analysis breaks down the failures encountered in the Rust and C# benchmarks of SpacetimeDB, highlighting the root causes, recommended documentation updates, and priorities.
Rust Failures
1. Root Causes
2. Recommendations
3. Priority
C# Failures
1. Root Causes
2. Recommendations
3. Priority
By implementing these specific documentation changes, SpacetimeDB can improve both its usability and reliability, addressing the issues highlighted in the benchmarks for both Rust and C#.
- Update table_name() to convert lowercase singular names to appropriate case per language (C#: PascalCase, Rust: snake_case)
- Update all spec.rs files to use table_name() instead of hardcoded names
- Update Rust task prompts to use singular table names (users → user)
- Update Rust golden answers to use singular table/struct names and accessor methods (ctx.db.users() → ctx.db.user())

This fixes the C# benchmark test failures caused by table name mismatches where the LLM generates "User" but tests query for "users".
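The case conversion described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the Lang enum shape, the table_name signature, and the to_camel helper are assumptions, and the TypeScript arm follows the camelCase convention stated in the PR description.

```rust
/// Languages covered by the benchmark (variant names assumed for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Lang {
    Rust,
    CSharp,
    TypeScript,
}

/// Convert a lowercase singular snake_case name (e.g. "user") into the
/// casing that each language's generated module is expected to use.
pub fn table_name(singular: &str, lang: Lang) -> String {
    match lang {
        // Rust keeps snake_case unchanged.
        Lang::Rust => singular.to_string(),
        // C# uses PascalCase: every word capitalized.
        Lang::CSharp => to_camel(singular, true),
        // TypeScript uses camelCase: first word lowercase, the rest capitalized.
        Lang::TypeScript => to_camel(singular, false),
    }
}

/// Hypothetical helper: join snake_case words, capitalizing each word
/// after the first, and the first one too when `capitalize_first` is set.
fn to_camel(snake: &str, capitalize_first: bool) -> String {
    let mut out = String::with_capacity(snake.len());
    for (i, word) in snake.split('_').filter(|w| !w.is_empty()).enumerate() {
        let mut chars = word.chars();
        if let Some(first) = chars.next() {
            if i > 0 || capitalize_first {
                out.extend(first.to_uppercase());
            } else {
                out.extend(first.to_lowercase());
            }
            out.push_str(chars.as_str());
        }
    }
    out
}
```

Under these assumptions, a spec would call table_name("user", lang) and get "user" for Rust and TypeScript and "User" for C#, so the test query matches whatever casing the generated module uses.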
- Add compute_processed_context_hash() for language-specific hash computation after tab filtering is applied
- Update CI check to verify both rustdoc_json and docs modes for Rust
- Fix hash-only mode to skip golden builds
- Update benchmark analysis with latest results
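To illustrate why the hash has to be computed after tab filtering, here is a rough sketch. Everything in it is an assumption made for illustration: the docs are assumed to mark language tabs with Docusaurus-style TabItem elements, the function signature is invented, and FNV-1a stands in for whatever hash the real tool uses.

```rust
/// Keep text outside <TabItem> blocks plus the contents of tabs whose
/// value matches `lang_tab`; drop the contents of other languages' tabs.
/// (Assumes each <TabItem ...> and </TabItem> tag sits on its own line.)
pub fn filter_tabs(docs: &str, lang_tab: &str) -> String {
    let wanted = format!("value=\"{lang_tab}\"");
    let mut keep = true;
    let mut out = String::new();
    for line in docs.lines() {
        let trimmed = line.trim_start();
        if trimmed.starts_with("<TabItem") {
            keep = trimmed.contains(&wanted);
            continue;
        }
        if trimmed.starts_with("</TabItem>") {
            keep = true;
            continue;
        }
        if keep {
            out.push_str(line);
            out.push('\n');
        }
    }
    out
}

/// 64-bit FNV-1a, used here only as a small deterministic stand-in hash.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical signature: hash the context as one language actually sees it.
pub fn compute_processed_context_hash(docs: &str, lang_tab: &str) -> String {
    format!("{:016x}", fnv1a_64(filter_tabs(docs, lang_tab).as_bytes()))
}
```

With this shape, an edit inside a C# tab leaves the Rust hash unchanged, so only the languages whose processed context actually changed need their benchmarks re-run.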
The llm-benchmark-update workflow builds the tool from master to save benchmark results. The CI check must also use master's tool to compute hashes the same way, otherwise hash mismatches occur when the PR branch has different hash computation logic.
- Add detection for WASI SDK tar extraction failures (wasi-sdk + tar, MSB3073 with exit code 2) as transient errors that should be retried
- Increase max retries from 2 to 3 for build commands
- Add 1-3 seconds of jitter to retry delays to desynchronize parallel builds that fail simultaneously, reducing the chance of repeated collisions
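A retry loop along these lines might look like the sketch below. It is illustrative only: the function names, the exact substrings matched, and the clock-based jitter are assumptions (a real tool would more likely draw jitter from the rand crate), and the "exited with code 2" match is a coarse substring check.

```rust
use std::thread::sleep;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Retries allowed after the first attempt (3, per the commit message).
const MAX_RETRIES: u32 = 3;

/// Treat WASI SDK tar extraction failures as transient: either the output
/// mentions "wasi-sdk" and has "tar" as a standalone token, or it reports
/// MSB3073 together with exit code 2. The matched strings are assumptions.
pub fn is_transient_build_error(output: &str) -> bool {
    let lower = output.to_lowercase();
    // Token match so that words like "target" do not count as "tar".
    let mentions_tar = lower
        .split(|c: char| !c.is_ascii_alphanumeric())
        .any(|token| token == "tar");
    let wasi_tar = lower.contains("wasi-sdk") && mentions_tar;
    let msbuild_exit_2 = lower.contains("msb3073") && lower.contains("exited with code 2");
    wasi_tar || msbuild_exit_2
}

/// Jitter between 1000 and 3000 ms, derived from the clock's sub-second
/// nanoseconds so the sketch needs no external crate.
pub fn jitter() -> Duration {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.subsec_nanos())
        .unwrap_or(0);
    Duration::from_millis(1000 + u64::from(nanos % 2001))
}

/// Run `build`, retrying up to MAX_RETRIES times when the error looks
/// transient. Non-transient errors are returned immediately.
pub fn run_with_retries<F>(mut build: F, base_delay: Duration) -> Result<String, String>
where
    F: FnMut() -> Result<String, String>,
{
    let mut attempt = 0;
    loop {
        match build() {
            Ok(out) => return Ok(out),
            Err(err) if attempt < MAX_RETRIES && is_transient_build_error(&err) => {
                attempt += 1;
                // Base delay plus jitter desynchronizes parallel builds.
                sleep(base_delay + jitter());
            }
            Err(err) => return Err(err),
        }
    }
}
```

The jitter matters because parallel builds that fail at the same moment would otherwise retry at the same moment and collide on the same extraction again.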
…missions
- Add pagination_next to React quickstart to fix "Next" button linking to itself
- Fix Rust type table: Duration → TimeDuration to match example code
- Fix TypeScript optional column indexing to use table-level syntax
Documents that module-level state does not persist across reducer calls and that all persistent state must be stored in tables.
Covers inline binary storage, external storage with references, and hybrid approaches for thumbnails. Includes practical examples for all three server languages.
Adds a practical Quick Start example showing how to:
- Set up subscriptions
- Use row callbacks (onInsert, onDelete, onUpdate)
- Understand the subscription flow

Examples provided for TypeScript, C#, and Rust clients.
Co-authored-by: John Detter <4099508+jdetter@users.noreply.github.com>
Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
b0732c8 to 721eae2
/update-llm-benchmark
LLM Benchmark Results (ci-quickfix)
Compared against master branch baseline
Generated at: 2026-01-21T03:05:32.938Z
Failure Analysis
Benchmark Failure Analysis
Generated from:
Summary
Analysis
Analysis of SpacetimeDB Benchmark Test Failures
Rust Failures
Root Causes
Recommendations
Priority
C# Failures
Root Causes
Recommendations
Priority
Conclusion
Both languages suffer from structural and accessibility issues in their respective code samples, leading to a myriad of runtime and compilation problems. Prioritizing documentation fixes based on accessibility and naming conventions will significantly improve usability and reduce failures in benchmarks.
Description of Changes
Adds TypeScript as a third language for LLM benchmark tests alongside Rust and C#, and fixes table naming convention mismatches.
TypeScript Support:
- Lang::TypeScript variant with camelCase naming conventions
- templates/typescript/server/ with package.json, tsconfig.json, and index.ts
- spacetime build and spacetime publish
- tasks/typescript.txt files
Table Naming Fix:
- Rust: user (snake_case singular)
- C#: User (PascalCase singular)
- TypeScript: user (camelCase singular)
- table_name() helper to convert singular names to appropriate case per language
- spec.rs files use table_name("user", lang) instead of hardcoded "users"
CI/Hashing Improvements:
- compute_processed_context_hash() for language-specific hash computation after tab filtering
- CI check verifies both rustdoc_json and docs modes for Rust
- --hash-only mode skips golden builds
API and ABI breaking changes
None - these are internal benchmark tooling changes only.
Expected complexity level and risk
Complexity: 2
The changes add a new language following existing patterns for Rust and C#. The table naming fixes are straightforward find-and-replace style updates. Low risk since this only affects the benchmark tooling, not the core SpacetimeDB codebase.
Testing
cargo build -p xtask-llm-benchmark compiles successfully