[Server-Side Planning] Integrate DeltaCatalog with ServerSidePlannedTable and add tests #5622
| "DeltaCatalog", "loadTable") { | ||
| try { | ||
| super.loadTable(ident) match { | ||
| val table = super.loadTable(ident) |
DeltaCatalog.scala was refactored to AbstractDeltaCatalog.scala (commit 156e41f) so the loadTable changes are in this file.
gotcha
2ae8544 to
f099779
Compare
Rebased on latest master (now includes PR #5621 merge). The branch now has only 1 commit on top of the latest master.
f099779 to
62c2baa
Compare
…able and add E2E tests (#9)

* [Server-Side Planning] Integrate DeltaCatalog with ServerSidePlannedTable and add E2E tests

Implements DeltaCatalog integration to use ServerSidePlannedTable when Unity Catalog tables lack credentials, with comprehensive end-to-end testing and improvements.

Key changes:
- Add loadTable() logic in DeltaCatalog to detect UC tables without credentials
- Implement hasCredentials() to check for credential properties in table metadata
- Add ENABLE_SERVER_SIDE_PLANNING config flag for testing
- Add comprehensive integration tests with reflection-based credential testing
- Add Spark source code references for Identifier namespace structure
- Improve test suite by removing redundant aggregation test
- Revert verbose documentation comments in ServerSidePlannedTable

Test coverage:
- E2E: Full stack integration with DeltaCatalog
- E2E: Verify normal path unchanged when feature disabled
- loadTable() decision logic with ENABLE_SERVER_SIDE_PLANNING config (tests 3 scenarios including UC without credentials via reflection)

See Spark's LookupCatalog, CatalogAndIdentifier and ResolveSessionCatalog for Identifier namespace structure references.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Refactor PR9: Extract decision logic and improve test quality

Changes:
- Extracted shouldUseServerSidePlanning() method with boolean inputs
- Replaced reflection-based test with clean unit test
- Tests all input combinations without brittle reflection code
- Improved testability and maintainability

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Clean up ServerSidePlannedTable: Remove unnecessary helper function

- Remove misleading "Fallback" comment that didn't apply to all cases
- Inline create() function into tryCreate() to reduce indirection
- Simplify logic: directly handle client creation in try-catch

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Remove unnecessary logging and unused imports from ServerSidePlannedTable

- Remove conditional logging that differentiated between forced and fallback paths
- Remove unused imports: MDC and DeltaLogKeys

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Remove useless AutoCloseable implementation from ServerSidePlannedTable

The close() method was never called because:
- Spark's Table interface has no lifecycle hooks
- No code explicitly called close() on ServerSidePlannedTable instances
- No try-with-resources or Using() blocks wrapped it

HTTP connection cleanup happens via connection timeouts (30s) and JVM finalization, making AutoCloseable purely ceremonial dead code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Simplify verbose test suite comments

Reduced 20+ line formatted comments to simple 2-line descriptions. The bullet-pointed lists were over-documenting obvious test structure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Remove useless forTesting() wrapper method

The forTesting() method was just a wrapper around 'new' that added no value. Tests now directly instantiate ServerSidePlannedTable with the constructor.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Fix brittle test assertion for table capabilities

Removed assertion that table has exactly 1 capability, which would break if we add streaming support (MICRO_BATCH_READ, CONTINUOUS_READ) or other non-write capabilities later. Now tests what actually matters: supports BATCH_READ, does NOT support BATCH_WRITE.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Remove redundant SupportsWrite interface check in test

Testing !isInstanceOf[SupportsWrite] is redundant with checking !capabilities.contains(BATCH_WRITE) because:
- BATCH_WRITE capability requires SupportsWrite interface
- Having SupportsWrite without BATCH_WRITE would violate Spark contract

The capability check tests the public API contract, which is sufficient.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Remove e2e/integration terminology from ServerSidePlanningSuite

Changed:
- Test names: removed "E2E:" prefix
- Database name: integration_db → test_db
- Table name: e2e_test → planning_test

These tests use mock clients, not external systems, so e2e/integration terminology was misleading.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Merge test suites: keep existing ServerSidePlannedTableSuite, delete new ServerSidePlanningSuite

ServerSidePlanningSuite was added in this PR, while ServerSidePlannedTableSuite existed before. Merged them by keeping the existing file and deleting the new one, so the PR shows modification rather than deletion+addition.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Refactor tests and improve documentation

- Refactor ServerSidePlannedTableSuite: create database/table once in beforeAll()
- Use minimal early-return pattern in DeltaCatalog with oss-only markers
- Move current snapshot limitation docs to ServerSidePlanningClient interface
- Add UC credential injection link to hasCredentials() documentation
- Lowercase test names and remove redundant client verification

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Simplify current snapshot limitation documentation

Shorten documentation from 4 lines to 1 concise line.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Address PR9 review feedback

Changes:
1. Remove unnecessary afterEach() cleanup - no resource leaks to prevent
2. Make test 2 explicit by setting config=false instead of relying on cleanup
3. Remove redundant test "loadTable() decision logic" - already covered by other tests
4. Add explanation for deltahadoopconfiguration scalastyle suppression

Tests: 4/4 passing, scalastyle clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

* Address PR9 review comments: improve test quality and cleanup

- Add withServerSidePlanningEnabled helper method to prevent test pollution
  - Encapsulates factory and config setup/teardown in one place
  - Guarantees cleanup in finally block
  - Prevents tests from interfering with each other
- Replace white-box capability check with black-box insert test
  - Test actual insert behavior instead of inspecting capabilities
  - Verifies inserts succeed without SSP, fail with SSP enabled
  - More realistic end-to-end test of read-only behavior
- Remove OSS-only marker comments from DeltaCatalog
  - Clean up // oss-only-start and // oss-only-end comments
- Remove unused import (DeltaCatalog)

All tests passing (4/4):
- full query through DeltaCatalog with server-side planning
- verify normal path unchanged when feature disabled
- shouldUseServerSidePlanning() decision logic
- ServerSidePlannedTable is read-only

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

---------

Co-authored-by: Claude <[email protected]>
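The commit message describes a withServerSidePlanningEnabled helper that sets up the factory and config and guarantees cleanup in a finally block. A minimal sketch of that pattern follows. The `conf` map, `clientFactory` variable, and `EnableKey` name are hypothetical stand-ins for the Spark session conf, the client factory registration, and the real config key; the helper's actual signature in the PR is not shown on this page.

```scala
import scala.collection.mutable

object TestConfigHelperSketch {
  // Hypothetical stand-ins for the session conf and the registered client factory.
  val conf: mutable.Map[String, String] = mutable.Map.empty
  var clientFactory: Option[String] = None

  // Hypothetical key name, used only for this illustration.
  val EnableKey = "example.serverSidePlanning.enabled"

  // Enables the flag and registers a factory for the duration of `body`,
  // then restores the previous state even if `body` throws.
  def withServerSidePlanningEnabled[T](factory: String)(body: => T): T = {
    val previousConf = conf.get(EnableKey)
    val previousFactory = clientFactory
    conf(EnableKey) = "true"
    clientFactory = Some(factory)
    try body
    finally {
      previousConf match {
        case Some(value) => conf(EnableKey) = value
        case None => conf.remove(EnableKey)
      }
      clientFactory = previousFactory
    }
  }
}
```

Because the restore step lives in `finally`, a failing test body cannot leave the flag enabled for the next test, which is the pollution the commit message refers to.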
62c2baa to
c94cf91
Compare
| val catalogName = if (ident.namespace().length > 1) { | ||
| ident.namespace().head | ||
| } else { | ||
| "spark_catalog" | ||
| } |
Note for a future PR: we should test table, catalog, and schema names that contain special characters such as hyphens. We found this out the hard way in the uc-spark connector when a user complained.
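A rough sketch of what such a follow-up test could look like, assuming the catalog-name logic stays as in the snippet above. `Ident` is a hypothetical stand-in for Spark's `Identifier` (which exposes `namespace()` as an array), so this only exercises the extraction rule itself, not SQL parsing or quoting.

```scala
// Hypothetical stand-in for Spark's Identifier; only namespace and name are modeled.
final case class Ident(namespace: Seq[String], name: String)

object CatalogNameSketch {
  // Mirrors the reviewed snippet: with more than one namespace part, the first
  // part is taken as the catalog name; otherwise fall back to "spark_catalog".
  def catalogName(ident: Ident): String =
    if (ident.namespace.length > 1) ident.namespace.head else "spark_catalog"

  def main(args: Array[String]): Unit = {
    // Names containing hyphens should pass through unchanged.
    val hyphenated = Ident(Seq("my-catalog", "my-schema"), "my-table")
    assert(catalogName(hyphenated) == "my-catalog")

    // A single-part namespace falls back to the default catalog.
    val singlePart = Ident(Seq("my-schema"), "my-table")
    assert(catalogName(singlePart) == "spark_catalog")
  }
}
```

A real test would also need to cover how these names survive SQL parsing and backtick quoting, which this sketch does not attempt.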
🥞 Stacked PR
Use this link to review all changes.
Stack:
Summary
Integrates server-side planning into DeltaCatalog.loadTable():
Decision Logic
ServerSidePlanning is used when:
- ENABLE_SERVER_SIDE_PLANNING config is true (force flag for testing)

Key Changes
- DeltaCatalog.loadTable(): Add ServerSidePlannedTable.tryCreate() call
- ServerSidePlannedTable.shouldUseServerSidePlanning(): Decision logic
- DeltaSQLConf.ENABLE_SERVER_SIDE_PLANNING config

Testing
4 tests covering:
Note: Tests verified in fork. Upstream master has kernel-api compilation issues (pre-existing, not related to these changes).
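
For reviewers, the decision logic summarized above can be sketched as a pure function of boolean inputs. This is an illustration pieced together from the PR description and commit message (Unity Catalog table without credentials, or the force flag set), not the actual implementation; the parameter names and object name are assumptions.

```scala
object ServerSidePlanningDecisionSketch {
  // Assumed inputs: the PR only states that the method takes boolean inputs.
  //   isUnityCatalog: the table is a Unity Catalog table
  //   hasCredentials: the table metadata carries credential properties
  //   forceEnabled:   ENABLE_SERVER_SIDE_PLANNING is true (testing flag)
  def shouldUseServerSidePlanning(
      isUnityCatalog: Boolean,
      hasCredentials: Boolean,
      forceEnabled: Boolean): Boolean =
    forceEnabled || (isUnityCatalog && !hasCredentials)

  def main(args: Array[String]): Unit = {
    // Enumerate all eight input combinations, as a unit test would.
    for {
      uc <- Seq(true, false)
      creds <- Seq(true, false)
      force <- Seq(true, false)
    } {
      val decision = shouldUseServerSidePlanning(uc, creds, force)
      println(s"uc=$uc creds=$creds force=$force -> $decision")
    }
  }
}
```

Keeping the decision a pure function is what lets the test cover every input combination without reflection, as the refactor commit describes.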