Skip to content

Conversation

@ogzhanolguncu
Copy link
Contributor

@ogzhanolguncu ogzhanolguncu commented Nov 6, 2025

What does this PR do?

This PR:

  • Improves log parsing logic a bit
  • Adds a way to prevent retries when terminal errors happen such as "wrong dockerfile path".
  • Moves the files around to make them a bit more pleasant to look
  • Adds terminal error list
  • Adds retry for our own db queries.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Chore (refactoring code, technical debt, workflow improvements)
  • Enhancement (small improvements)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How should this be tested?

  • Run your docker setup
docker compose -f ../deployment/docker-compose.yaml up mysql planetscale clickhouse redis s3 gw ctrl krane clickhouse -d --build
  • Make a project and grab your projectID
  • Make a local deployment
  • Verify that you have logs in the CH
  • You can also compare against depot.dev build logs to verify the integrity
  • Now make a faulty deployment easiest way to do is update your dockerfile, just do this:
From:
COPY --from=builder /app/main .
To:
COPY --from=builder /app/something_stupid_power_move_slay .

This is enough to break depot flow.

  • Now, make sure Restate doesn't retry and you don't end up with duplicated logs

@changeset-bot
Copy link

changeset-bot bot commented Nov 6, 2025

⚠️ No Changeset found

Latest commit: ec59a17

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel
Copy link

vercel bot commented Nov 6, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
dashboard Ready Ready Preview Comment Nov 18, 2025 11:36am
engineering Ready Ready Preview Comment Nov 18, 2025 11:36am

This was referenced Nov 6, 2025
Copy link
Contributor Author

ogzhanolguncu commented Nov 6, 2025

@ogzhanolguncu ogzhanolguncu mentioned this pull request Nov 6, 2025
6 tasks
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 6, 2025

📝 Walkthrough

Walkthrough

Introduce centralized terminal-vs-retriable build error classification and wrapping, add retry-wrapped DB ops, and refactor BuildKit status handling to buffer, sort, and emit chronological events via a larger status channel; update deploy workflow to treat terminal build errors as permanent failures.

Changes

Cohort / File(s) Summary
Error handling
go/apps/ctrl/services/build/backend/depot/errors.go
New file: adds terminal error pattern matching (isTerminalBuildError) and a wrapBuildError helper that produces a Restate terminal error for matched messages or a Connect error otherwise.
Build creation & status processing
go/apps/ctrl/services/build/backend/depot/create_build.go
Replaced inline error construction with wrapBuildError; introduced buildStatusChannelBuffer = 1000; renamed/refactored processBuildStatusprocessBuildStatusLogs, changed to buffer status events and deterministically emit ordered vertex completions and logs (sorting by timestamp), removed per-status immediate log emission.
DB retries in depot project flow
go/apps/ctrl/services/build/backend/depot/create_build.go (getOrCreateDepotProject)
Wrapped project lookup and depot_project_id update calls in WithRetryContext to retry transient DB errors; update path now executes inside retry context (preserves prior retry semantics).
Deploy workflow handling
go/apps/ctrl/workflows/deploy/deploy_handler.go
Detect terminal build errors via restate.IsTerminalError(err) during Docker build step; treat terminal errors as permanent failures (log and return non-retriable error), while preserving retry behavior for non-terminal errors.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant BuildSvc as Build Service
    participant StatusBuf as Status Buffer
    participant DB
    participant ErrorWrap as wrapBuildError
    participant Restate

    Client->>BuildSvc: Request CreateBuild
    BuildSvc->>DB: WithRetryContext(Get/Update depot project)
    DB-->>BuildSvc: OK / Err

    BuildSvc->>BuildSvc: Start Solve / connect BuildKit
    BuildSvc->>StatusBuf: Send status events (buffered, channel size = 1000)
    StatusBuf->>StatusBuf: Accumulate events
    StatusBuf->>StatusBuf: Sort events by timestamp
    StatusBuf->>BuildSvc: Emit ordered vertex completions & logs

    alt Any failure encountered
        BuildSvc->>ErrorWrap: wrapBuildError(err, code, msg)
        alt ErrorWrap returns TerminalError
            ErrorWrap->>Restate: TerminalError (no retry)
            Restate->>Client: Permanent failure
        else Non-terminal
            ErrorWrap->>Client: Connect error (retryable)
            Client->>Restate: Retry via restate
        end
    else Build completes successfully
        BuildSvc->>Client: Success
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay special attention to:
    • create_build.go: correctness of buffered status accumulation, sorting order, memory/latency implications, and channel sizing usage.
    • errors.go: terminal pattern coverage and false-positive/negative risks in pattern matching.
    • getOrCreateDepotProject retry semantics: idempotency of updates and correct error propagation.
    • deploy_handler.go: integration of terminal detection with existing retry and logging behavior.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the main feature: preventing retries when terminal errors occur, which aligns with the primary changes in the changeset.
Description check ✅ Passed The description provides comprehensive context including what the PR does, type of change, and detailed testing instructions; all required sections are completed with meaningful content.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch deploy-logs/error-handling

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 6, 2025

Thank you for following the naming conventions for pull request titles! 🙏

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f0b61e and 4046eef.

📒 Files selected for processing (3)
  • go/apps/ctrl/services/build/backend/depot/create_build.go (7 hunks)
  • go/apps/ctrl/services/build/backend/depot/errors.go (1 hunks)
  • go/apps/ctrl/workflows/deploy/deploy_handler.go (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 3564
File: go/cmd/cli/commands/deploy/deploy.go:71-73
Timestamp: 2025-07-15T15:11:30.840Z
Learning: In the go/cmd/cli/commands/deploy/ CLI codebase, ogzhanolguncu uses a UI-layer error handling pattern where errors are displayed through ui.PrintError() and ui.PrintErrorDetails(), followed by returning nil to indicate the error has been handled at the UI level, rather than propagating errors up the call stack.
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 3564
File: go/cmd/cli/commands/deploy/deploy.go:153-158
Timestamp: 2025-07-16T09:18:45.379Z
Learning: In the go/cmd/cli/commands/deploy/ CLI codebase, ogzhanolguncu prefers to allow deployment to continue even when Docker push fails (around lines 153-158 in deploy.go) because the team is working locally and needs this behavior for local development workflows where registry access might not be available.
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 3297
File: apps/dashboard/lib/trpc/routers/authorization/roles/query.ts:210-323
Timestamp: 2025-06-04T20:13:12.060Z
Learning: The user ogzhanolguncu prefers explicit, duplicated code over abstracted helper functions when it improves readability, even if it means some duplication in filter building functions in the authorization roles query module.
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 2825
File: apps/dashboard/app/(app)/logs-v2/hooks/use-bookmarked-filters.ts:0-0
Timestamp: 2025-01-30T20:51:44.359Z
Learning: The user (ogzhanolguncu) prefers to handle refactoring suggestions in separate PRs to maintain focus in the current PR.
📚 Learning: 2025-07-16T09:18:45.379Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 3564
File: go/cmd/cli/commands/deploy/deploy.go:153-158
Timestamp: 2025-07-16T09:18:45.379Z
Learning: In the go/cmd/cli/commands/deploy/ CLI codebase, ogzhanolguncu prefers to allow deployment to continue even when Docker push fails (around lines 153-158 in deploy.go) because the team is working locally and needs this behavior for local development workflows where registry access might not be available.

Applied to files:

  • go/apps/ctrl/workflows/deploy/deploy_handler.go
📚 Learning: 2025-08-21T15:54:45.198Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3825
File: go/internal/services/usagelimiter/limit.go:38-0
Timestamp: 2025-08-21T15:54:45.198Z
Learning: In go/internal/services/usagelimiter/limit.go, the UpdateKeyCreditsDecrement operation cannot be safely wrapped with db.WithRetry due to the lack of idempotency mechanisms in the current tech stack. Retrying this non-idempotent write operation risks double-charging users if the first attempt commits but the client sees a transient error.

Applied to files:

  • go/apps/ctrl/services/build/backend/depot/create_build.go
🧬 Code graph analysis (1)
go/apps/ctrl/services/build/backend/depot/create_build.go (3)
go/pkg/db/retry.go (1)
  • WithRetryContext (36-46)
go/pkg/db/project_find_by_id.sql_generated.go (1)
  • FindProjectByIdRow (31-44)
go/pkg/db/project_update_depot_id.sql_generated.go (1)
  • UpdateProjectDepotIDParams (21-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Build / Build
  • GitHub Check: detect_changes / Detect Changes
  • GitHub Check: autofix
🔇 Additional comments (10)
go/apps/ctrl/services/build/backend/depot/errors.go (2)

24-41: LGTM! Solid terminal error detection logic.

The case-insensitive pattern matching approach is appropriate for error classification, and the nil check prevents panics.


43-51: LGTM! Error wrapping logic correctly distinguishes terminal vs. retriable errors.

The function properly wraps errors and returns the appropriate type (Restate TerminalError or Connect error) based on the terminal status.

go/apps/ctrl/workflows/deploy/deploy_handler.go (1)

146-157: LGTM! Terminal error handling prevents unnecessary retries.

The implementation correctly distinguishes between terminal and non-terminal build errors, preventing retry pollution in logs for permanent failures (broken Dockerfile, missing files, etc.) while maintaining retry behavior for transient issues. The detailed logging helps with debugging.

go/apps/ctrl/services/build/backend/depot/create_build.go (7)

104-104: LGTM! Consistent error wrapping.

The switch to wrapBuildError enables proper terminal error detection for build creation failures.


128-128: LGTM! Consistent error wrapping.

The switch to wrapBuildError enables proper terminal error detection for machine acquisition failures.


145-145: LGTM! Consistent error wrapping.

The switch to wrapBuildError enables proper terminal error detection for BuildKit connection failures.


176-176: LGTM! Consistent error wrapping.

The switch to wrapBuildError enables proper terminal error detection for build execution failures, which is the primary goal of this PR.


233-235: LGTM! Safe retry wrapper for read operation.

Wrapping the FindProjectById query with WithRetryContext adds resilience for transient database errors. Read operations are naturally idempotent and safe to retry.


164-164: Unable to verify buffer size reduction claim; request manual verification.

The review comment claims the buffer was reduced from 100 to 10, but I found no evidence of this in the codebase. The current code shows a buffer size of 10, and git history is unavailable to confirm the prior size.

While the concern about channel buffer capacity is theoretically valid—a small buffer can cause blocking if the producer (buildkitClient.Solve) generates updates faster than the consumer (processBuildStatus) can process them—I cannot determine from the codebase alone whether:

  1. The buffer size was actually 100 previously
  2. The typical rate of status updates exceeds the consumer's throughput
  3. 10 is insufficient for typical workloads

The processBuildStatus function appears to perform straightforward processing (iterating through logs and marking completed vertices), suggesting low latency per status update, but without metrics or load testing data, the adequacy of the buffer size cannot be definitively confirmed.


276-286: No issues found. The operation is safe as implemented.

The UpdateProjectDepotID operation is confirmed idempotent:

  1. The Depot project is already created via external API before the UPDATE (line 269)
  2. The UPDATE is a fixed-value SET operation, not an increment/decrement
  3. Retrying with the same arguments produces the same database state
  4. Unlike the learnings case (credits decrement), this has no quantitative or financial side effects

The retry wrapper is appropriate here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4046eef and f6e3c13.

📒 Files selected for processing (1)
  • go/apps/ctrl/services/build/backend/depot/create_build.go (10 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: Repo: unkeyed/unkey — Preference: If imeyer comments “issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/workflows/runbook-freshness-check.yaml:157-173
Timestamp: 2025-08-08T14:59:52.283Z
Learning: Repo unkeyed/unkey: When a CI/workflow fix is deferred, imeyer prefers a thorough GitHub issue be opened with sections (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References) and assigned to imeyer, including backlinks to the originating PR and comment.
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/workflows/job_detect_changes.yaml:111-114
Timestamp: 2025-08-08T16:10:00.224Z
Learning: Repo: unkeyed/unkey — When imeyer writes “make me an issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 3297
File: apps/dashboard/lib/trpc/routers/authorization/roles/query.ts:210-323
Timestamp: 2025-06-04T20:13:12.060Z
Learning: The user ogzhanolguncu prefers explicit, duplicated code over abstracted helper functions when it improves readability, even if it means some duplication in filter building functions in the authorization roles query module.
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 2825
File: apps/dashboard/app/(app)/logs-v2/hooks/use-bookmarked-filters.ts:0-0
Timestamp: 2025-01-30T20:51:44.359Z
Learning: The user (ogzhanolguncu) prefers to handle refactoring suggestions in separate PRs to maintain focus in the current PR.
📚 Learning: 2025-08-21T15:54:45.198Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3825
File: go/internal/services/usagelimiter/limit.go:38-0
Timestamp: 2025-08-21T15:54:45.198Z
Learning: In go/internal/services/usagelimiter/limit.go, the UpdateKeyCreditsDecrement operation cannot be safely wrapped with db.WithRetry due to the lack of idempotency mechanisms in the current tech stack. Retrying this non-idempotent write operation risks double-charging users if the first attempt commits but the client sees a transient error.

Applied to files:

  • go/apps/ctrl/services/build/backend/depot/create_build.go
🧬 Code graph analysis (1)
go/apps/ctrl/services/build/backend/depot/create_build.go (5)
go/pkg/db/retry.go (1)
  • WithRetryContext (36-46)
go/pkg/db/project_find_by_id.sql_generated.go (1)
  • FindProjectByIdRow (31-44)
go/pkg/db/project_update_depot_id.sql_generated.go (1)
  • UpdateProjectDepotIDParams (21-25)
go/apps/ctrl/services/build/backend/depot/service.go (1)
  • Depot (28-38)
go/pkg/clickhouse/schema/types.go (2)
  • BuildStepV1 (101-112)
  • BuildStepLogV1 (117-124)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Build / Build
🔇 Additional comments (6)
go/apps/ctrl/services/build/backend/depot/create_build.go (6)

36-38: Good addition with clear documentation.

The increased buffer size and explanatory comment help prevent log loss for large Dockerfiles. The constant is well-documented.


109-109: Consistent error wrapping improves error handling.

The centralized error wrapping via wrapBuildError correctly classifies terminal vs. retriable build errors across all failure paths, aligning with the PR objective to prevent retries on terminal errors.

Also applies to: 133-133, 150-150, 181-181


238-240: Appropriate use of retry wrapper for read operation.

Wrapping the database query with WithRetryContext is safe and correct, as read operations have no side effects and can be retried on transient failures.


281-291: Retry wrapper is safe here due to idempotent update.

The retry wrapper is appropriate for this update operation because both depot_project_id and updated_at are computed before entering the retry block (line 280), making the operation idempotent. Multiple retries will set the same values without side effects.

Based on learnings


304-308: Clean data structure for event ordering.

The event struct provides a clear abstraction for collecting and sorting BuildKit events chronologically, supporting the refactored log processing logic.


310-397: Well-structured refactor for chronological event processing.

The refactored implementation correctly:

  • Buffers all BuildKit events to handle out-of-order delivery
  • Sorts events chronologically using stable sort for deterministic ordering when timestamps collide
  • Prevents duplicate vertex processing via the completed map
  • Handles edge cases like cached steps with instant completion

The approach of buffering before processing addresses the core issue of BuildKit's asynchronous event delivery.

Note: Error handling for ClickHouse buffering operations (BufferBuildStep, BufferBuildStepLog) is not present, but given this runs in a goroutine with no error propagation mechanism, this is a broader architectural consideration beyond this PR's scope.

@ogzhanolguncu
Copy link
Contributor Author

@Flo4604 I tried to add our uid instead of sha and it made the code a bit more crowded than it should be. Because, we need to track what sha actually point to which uid, I think its just not worth it. If that sha ever becomes problematic, I'm to going back and replacing it with uid. I say for now let's stick to what we have, wdyt?

@Flo4604
Copy link
Member

Flo4604 commented Nov 13, 2025

@Flo4604 I tried to add our uid instead of sha and it made the code a bit more crowded than it should be. Because, we need to track what sha actually point to which uid, I think its just not worth it. If that sha ever becomes problematic, I'm to going back and replacing it with uid. I say for now let's stick to what we have, wdyt?

fine with me

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
go/apps/ctrl/services/build/backend/depot/create_build.go (1)

168-169: Goroutine synchronization issue remains unaddressed.

As previously noted, the goroutine processing build status logs runs asynchronously without synchronization. CreateBuild may return before processBuildStatusLogs completes, potentially resulting in incomplete logs in ClickHouse.

🧹 Nitpick comments (1)
go/apps/ctrl/services/build/backend/depot/create_build.go (1)

34-37: Clarify the buffer drop behavior and sizing rationale.

The comment states logs will be "dropped" if the buffer is too low. In standard Go channels, a full buffered channel blocks the sender rather than dropping messages. If BuildKit has specific drop behavior, this should be documented. Additionally, provide rationale for choosing 1000 as the buffer size (e.g., based on empirical testing with large Dockerfiles).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f6e3c13 and 266963b.

📒 Files selected for processing (1)
  • go/apps/ctrl/services/build/backend/depot/create_build.go (9 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/workflows/runbook-freshness-check.yaml:157-173
Timestamp: 2025-08-08T14:59:52.283Z
Learning: Repo unkeyed/unkey: When a CI/workflow fix is deferred, imeyer prefers a thorough GitHub issue be opened with sections (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References) and assigned to imeyer, including backlinks to the originating PR and comment.
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: Repo: unkeyed/unkey — Preference: If imeyer comments “issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/workflows/job_detect_changes.yaml:111-114
Timestamp: 2025-08-08T16:10:00.224Z
Learning: Repo: unkeyed/unkey — When imeyer writes “make me an issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: For repo unkeyed/unkey and PR review workflows: When imeyer comments "issue" on a thread, automatically create a thorough GitHub issue (sections: Summary, Impact, Where, Repro/Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and the specific comment, and assign the issue to imeyer.
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 3297
File: apps/dashboard/lib/trpc/routers/authorization/roles/query.ts:210-323
Timestamp: 2025-06-04T20:13:12.060Z
Learning: The user ogzhanolguncu prefers explicit, duplicated code over abstracted helper functions when it improves readability, even if it means some duplication in filter building functions in the authorization roles query module.
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 2825
File: apps/dashboard/app/(app)/logs-v2/hooks/use-bookmarked-filters.ts:0-0
Timestamp: 2025-01-30T20:51:44.359Z
Learning: The user (ogzhanolguncu) prefers to handle refactoring suggestions in separate PRs to maintain focus in the current PR.
📚 Learning: 2025-08-21T15:54:45.198Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3825
File: go/internal/services/usagelimiter/limit.go:38-0
Timestamp: 2025-08-21T15:54:45.198Z
Learning: In go/internal/services/usagelimiter/limit.go, the UpdateKeyCreditsDecrement operation cannot be safely wrapped with db.WithRetry due to the lack of idempotency mechanisms in the current tech stack. Retrying this non-idempotent write operation risks double-charging users if the first attempt commits but the client sees a transient error.

Applied to files:

  • go/apps/ctrl/services/build/backend/depot/create_build.go
🧬 Code graph analysis (1)
go/apps/ctrl/services/build/backend/depot/create_build.go (5)
go/pkg/db/retry.go (1)
  • WithRetryContext (36-46)
go/pkg/db/project_find_by_id.sql_generated.go (1)
  • FindProjectByIdRow (31-44)
go/pkg/db/project_update_depot_id.sql_generated.go (1)
  • UpdateProjectDepotIDParams (21-25)
go/apps/ctrl/services/build/backend/depot/service.go (1)
  • Depot (28-38)
go/pkg/clickhouse/schema/types.go (1)
  • BuildStepLogV1 (117-124)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Build / Build
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (4)
go/apps/ctrl/services/build/backend/depot/create_build.go (4)

237-239: Good addition of retry logic for database query.

Wrapping the FindProjectById query with db.WithRetryContext improves resilience against transient database errors. SELECT operations are inherently idempotent and safe to retry.


280-290: Retry logic for UPDATE is safe and correct.

The UpdateProjectDepotID operation is wrapped in db.WithRetryContext, which is appropriate here. Unlike non-idempotent operations (e.g., decrement), this UPDATE is idempotent—setting depot_project_id to the same value multiple times produces the same result. The struct{}{} return is necessary to satisfy the WithRetryContext[T any] signature.


303-345: Well-structured refactoring of build status processing.

The refactored processBuildStatusLogs correctly:

  • Tracks vertices with logs to populate the HasLogs field
  • Uses a completed map to prevent duplicate vertex processing
  • Includes defensive nil checks for vertices
  • Buffers both logs and vertex completions for efficient batch processing

The logic is sound and the implementation is clear.


108-108: Terminal error classification in wrapBuildError is correctly implemented and comprehensive.

The verification confirms that wrapBuildError properly distinguishes terminal from retriable errors. The implementation uses isTerminalBuildError which matches against 9 terminal error patterns covering all major scenarios: wrong dockerfile path ("no such file or directory"), broken dockerfile ("failed to compute cache key"), dockerfile parsing errors, authentication failures ("permission denied", "unauthenticated"), depot internal errors, region config issues, and build failures. Terminal errors are wrapped with restate.TerminalError() to prevent retries, while others use connect.NewError() to allow retries. The implementation is sound and aligns with Restate's error handling model.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants