@jopemachine (Member) commented Dec 19, 2025

resolves #7523 (BA-3516)

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

@jopemachine changed the title from "feat(BA-3516): Integrate ModifyScalingGroup action with existing GQL resolver" to "refactor(BA-3516): Integrate ModifyScalingGroup action with existing GQL resolver" Dec 19, 2025
@github-actions bot added the size:M (30~100 LoC) and comp:manager (Related to Manager component) labels Dec 19, 2025
@jopemachine force-pushed the feat/BA-3516 branch 2 times, most recently from e4e95b0 to 9741bf3 January 5, 2026 02:11
Base automatically changed from feat/BA-3486 to main January 6, 2026 08:20
@github-actions bot added the size:L (100~500 LoC) label and removed the size:M (30~100 LoC) label Jan 7, 2026
@jopemachine marked this pull request as ready for review January 7, 2026 06:19
Copilot AI review requested due to automatic review settings January 7, 2026 06:19
Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors the ModifyScalingGroup GraphQL mutation to use the new action/repository pattern instead of direct database operations. The refactor integrates the existing ModifyScalingGroupAction with the GraphQL resolver and adds resilience handling to the repository layer.

Key Changes:

  • Added resilience decorator to update_scaling_group method in the repository
  • Converted ModifyScalingGroupInput to use an Updater pattern via new to_updater() method
  • Replaced direct SQL update with action-based approach using ModifyScalingGroupAction

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/ai/backend/manager/repositories/scaling_group/repository.py | Added @scaling_group_repository_resilience.apply() decorator to update_scaling_group method for consistent error handling |
| src/ai/backend/manager/api/gql_legacy/scaling_group.py | Refactored ModifyScalingGroup mutation to use action pattern; added to_updater() method to convert GraphQL input to repository updater spec |
| changes/7524.enhance.md | Added changelog entry documenting the integration |

Comment on lines +626 to +635
    is_active=(
        OptionalState.update(self.is_active)
        if self.is_active is not None
        else OptionalState.nop()
    ),
    is_public=(
        OptionalState.update(self.is_public)
        if self.is_public is not None
        else OptionalState.nop()
    ),
Copilot AI Jan 7, 2026

The manual is not None check is incorrect for handling GraphQL optional fields. When a GraphQL optional field is not provided, it has the value Undefined (not None), but this check will still evaluate to True for Undefined values, causing unintended updates.

The codebase has a consistent pattern of using OptionalState.from_graphql() which properly handles Undefined (not provided) vs actual values. This should be:

status_spec = ScalingGroupStatusUpdaterSpec(
    is_active=OptionalState.from_graphql(self.is_active),
    is_public=OptionalState.from_graphql(self.is_public),
)

The same pattern is used correctly throughout the codebase, for example in container_registry.py and endpoint.py.
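
The difference can be sketched in isolation. The following is a minimal illustration only, assuming a plain sentinel object stands in for GraphQL's Undefined; `_UndefinedType`, `OptionalStateSketch`, and `manual_check` are hypothetical names, not the project's actual `OptionalState`. How the real helper treats an explicit None is not shown in this thread, so the sketch makes no claim about that case.

```python
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar

T = TypeVar("T")


class _UndefinedType:
    """Hypothetical stand-in for the GraphQL 'field not provided' sentinel."""

    def __repr__(self) -> str:
        return "Undefined"


Undefined = _UndefinedType()


@dataclass(frozen=True)
class OptionalStateSketch(Generic[T]):
    """Two-state holder: either 'update to value' or 'no-op' (illustrative only)."""

    is_update: bool
    value: Optional[T] = None

    @classmethod
    def update(cls, value: T) -> "OptionalStateSketch[T]":
        return cls(is_update=True, value=value)

    @classmethod
    def nop(cls) -> "OptionalStateSketch[T]":
        return cls(is_update=False)

    @classmethod
    def from_graphql(cls, value: Any) -> "OptionalStateSketch[T]":
        # Assumption: the sentinel means "not provided"; anything else is an update.
        if value is Undefined:
            return cls.nop()
        return cls.update(value)


def manual_check(value: Any) -> OptionalStateSketch:
    # The pattern under review: only None is treated as "not provided".
    if value is not None:
        return OptionalStateSketch.update(value)
    return OptionalStateSketch.nop()
```

With these stand-ins, `manual_check(Undefined)` produces an update that carries the sentinel itself, whereas `OptionalStateSketch.from_graphql(Undefined)` produces a no-op.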

Suggested change
-    is_active=(
-        OptionalState.update(self.is_active)
-        if self.is_active is not None
-        else OptionalState.nop()
-    ),
-    is_public=(
-        OptionalState.update(self.is_public)
-        if self.is_public is not None
-        else OptionalState.nop()
-    ),
+    is_active=OptionalState.from_graphql(self.is_active),
+    is_public=OptionalState.from_graphql(self.is_public),

Comment on lines +638 to +659
    description=(
        TriState.update(self.description)
        if self.description is not None
        else TriState.nop()
    ),
)
network_spec = ScalingGroupNetworkConfigUpdaterSpec(
    wsproxy_addr=(
        TriState.update(self.wsproxy_addr)
        if self.wsproxy_addr is not None
        else TriState.nop()
    ),
    wsproxy_api_token=(
        TriState.update(self.wsproxy_api_token)
        if self.wsproxy_api_token is not None
        else TriState.nop()
    ),
    use_host_network=(
        OptionalState.update(self.use_host_network)
        if self.use_host_network is not None
        else OptionalState.nop()
    ),
Copilot AI Jan 7, 2026

The manual is not None check is incorrect for handling GraphQL optional fields. The description, wsproxy_addr, and wsproxy_api_token fields can be explicitly set to null in GraphQL (which should nullify the field) or not provided at all (which should leave it unchanged).

Use TriState.from_graphql() which properly handles all three states:

  • Undefined (field not provided) → nop() (no change)
  • None (explicitly set to null) → nullify() (set to null)
  • Actual value → update(value) (update with value)

This should be:

metadata_spec = ScalingGroupMetadataUpdaterSpec(
    description=TriState.from_graphql(self.description),
)
network_spec = ScalingGroupNetworkConfigUpdaterSpec(
    wsproxy_addr=TriState.from_graphql(self.wsproxy_addr),
    wsproxy_api_token=TriState.from_graphql(self.wsproxy_api_token),
    use_host_network=OptionalState.from_graphql(self.use_host_network),
)

This pattern is consistently used throughout the codebase in similar mutations.
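
The three-state mapping listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's `TriState` implementation: the sentinel class, the `Op` enum, and the helpers `tri_state_from_graphql` and `apply_to_row` are all hypothetical names, and a plain dict stands in for a database row.

```python
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class _UndefinedType:
    """Hypothetical stand-in for the GraphQL 'field not provided' sentinel."""


Undefined = _UndefinedType()


class Op(Enum):
    NOP = "nop"          # field omitted: leave the column unchanged
    NULLIFY = "nullify"  # field explicitly null: set the column to NULL
    UPDATE = "update"    # field has a value: write that value


def tri_state_from_graphql(value: Any) -> Tuple[Op, Optional[Any]]:
    """Classify a GraphQL input value into one of the three states."""
    if value is Undefined:
        return (Op.NOP, None)
    if value is None:
        return (Op.NULLIFY, None)
    return (Op.UPDATE, value)


def apply_to_row(
    row: Dict[str, Any], field: str, state: Tuple[Op, Optional[Any]]
) -> Dict[str, Any]:
    """Return a copy of the row with the state applied to one field."""
    op, value = state
    updated = dict(row)
    if op is Op.NULLIFY:
        updated[field] = None
    elif op is Op.UPDATE:
        updated[field] = value
    # Op.NOP falls through and leaves the field untouched.
    return updated
```

A manual `is not None` check cannot express the middle case: an explicit null would be mapped to a no-op instead of clearing the field.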

Suggested change
-    description=(
-        TriState.update(self.description)
-        if self.description is not None
-        else TriState.nop()
-    ),
-)
-network_spec = ScalingGroupNetworkConfigUpdaterSpec(
-    wsproxy_addr=(
-        TriState.update(self.wsproxy_addr)
-        if self.wsproxy_addr is not None
-        else TriState.nop()
-    ),
-    wsproxy_api_token=(
-        TriState.update(self.wsproxy_api_token)
-        if self.wsproxy_api_token is not None
-        else TriState.nop()
-    ),
-    use_host_network=(
-        OptionalState.update(self.use_host_network)
-        if self.use_host_network is not None
-        else OptionalState.nop()
-    ),
+    description=TriState.from_graphql(self.description),
+)
+network_spec = ScalingGroupNetworkConfigUpdaterSpec(
+    wsproxy_addr=TriState.from_graphql(self.wsproxy_addr),
+    wsproxy_api_token=TriState.from_graphql(self.wsproxy_api_token),
+    use_host_network=OptionalState.from_graphql(self.use_host_network),

Comment on lines 661 to 684
driver_spec = ScalingGroupDriverConfigUpdaterSpec(
    driver=(
        OptionalState.update(self.driver)
        if self.driver is not None
        else OptionalState.nop()
    ),
    driver_opts=(
        OptionalState.update(self.driver_opts)
        if self.driver_opts is not None
        else OptionalState.nop()
    ),
)
scheduler_spec = ScalingGroupSchedulerConfigUpdaterSpec(
    scheduler=(
        OptionalState.update(self.scheduler)
        if self.scheduler is not None
        else OptionalState.nop()
    ),
    scheduler_opts=(
        OptionalState.update(ScalingGroupOpts.from_json(self.scheduler_opts))
        if self.scheduler_opts is not None
        else OptionalState.nop()
    ),
)
Copilot AI Jan 7, 2026

The manual is not None check is incorrect for handling GraphQL optional fields. The driver, driver_opts, scheduler, and scheduler_opts fields should use OptionalState.from_graphql() to properly distinguish between field not provided (Undefined) and field explicitly set.

This should be:

driver_spec = ScalingGroupDriverConfigUpdaterSpec(
    driver=OptionalState.from_graphql(self.driver),
    driver_opts=OptionalState.from_graphql(self.driver_opts),
)
scheduler_spec = ScalingGroupSchedulerConfigUpdaterSpec(
    scheduler=OptionalState.from_graphql(self.scheduler),
    scheduler_opts=OptionalState.from_graphql(
        ScalingGroupOpts.from_json(self.scheduler_opts)
        if self.scheduler_opts is not None
        else None
    ),
)

Note: For scheduler_opts, the value transformation to ScalingGroupOpts.from_json() needs to be done before passing to from_graphql(), but only when the value is not None.
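
One caveat about the guard in the snippet above: `Undefined is not None` evaluates to True, so an omitted field would still be passed to the transformation. A small helper that skips both None and the sentinel may be safer. The sketch below is only an illustration under assumptions: `map_provided` is a hypothetical helper, the sentinel class stands in for GraphQL's Undefined, and `parse_opts` (using `json.loads` on a string) is a stand-in for `ScalingGroupOpts.from_json`, whose real signature is not shown here.

```python
import json
from typing import Any, Callable


class _UndefinedType:
    """Hypothetical stand-in for the GraphQL 'field not provided' sentinel."""


Undefined = _UndefinedType()


def map_provided(value: Any, transform: Callable[[Any], Any]) -> Any:
    """Apply transform only to a real value; pass None and Undefined through."""
    if value is None or value is Undefined:
        return value
    return transform(value)


def parse_opts(raw: str) -> dict:
    # Hypothetical stand-in for ScalingGroupOpts.from_json.
    return json.loads(raw)
```

The result of `map_provided(self.scheduler_opts, ...)` could then be handed to `from_graphql()`, so that the sentinel reaches it unchanged.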

Comment on lines 663 to 665
        OptionalState.update(self.driver)
        if self.driver is not None
        else OptionalState.nop()
Collaborator

Could you take a look, @jopemachine? We have been using GraphQL's Undefined to decide the nop case.

@github-actions bot added the size:M (30~100 LoC) label and removed the size:L (100~500 LoC) label Jan 7, 2026

Labels

comp:manager (Related to Manager component), size:M (30~100 LoC)

Development

Successfully merging this pull request may close these issues.

Integrate ModifyScalingGroup Action with existing GraphQL implementation

3 participants