@kaxil kaxil commented Jan 19, 2026

Remove server-side computation of upstream_map_indexes from the Execution API. The Task SDK now computes these values locally when resolving XCom arguments for mapped task groups.

Previously, every /run request loaded the full SerializedDAG (5-50 MB) solely to compute upstream_map_indexes for XCom resolution in mapped task groups, which contributed to memory pressure on the API server.

Now the Task SDK computes these values lazily when resolving XCom arguments, using the existing GetTICount API to query task instance counts.
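For readers unfamiliar with the pattern, here is a minimal sketch of the lazy, client-side computation described above. It is not the code from this PR: `LazyUpstreamMapIndex`, `group_length`, and `fake_count_tis` are hypothetical names, the callable merely stands in for a GetTICount query, and the index arithmetic is an illustrative assumption about how a task instance's map index could be projected onto its enclosing mapped task group.

```python
from typing import Callable, Optional


class LazyUpstreamMapIndex:
    """Resolve an upstream map index only when it is first requested.

    Hypothetical illustration; the names and the arithmetic are assumptions.
    """

    def __init__(
        self,
        map_index: int,
        group_length: int,
        count_tis: Callable[[], int],
    ) -> None:
        self._map_index = map_index        # map index of the current task instance
        self._group_length = group_length  # number of expansions of the shared mapped task group
        self._count_tis = count_tis        # stand-in for a GetTICount-style query
        self._resolved: Optional[int] = None

    def resolve(self) -> int:
        # Query the task instance count on first use only, then reuse the cached result.
        if self._resolved is None:
            ti_count = self._count_tis()
            if ti_count <= 0:
                raise ValueError("task instance count must be positive")
            # Assumed projection: scale the map index down to the group's expansion index.
            self._resolved = self._map_index * self._group_length // ti_count
        return self._resolved


calls = []


def fake_count_tis() -> int:
    # Records each simulated query so the laziness is observable.
    calls.append("GetTICount")
    return 6


lazy = LazyUpstreamMapIndex(map_index=4, group_length=3, count_tis=fake_count_tis)
calls_before_resolve = len(calls)  # nothing has been queried yet
first = lazy.resolve()
second = lazy.resolve()  # served from the cached value, no second query
```

In this sketch no query is issued when the object is constructed, and repeated calls to `resolve()` trigger a single count lookup, which is the property that lets the server skip the DAG load up front.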

Screenshots from tests: (three images, not reproduced here)
Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)


@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:task-sdk labels Jan 19, 2026
@kaxil kaxil changed the title Move upstream_map_indexes computation to Task SDK Reduce API server memory usage by eliminating SerializedDAG loads on task start Jan 19, 2026
@kaxil kaxil marked this pull request as ready for review January 19, 2026 22:42
@kaxil kaxil requested a review from jedcunningham January 19, 2026 22:42
@kaxil kaxil added the full tests needed We need to run full set of tests for this PR to merge label Jan 19, 2026
@kaxil kaxil force-pushed the move-upstream-map-indexes-to-sdk branch from 869eb6a to 804d0a7 Compare January 19, 2026 23:03
@kaxil kaxil marked this pull request as draft January 19, 2026 23:25
@kaxil kaxil force-pushed the move-upstream-map-indexes-to-sdk branch from 804d0a7 to fa76ca6 Compare January 19, 2026 23:51
@kaxil kaxil marked this pull request as ready for review January 19, 2026 23:52
…n task start

Remove server-side computation of upstream_map_indexes from the Execution
API. The Task SDK now computes these values locally when resolving XCom
arguments for mapped task groups.

This eliminates the need to load SerializedDAG on the API server for the
/run endpoint, reducing memory usage and improving performance.

Changes:
- Add task_mapping.py with SDK-side computation logic
- Remove upstream_map_indexes field from TIRunContext
- Add Cadwyn migration for backward compatibility with older SDKs
- Update xcom_arg.py and expandinput.py for lazy computation

@amoghrajesh amoghrajesh left a comment


Makes sense. Didn't realise we had a todo in there

@kaxil kaxil merged commit c441e4b into apache:main Jan 20, 2026
128 checks passed
@kaxil kaxil deleted the move-upstream-map-indexes-to-sdk branch January 20, 2026 13:24
