@juliasloan25 juliasloan25 commented Dec 6, 2025

Purpose

We commonly use the pattern @. field = ifelse(area_fraction ≈ 0, zero(field), field). I was under the impression that zero(field) would be equivalent to zero(eltype(field)) when broadcast, but this isn't the case. See the allocation and timing information below.
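For readers less familiar with the two calls, here is a minimal sketch of what they mean outside of a broadcast. It assumes plain Vectors as stand-ins for the fields in this PR (the names x, a, and y are illustrative only), so the allocation numbers reported below will not carry over exactly.

```julia
# Assumption: plain Vectors stand in for the PR's fields.
x = [1.5, 2.5, 3.5]
a = [0.0, 1.0, 0.0]   # plays the role of area_fraction

zeros_like_x = zero(x)          # a new array of zeros with the same shape as x
scalar_zero = zero(eltype(x))   # a single scalar, 0.0

# Under @., every call is dotted, so zero(x) is applied elementwise as zero.(x).
y = similar(x)
@. y = ifelse(a == 0, zero(x), x)
```

On plain arrays both spellings give the same values; the difference this PR is about is how much work each one does to get there.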

Content

  • replace zero(field) with zero(FT) in all broadcasted ifelse calls
  • some other misc. cleanup of ifelse calls

Timing comparison

Despite having the most allocations, zero(FT) is the fastest to run, since the type of the zero doesn't need to be promoted (as it does for e.g. 0).

julia> @btime @. field1 = ifelse(area_fraction == 0, zero(field1), field1)
  199.083 μs (3 allocations: 128 bytes)

julia> @btime @. field1 = ifelse(area_fraction == 0, zero(eltype(field1)), field1)
  198.125 μs (4 allocations: 160 bytes)

julia> @btime @. field1 = ifelse(area_fraction == 0, 0, field1);
  222.667 μs (2 allocations: 96 bytes)

julia> @btime @. field1 = ifelse(area_fraction == 0, zero(FT), field1)
  174.708 μs (8 allocations: 288 bytes)
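The promotion mentioned above can be seen on scalars. This is a sketch under the assumption that FT is Float32 and that plain Vectors stand in for the fields; it only illustrates the types involved, not the timings.

```julia
# Assumption: FT is Float32 here; the PR's FT may differ.
FT = Float32
x = FT(1.5)

# ifelse returns the chosen argument unchanged, so a literal 0 stays an Int.
picked_int = ifelse(true, 0, x)
picked_ft = ifelse(true, zero(FT), x)

# Writing into an FT array still works with the Int literal,
# but each selected 0 has to be converted to FT when it is stored.
field1 = FT[1.5, 2.5, 3.5]
area_fraction = FT[0, 1, 0]
@. field1 = ifelse(area_fraction == 0, 0, field1)
```

With zero(FT), both branches of the ifelse already have the same type, so no conversion is needed on store.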

Allocation comparison

julia> @allocated @. field1 = ifelse(area_fraction == 0, zero(field1), field1)
128

julia> @allocated @. field1 = ifelse(area_fraction == 0, zero(eltype(field1)), field1)
160

julia> @allocated @. field1 = ifelse(area_fraction == 0, 0, field1)
96

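The allocations are even worse when the expression isn't fully fused with @., which we were doing in a few places. In that form zero(field1) is an ordinary call that runs before the broadcast, so it presumably builds a whole temporary field of zeros: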

julia> @allocated field1 .= ifelse.(area_fraction .== 0, zero(field1), field1)
6368

julia> @allocated field1 .= ifelse.(area_fraction .== 0, zero(eltype(field1)), field1)
96

julia> @allocated field1 .= ifelse.(area_fraction .== 0, 0, field1)
96
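The same effect can be reproduced on plain arrays. The sketch below assumes Vectors in place of the fields, and the helper names mask_eager! and mask_scalar! are hypothetical; absolute byte counts will differ from the numbers above.

```julia
# Assumption: plain Vectors stand in for the PR's fields.
FT = Float64
field1 = rand(FT, 10_000)
area_fraction = rand(FT, 10_000)
area_fraction[1:2:end] .= 0

# Not fully fused: zero(x) is a normal call that allocates a full array of zeros.
mask_eager!(x, a) = (x .= ifelse.(a .== 0, zero(x), x); nothing)

# Scalar zero: nothing has to be materialized before the broadcast runs.
mask_scalar!(x, a) = (x .= ifelse.(a .== 0, zero(eltype(x)), x); nothing)

# Call each once first so compilation is not counted.
mask_eager!(field1, area_fraction)
mask_scalar!(field1, area_fraction)

bytes_eager = @allocated mask_eager!(field1, area_fraction)
bytes_scalar = @allocated mask_scalar!(field1, area_fraction)
```

Here bytes_eager is at least the size of the temporary array, while bytes_scalar stays small.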

  • I have read and checked the items on the review checklist.

@juliasloan25
Member Author

cc @petebachant @szy21 @kmdeck
We use this pattern a lot in the integrated land code, so I expect this will speed up that model. It was used in source too so the performance of everything should get better. We'll see by how much :)

Comment on lines 610 to 611
@. csf.scalar_temp1 = ifelse(area_fraction == 0, 0, csf.scalar_temp1)
@. csf.scalar_temp2 = ifelse(area_fraction == 0, 0, csf.scalar_temp2)
Member

I am not sure if this matters, but do we need to use something like zero(FT) here?

Member Author

@juliasloan25 juliasloan25 Dec 6, 2025

zero(FT) can be slightly more type-stable and doesn't require type promotion of the Int to FT. I'll switch to that.
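As a rough sketch of what the switch could look like at a call site, assuming plain arrays and a hypothetical helper name (mask_zero_area! is not from this PR), the scalar zero can be taken from the field's own element type:

```julia
# Hypothetical helper; the name and signature are illustrative, not from the PR.
function mask_zero_area!(field::AbstractArray, area_fraction::AbstractArray)
    FT = eltype(field)
    z = zero(FT)   # scalar of the field's own float type, computed once
    @. field = ifelse(area_fraction == 0, z, field)
    return field
end

field = Float32[1, 2, 3]
area_fraction = Float32[0, 0.5, 0]
mask_zero_area!(field, area_fraction)
```

Both branches of the ifelse then have type FT, so the result never passes through an Int.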

Member Author

I also profiled the different options and saw that zero(FT) is faster even though it has more allocations, probably because when we use 0 the type has to be promoted.

julia> @btime @. field1 = ifelse(area_fraction == 0, zero(field1), field1)
  199.083 μs (3 allocations: 128 bytes)

julia> @btime @. field1 = ifelse(area_fraction == 0, zero(eltype(field1)), field1)
  198.125 μs (4 allocations: 160 bytes)

julia> @btime @. field1 = ifelse(area_fraction == 0, 0, field1);
  222.667 μs (2 allocations: 96 bytes)

julia> @btime @. field1 = ifelse(area_fraction == 0, zero(FT), field1)
  174.708 μs (8 allocations: 288 bytes)

@juliasloan25 changed the title from "use 0 instead of zero(field) in broadcasts" to "use zero(FT) instead of zero(field) in broadcasts" on Dec 8, 2025