@lukekim lukekim commented Jan 29, 2026

Adds support for the SQL CASE WHEN expression to the Vortex expression system, including its conversion from DataFusion, pushdown logic, and benchmarks. The focus is on enabling CASE WHEN expressions to be parsed, converted, and benchmarked, while ensuring that only supported forms are handled.

Support for CASE WHEN expressions:

  • Added a new module case_when to vortex-array's expression system and re-exported its functions, enabling construction and evaluation of CASE WHEN and nested CASE WHEN expressions. (vortex-array/src/expr/exprs/mod.rs)
  • Registered the new CaseWhen expression in the ExprSession so it can be used in expression evaluation. (vortex-array/src/expr/session.rs)
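
To illustrate the semantics a nested binary CASE WHEN is expected to have, here is a minimal standalone sketch. It does not use the vortex-array API: the `Expr` enum, `case_when` constructor, and `evaluate` function below are hypothetical stand-ins written for this illustration only. It assumes standard SQL behaviour, where a NULL condition is treated as not matching and a missing ELSE yields NULL.

```rust
/// Hypothetical stand-in for an expression tree; NOT the vortex-array types.
#[derive(Debug, Clone)]
enum Expr {
    /// A materialized column of nullable integers.
    Column(Vec<Option<i64>>),
    /// A constant broadcast to every row.
    Literal(i64),
    /// Binary form: CASE WHEN cond THEN then [ELSE otherwise] END.
    /// The condition is kept as an already-evaluated nullable boolean column.
    CaseWhen {
        cond: Vec<Option<bool>>,
        then: Box<Expr>,
        otherwise: Option<Box<Expr>>,
    },
}

/// Hypothetical constructor; chaining it through `otherwise` gives a nested CASE WHEN.
fn case_when(cond: Vec<Option<bool>>, then: Expr, otherwise: Option<Expr>) -> Expr {
    Expr::CaseWhen {
        cond,
        then: Box::new(then),
        otherwise: otherwise.map(Box::new),
    }
}

/// Evaluates `expr` over `len` rows. Columns and conditions are assumed to have `len` entries.
fn evaluate(expr: &Expr, len: usize) -> Vec<Option<i64>> {
    match expr {
        Expr::Column(values) => values.clone(),
        Expr::Literal(value) => vec![Some(*value); len],
        Expr::CaseWhen { cond, then, otherwise } => {
            let then_values = evaluate(then, len);
            // A missing ELSE branch produces NULL for every non-matching row.
            let else_values = match otherwise {
                Some(e) => evaluate(e, len),
                None => vec![None; len],
            };
            cond.iter()
                .zip(then_values)
                .zip(else_values)
                // Only a definite `true` selects THEN; false and NULL fall through to ELSE.
                .map(|((c, t), e)| if *c == Some(true) { t } else { e })
                .collect()
        }
    }
}
```

In this sketch, a two-branch CASE is built by passing an inner `case_when(..)` as the `otherwise` argument of the outer one, which is the nesting shape the description refers to.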

DataFusion integration and conversion:

  • Implemented conversion from DataFusion's CaseExpr to Vortex's nested case_when expressions, with validation that accepts only the searched CASE form (CASE WHEN cond THEN ...) and rejects the simple CASE form (CASE expr WHEN value THEN ...). (vortex-datafusion/src/convert/exprs.rs)
  • Updated the pushdown logic to recognize and validate CASE WHEN expressions, including recursive checks that the when/then sub-expressions and the else clause are themselves convertible. (vortex-datafusion/src/convert/exprs.rs)
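
The conversion and pushdown check described above can be sketched roughly as follows. This is a standalone illustration, not the code in vortex-datafusion: `Source`, `Target`, `can_push_down`, and `convert` are hypothetical names, and `Source::Case` only loosely mirrors the shape of a CASE expression (an optional operand, a list of when/then pairs, and an optional else). The sketch folds the when/then pairs from right to left so that each pair becomes a binary CASE WHEN whose else branch is the rest of the chain.

```rust
/// Hypothetical source-side expression; NOT the DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum Source {
    Column(String),
    Literal(i64),
    /// Stands in for any expression the converter does not understand.
    Unsupported(String),
    Case {
        /// `Some` means simple CASE (`CASE x WHEN ...`); `None` means searched CASE.
        operand: Option<Box<Source>>,
        when_then: Vec<(Source, Source)>,
        else_expr: Option<Box<Source>>,
    },
}

/// Hypothetical target-side expression with a binary CASE WHEN node.
#[derive(Debug, Clone, PartialEq)]
enum Target {
    Column(String),
    Literal(i64),
    CaseWhen {
        cond: Box<Target>,
        then: Box<Target>,
        otherwise: Option<Box<Target>>,
    },
}

/// Recursive pushdown check: every sub-expression must itself be convertible.
fn can_push_down(expr: &Source) -> bool {
    match expr {
        Source::Column(_) | Source::Literal(_) => true,
        Source::Unsupported(_) => false,
        Source::Case { operand, when_then, else_expr } => {
            operand.is_none()
                && !when_then.is_empty()
                && when_then.iter().all(|(w, t)| can_push_down(w) && can_push_down(t))
                && else_expr.as_deref().map_or(true, can_push_down)
        }
    }
}

/// Converts a searched CASE into nested binary CASE WHEN nodes.
fn convert(expr: &Source) -> Result<Target, String> {
    match expr {
        Source::Column(name) => Ok(Target::Column(name.clone())),
        Source::Literal(value) => Ok(Target::Literal(*value)),
        Source::Unsupported(name) => Err(format!("unsupported expression: {name}")),
        Source::Case { operand, when_then, else_expr } => {
            if operand.is_some() {
                return Err("simple CASE is not supported".to_string());
            }
            // Start from the ELSE branch (if any) and wrap it pair by pair, last pair first.
            let mut acc = match else_expr {
                Some(e) => Some(convert(e)?),
                None => None,
            };
            for (when, then) in when_then.iter().rev() {
                acc = Some(Target::CaseWhen {
                    cond: Box::new(convert(when)?),
                    then: Box::new(convert(then)?),
                    otherwise: acc.map(Box::new),
                });
            }
            acc.ok_or_else(|| "CASE needs at least one WHEN/THEN pair".to_string())
        }
    }
}
```

Keeping the pushdown check separate from the conversion lets a planner ask whether an expression is convertible without building the converted tree; whether the PR structures it this way is not stated here.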

Benchmarks and protocol updates:

  • Added a new benchmark suite for CASE WHEN expressions, covering simple, nested, all-true, and all-false scenarios with varying array sizes. (vortex-array/benches/expr/case_when_bench.rs, vortex-array/Cargo.toml)
  • Extended the protocol buffer definitions to include options for CASE WHEN expressions, specifying the number of when/then pairs and presence of an else clause. (vortex-proto/proto/expr.proto)
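
As a rough illustration of why the options carry a pair count and an else flag, the sketch below shows how a flat list of child expressions could be split back into when/then pairs and an optional else. Everything here is an assumption made for the example: the struct name `CaseWhenOptions`, its field names, and the child layout `[when0, then0, when1, then1, ..., else?]` are hypothetical and are not taken from expr.proto.

```rust
/// Hypothetical mirror of the CASE WHEN options; field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CaseWhenOptions {
    num_when_then_pairs: u32,
    has_else: bool,
}

/// Number of children implied by the options under the assumed flat layout.
fn expected_children(opts: CaseWhenOptions) -> usize {
    2 * opts.num_when_then_pairs as usize + usize::from(opts.has_else)
}

/// Splits a flat child list into (when, then) pairs plus an optional else child.
/// Returns `None` when the child count does not match the options.
fn split_children<T>(
    children: &[T],
    opts: CaseWhenOptions,
) -> Option<(Vec<(&T, &T)>, Option<&T>)> {
    if children.len() != expected_children(opts) {
        return None;
    }
    let pair_len = 2 * opts.num_when_then_pairs as usize;
    // Walk the leading children two at a time: even index is WHEN, odd index is THEN.
    let pairs: Vec<(&T, &T)> = children[..pair_len]
        .chunks_exact(2)
        .map(|pair| (&pair[0], &pair[1]))
        .collect();
    // With an else clause present, it is assumed to be the final child.
    let else_child = if opts.has_else { children.last() } else { None };
    Some((pairs, else_child))
}
```

Validating the child count against the options up front means a malformed message is rejected before any branch is interpreted.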

Bench:

Timer precision: 16 ns
expr_case_when                    fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ case_when_all_false                          │               │               │               │         │
│  ├─ 1000                        3.183 µs      │ 1.572 ms      │ 3.295 µs      │ 19.01 µs      │ 100     │ 100
│  ├─ 10000                       4.047 µs      │ 5.311 µs      │ 4.175 µs      │ 4.192 µs      │ 100     │ 100
│  ╰─ 100000                      12.36 µs      │ 17.1 µs       │ 12.52 µs      │ 12.6 µs       │ 100     │ 100
├─ case_when_all_true                           │               │               │               │         │
│  ├─ 1000                        3.167 µs      │ 4.047 µs      │ 3.311 µs      │ 3.324 µs      │ 100     │ 100
│  ├─ 10000                       4.095 µs      │ 7.407 µs      │ 4.191 µs      │ 4.234 µs      │ 100     │ 100
│  ╰─ 100000                      12.36 µs      │ 14.43 µs      │ 12.49 µs      │ 12.52 µs      │ 100     │ 100
├─ case_when_nested_3_conditions                │               │               │               │         │
│  ├─ 1000                        13.53 µs      │ 159.4 µs      │ 13.75 µs      │ 15.28 µs      │ 100     │ 100
│  ├─ 10000                       18.41 µs      │ 21.19 µs      │ 18.75 µs      │ 18.78 µs      │ 100     │ 100
│  ╰─ 100000                      203.6 µs      │ 424.2 µs      │ 236.5 µs      │ 252.2 µs      │ 100     │ 100
╰─ case_when_simple                             │               │               │               │         │
   ├─ 1000                        4.591 µs      │ 6.991 µs      │ 4.735 µs      │ 4.764 µs      │ 100     │ 100
   ├─ 10000                       6.415 µs      │ 9.471 µs      │ 6.527 µs      │ 6.567 µs      │ 100     │ 100
   ╰─ 100000                      147.5 µs      │ 184.5 µs      │ 153.3 µs      │ 153.5 µs      │ 100     │ 100

…13)

* feat: implement binary CASE WHEN expression with support for nested conditions
@AdamGS added the action/benchmark-sql label (Trigger SQL benchmarks to run on this PR) Jan 29, 2026
@joseph-isaacs added the action/benchmark label (Trigger full benchmarks to run on this PR) and removed the action/benchmark-sql label (Trigger SQL benchmarks to run on this PR) Jan 30, 2026
@joseph-isaacs (Contributor) commented Jan 30, 2026

Sorry, we just merged a breaking change: #6081. We will update this PR very soon with a migration.
