Description
What happened?
When the latest commit of vortex is registered as a file format with datafusion, it appears to be impossible to cast column types directly in a select statement.
The statement SELECT cast(id AS STRING) FROM 'data.vortex' fails with the following error:
External error: Failed to read Vortex file: home/andreas/Projects/wql/data.vortex:
No compute kernel to cast array vortex.primitive with dtype u32 to utf8?
It happens with i64 and u32 columns, so presumably with all other integer types as well.
This seems to be a regression. The expected behaviour is that, if there is no built-in way to cast the column while scanning the file, datafusion casts the data after it has been scanned.
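To make the expected behaviour concrete, here is a minimal std-only sketch of the fallback I have in mind. Everything in it (`DType`, `Column`, `reader_cast`, `engine_cast`, `scan_with_cast`) is a hypothetical stand-in and not the actual Vortex or DataFusion API; it only models the idea that a missing cast kernel in the reader should lead to a cast after the scan rather than an error.

```rust
/// Hypothetical target types; NOT the real Vortex or DataFusion dtypes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DType {
    U32,
    I64,
    Utf8,
}

/// Hypothetical in-memory column; stands in for a scanned array.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    U32(Vec<u32>),
    I64(Vec<i64>),
    Utf8(Vec<String>),
}

/// Models the casts the file reader can do while scanning.
/// Returns `None` when it has no kernel for the requested cast.
fn reader_cast(col: &Column, target: DType) -> Option<Column> {
    match (col, target) {
        // Widening integer cast: assumed to be supported by the reader.
        (Column::U32(v), DType::I64) => {
            Some(Column::I64(v.iter().map(|&x| i64::from(x)).collect()))
        }
        // Identity casts are trivially supported.
        (Column::U32(_), DType::U32)
        | (Column::I64(_), DType::I64)
        | (Column::Utf8(_), DType::Utf8) => Some(col.clone()),
        // Anything else (for example integer to string): no kernel.
        _ => None,
    }
}

/// Models the query engine's cast, applied to already scanned data.
/// In this sketch it only knows how to turn integers into strings.
fn engine_cast(col: &Column, target: DType) -> Result<Column, String> {
    match (col, target) {
        (Column::U32(v), DType::Utf8) => {
            Ok(Column::Utf8(v.iter().map(|x| x.to_string()).collect()))
        }
        (Column::I64(v), DType::Utf8) => {
            Ok(Column::Utf8(v.iter().map(|x| x.to_string()).collect()))
        }
        _ => Err(format!("unsupported cast to {:?}", target)),
    }
}

/// Expected behaviour: try the cast inside the reader first, and fall
/// back to the engine's cast instead of failing when no kernel exists.
fn scan_with_cast(col: &Column, target: DType) -> Result<Column, String> {
    match reader_cast(col, target) {
        Some(casted) => Ok(casted),
        None => engine_cast(col, target),
    }
}
```

With this shape, `scan_with_cast(&Column::U32(vec![1]), DType::Utf8)` succeeds through the fallback path, whereas the current behaviour corresponds to returning the reader's "no kernel" result directly as an error.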
Steps to reproduce
- Register vortex as a file format in datafusion:
let ctx = {
    let s = SessionStateBuilder::new()
        .with_file_formats(vec![
            Arc::new(vortex_datafusion::VortexFormatFactory::new()),
        ])
        .with_config(SessionConfig::from_env()?)
        .with_runtime_env(Default::default())
        .with_default_features()
        .build();
    SessionContext::new_with_state(s).enable_url_table()
};
- Find any file with an integer column or create a minimal one with:
ctx
    .sql(r#"copy (select 1 as id) to 'example.vortex'"#)
    .await
    .unwrap()
    .collect()
    .await
    .unwrap();
- Try to cast this column to string in a select statement:
ctx
    .sql(r#"select cast(id as string) from 'example.vortex'"#)
    .await
    .unwrap()
    .collect()
    .await
    .unwrap();
At this point the following error is shown:
thread 'main' (52987) panicked at src/main.rs:53:10:
called `Result::unwrap()` on an `Err` value: External(Failed to read Vortex file: home/andreas/Temp/vortex-bug/example.vortex:
No compute kernel to cast array vortex.primitive with dtype i64 to utf8?
Backtrace:
disabled backtrace)
Full trace can be seen here:
trace.txt
Environment
vortex-datafusion: commit #a4a2f0d7
datafusion: v52.1.0
os: Linux 6.18.4-zen1 (NixOS unstable)
This seems to be a regression specific to vortex-datafusion from the main branch combined with datafusion v52. It does not occur with vortex-datafusion v0.58.0 and datafusion v51.
Additional context
No response