What's Changed
Breaking Changes
- chore!: remove spark connect @universalmind303 (#5743)
Features
- feat: Improve model api typing @colin-ho (#5809)
- feat(deltalake): allow users to ignore deletion vectors on read @kevinzwang (#5758)
- feat: Capture UDF argument names and values in structured logging. @rohitkulshreshtha (#5771)
- feat: support split and merge jsonl/ndjson files @caican00 (#5695)
- feat(tools): add markdown-to-notebook converter for documentation @ykdojo (#5691)
- feat: support label_selector specification in ray actor/task creation @Jay-ju (#5042)
- feat: support native csv writer @stayrascal (#5706)
- feat: Update optional dependencies for daft[postgres] @desmondcheongzx (#5586)
- feat(lance): add distributed compaction for Lance @huleilei (#5699)
- feat: Use JSON Serialization for Plans in Subscribers @srilman (#5709)
- feat: extended hash function to take and hash multiple inputs @rahulkodali (#5692)
- feat: Add Apache Gravitino catalog in catalog module @shaofengshi (#5694)
- feat: Better errors when lazy imports fail @samstokes (#5753)
- feat: Added structured logging for UDF errors. @rohitkulshreshtha (#5688)
- feat: Add statistics info to ScanTask when reading Lance dataset @plotor (#5727)
- feat: dynamic batching per operator @universalmind303 (#5676)
- feat: Allow users to disable the suffix range request @TheR1sing3un (#5188)
- feat: Streaming sample by size @colin-ho (#5663)
- feat: Allow dashboard to show query canceled/failed/dead information when query exited abnormally @VOID001 (#5576)
- feat: Add Google AI provider with prompt @everettVT (#5640)
- feat: Add pow expression @kliwongan (#5237)
- feat: No truncate in `.collect` preview @colin-ho (#5632)
- feat: sample api supports precise sampling by size params @caican00 (#5600)
- feat: audio file subtype @universalmind303 (#5602)
- feat(tos): enhance the retry logic to aware response @stayrascal (#5569)
- feat: emit selectivity metric to OTel in swordfish filter op @samstokes (#5584)
- feat: Adding otel logger collector for collecting UDF errors. @rohitkulshreshtha (#5624)
Bug Fixes
- fix: support skip empty json/jsonl files @caican00 (#5660)
- fix: minor doc fix @yuchaoran2011 (#5814)
- fix: CountRows with Limit returns unexpected result when reading Lance dataset @plotor (#5550)
- fix: Check for missing dependencies in OpenAI provider @everettVT (#5747)
- fix: fix btree index invalid issue when reading lance for point lookup @caican00 (#5673)
- fix: Combine deltalake with unity extra @everettVT (#5785)
- fix: enhance unit tests @caican00 (#5787)
- fix: patch CVE-2025-66478 update next dependencies to 16.0.7 @everettVT (#5786)
- fix: use single consolidated progress bar in Jupyter notebooks @ykdojo (#5774)
- fix: CuPy → NumPy needs explicit conversion @Jay-ju (#5680)
- fix: Fix Pydantic cloudpickle serialization in Google Colab @ykdojo (#5705)
- fix: update AI integration tests for new Subscriber interface @ykdojo (#5763)
- fix(optimizer): Prevent limits from being pushed below explodes in non-top-level projections @desmondcheongzx (#5292)
- fix(io): load all splits in read_huggingface fallback path @ykdojo (#5757)
- fix(test): use read_huggingface instead of read_parquet for HF test @ykdojo (#5755)
- fix: add disk cleanup to nightly integration-test-io job @ykdojo (#5711)
- fix: Postgres overwrite table should enable RLS and set up pgvector automatically @desmondcheongzx (#5657)
- fix: make it easier to enable different logging levels @Abyss-lord (#5661)
- fix: Dashboard logo animation. @j3nkii (#5672)
- fix(ci): add disk cleanup to integration-test-ai job @ykdojo (#5733)
- fix: update hypothesis test to use new expression API @ykdojo (#5723)
- fix: Fix type annotation check on Python 3.14 @srilman (#5721)
- fix: add fallback mechanism for HuggingFace datasets without parquet files @ykdojo (#5650)
- fix: Import or skip lance @colin-ho (#5662)
- fix: Add missing trailing slashes to S3-compatible endpoint urls @desmondcheongzx (#5575)
- fix: Add outer try-finally block in executor generator @colin-ho (#5633)
- fix: test_explain @universalmind303 (#5656)
- fix: Unify the naming and type of URI parameter for Lance-related APIs @plotor (#5634)
- fix: Fix blocked and oom issues for scan lance @caican00 (#5592)
- fix: Executing `explain` will panic when ScanTask is empty @plotor (#5582)
- fix: Embed text dropping texts @colin-ho (#5641)
- fix: limit(n) return n rows directly @caican00 (#5597)
- fix: Upgrade to deltalake 1.2.1 @colin-ho (#5580)
- fix: add disk cleanup to integration-test-io-credentialed job @ykdojo (#5610)
- fix: add disk cleanup to doctests job @ykdojo (#5609)
- fix: Hashable identifier @colin-ho (#5598)
Performance
- perf: Lazy udf worker @colin-ho (#5542)
- perf: Use growable for build side @colin-ho (#5613)
- perf: optimize setting lance schema @Jay-ju (#5704)
Refactor
- refactor(arrow2): values_iter removals for primitive array @universalmind303 (#5802)
- refactor(arrow2): remove arrow2 from daft-functions-binary @universalmind303 (#5799)
- refactor(arrow2): remove deprecated usages from daft-functions-utf8 @universalmind303 (#5800)
- refactor(arrow2): remove deprecated methods from daft-functions-uri crate @universalmind303 (#5798)
- refactor(arrow2): remove arrow2 from `daft-functions-tokenize` @universalmind303 (#5797)
- refactor(arrow2): rename and deprecate `to_arrow` and `as_arrow` functions @universalmind303 (#5796)
- refactor: write empty dataframe to parquet/json files via native IO @stayrascal (#5682)
- refactor(arrow-rs): Move temporal conversions from arrow2 to arrow-rs @srilman (#5782)
- refactor(arrow-rs): Remove arrow2 Index generics usages @srilman (#5761)
- refactor(arrow2): rename and deprecate .to_arrow methods @universalmind303 (#5789)
- refactor(arrow): use arrow for ffi instead of arrow2 @universalmind303 (#5775)
- refactor(arrow): remove daft-arrow from daft-sql & daft-context crates @universalmind303 (#5773)
- refactor(arrow-rs): Move all validity `daft_arrow::bitmap::Bitmap`s to `daft_arrow::buffer::NullBuffer` @srilman (#5750)
- refactor: abstract MultipartWriter to write data to object store @stayrascal (#5702)
- refactor: Remove Unloaded MicroPartitions @srilman (#5710)
- refactor(arrow-rs): Add `daft-arrow` middleman crate for Rust & Arrow usage @srilman (#5730)
Documentation
- docs: Update slack invite @everettVT (#5813)
- docs: add logging settings @Jay-ju (#5671)
- docs: fix broken Bodo benchmark link @ykdojo (#5762)
- docs: add voice-analytics-example and update index @everettVT (#5737)
- docs: fix broken Lance documentation link @ykdojo (#5724)
- docs: remove redundant About Daft section from README @ykdojo (#5689)
- docs: remove redundant Table of Contents from README @ykdojo (#5684)
- docs: add Daft Cloud mentions to distributed execution docs @ykdojo (#5686)
- docs: fix quickstart connector links formatting @ykdojo (#5687)
- docs: update README to reflect AI/multimodal positioning @ykdojo (#5677)
- docs: Improve mkdocstrings template for Python examples rendering @ykdojo (#5642)
- docs: changed dev url to a live link to prevent 404 @j3nkii (#5669)
- docs: add Python version requirement to README @ykdojo (#5655)
- docs: update index overview page @ykdojo (#5627)
- docs: remove Python tabs from quickstart @ykdojo (#5626)
- docs: update contributor policy, add contributing section, remove old… @madvart (#5251)
- docs: add tip to find your dylib @universalmind303 (#5625)
- docs: add data persistence section to quickstart @ykdojo (#5607)
- docs: revamp quickstart with Amazon product dataset example @ykdojo (#5585)
Tests
CI
- ci: increase unit-test timeout to 75 minutes for macOS @ykdojo (#5731)
- ci: exclude Kaggle from link checker @ykdojo (#5725)
Maintenance
- chore(deps): bump the minor group across 1 directory with 45 updates @dependabot[bot] (#5734)
- chore: Provide query end state to `RuntimeStatsManager` on query end @colin-ho (#5791)
- chore: update uvlock to remove tensorflow @kevinzwang (#5780)
- chore: Codeowners @colin-ho (#5502)
- chore: Don't install pytorch in iceberg test docker compose @colin-ho (#5781)
- chore: remove arrow dep from common-image @universalmind303 (#5772)
- chore: Pin dependencies @colin-ho (#5667)
- chore!: remove spark connect @universalmind303 (#5743)
- chore: remove ir and proto crates @universalmind303 (#5742)
- chore(deps): bump actions/checkout from 5 to 6 in the all group @dependabot[bot] (#5713)
- chore(deps): bump ctor from 0.5.0 to 0.6.1 @dependabot[bot] (#5717)
- chore: Cleanup additional Ray runner artifacts @srilman (#5714)
- chore: Remove the old Ray Runner @srilman (#5375)
- chore: Remove expression namespaces @colin-ho (#5619)
- chore: Remove runner from context @colin-ho (#5628)
- chore: add deprecation to daft.udf @kevinzwang (#5665)
- chore(deps): bump the all group with 13 updates @dependabot[bot] (#5480)
- chore: update bug report template @universalmind303 (#5652)
- chore: remove checklist from pr template @universalmind303 (#5653)
- chore: Remove deprecated agg methods and series split @colin-ho (#5630)
- chore: remove broken docpublish job from build-docs workflow @ykdojo (#5631)
- chore: Hint users to use `.collect` when printing empty dataframes @colin-ho (#5616)
- chore: reduce binary size by feature flagging `derive(Debug)` @universalmind303 (#5622)
- chore: deprecate llm_generate function @universalmind303 (#5603)
- chore: reduce sdist size @universalmind303 (#5606)
Dependencies
4 changes
- chore(deps): bump the minor group across 1 directory with 45 updates @dependabot[bot] (#5734)
- chore(deps): bump actions/checkout from 5 to 6 in the all group @dependabot[bot] (#5713)
- chore(deps): bump ctor from 0.5.0 to 0.6.1 @dependabot[bot] (#5717)
- chore(deps): bump the all group with 13 updates @dependabot[bot] (#5480)
Full Changelog: v0.6.14...v0.6.15