0.76.0 (Draft)
Draft — GitHub Release
Changes
⚠️ Breaks
Move execute-parent kernels into session registry (#8482) @gatesn
break: add Validity::definitely_all_null() method (#8447) @joseph-isaacs
Make Mask execution strict about nullable booleans (#8121) @joseph-isaacs
Support reporting statistics in spark datasource (#8057) @robert3005
✨ Features
feat: add metadata bridge for Python CUDA export (#8604) @0ax1
feat(file): make Footer::new public (#8602) @tobias-fire
feat(array): push struct validity into children (#8589) @miniex
Support session-time source configuration in DataFusion, and clear up precedence between config sources (#8575) @AdamGS
feat(python): package CUDA as an optional extension (#8510) @0ax1
Add JSON to Parquet Variant conversion with shredding (#8391) @AdamGS
Numerical aggregate functions have an option to skip or include nans in calculation, skip by default (#8457) @robert3005
feat(vortex-geo): ST_Distance via the geo crate + Polygon type (#8497) @HarukiMoriarty
feat(vortex-datafusion): struct scalar conversion + extension-over-struct scan (#8453) @HarukiMoriarty
feat: DeltaScheme adjustable compression ratio threshold (#8461) @joseph-isaacs
Add Delta scheme to the integer compressor (unstable_encodings) (#8212) @joseph-isaacs
Push down some expressions to Dict layout reader’s cached values (#8341) @myrrc
Constant comparison and byte_length OnPair kernels (#8371) @myrrc
feat(vortex-geo): Arrow import/export for the native Point type (#8374) @HarukiMoriarty
🚀 Performance
perf(datafusion): push down list_length expression (#8600) @mhk197
perf: cache static identifiers to avoid per-call interner locks (#8614) @miniex
perf(duckdb): push down list length expressions (#8544) @mhk197
Spawn arrow conversion in jni bindings (#8595) @robert3005
Compute zone layout stats concurrently when writing (#8594) @robert3005
Report DuckDB max cardinality for exact scans (#8582) @gatesn
perf(scan): intra-file decode parallelism — sub-split large chunk spans (#8400) @lukekim
perf(alp): avoid per-combination allocations when searching exponents (#8565) @miniex
Optimize interleave boolean gather (#8350) @joseph-isaacs
perf: align buffer to Alignment::DEFAULT_ALIGNMENT = Alignment::new(256) (#8490) @joseph-isaacs
Faster take for runend array (#8228) @robert3005
perf: don’t use fill_null for executing an array into a mask (#8466) @joseph-isaacs
perf[gpu]: export arrow device validity on the gpu (#8440) @0ax1
perf[buffer]: iteration for fallible operations with validity (#8120) @joseph-isaacs
perf[gpu]: generate FixedSizeList offsets on device (#8458) @0ax1
perf(buffer): read byte-aligned bit operands directly as words (#8436) @miniex
🐛 Bug Fixes
20 changes
fix(vortex-bench): map gs:// scheme to gcs storage label (#8630) @brancz
Fix wasm32 build by gating MultiFileSession on non-wasm targets (#8612) @robert3005
Guard calling is_nan on scalar value on dtype being a float (#8593) @robert3005
vortex-datafusion: Pipe session through to converter (#8591) @brancz
Save children dtypes/length in ZstdBuffersMetadata (#8572) @myrrc
fix(array): don’t panic on unsupported arrow types (#8564) @miniex
fix(layout): don’t panic collecting an empty stream (#8472) @miniex
Correctly calculate FSST compressed output size (#8551) @robert3005
refactor[fsst]: take ArrayRef in fsst_compress, decide offset width upfront (#7900) @mprammer
Move optimizer kernels into a dedicated KernelSession (#8511) @gatesn
Fix is_zero() for FixedSizeList and Struct scalar types (#8500) @connortsui20
fix: preserve operand width in DecimalValue checked arithmetic (#7380) @abnobdoss
vortex-tui: query tab should work for non-*.vortex files (#8473) @a10y
Ignore PyO3 rustsec (#8381) @connortsui20
fix: vx_data_source_new_buffer sem-merge conflict (#8437) @joseph-isaacs
📖 Documentation
Expanding conceptual docs (and some other minor docs) (#8552) @AdamGS
docs: update doc links (#8375) @joseph-isaacs
🧰 Maintenance
49 changes
chore: forbid the locking Id constructors with a clippy lint (#8617) @miniex
Patches have correct dtype by construction instead of normalised during array construction (#8626) @robert3005
chore: Python CUDA bridge: CI and buffer handoff ABI (#8618) @0ax1
Lock file maintenance (#8616) @renovate[bot]
Polish treemap view in vortex explorer (#8613) @robert3005
Migrate WASM Arrow conversion to use VortexSession (#8619) @robert3005
Lock file maintenance (#8615) @renovate[bot]
Remove ArrayAccessor (#8603) @robert3005
Add benchmarks for decimal casting (#8569) @robert3005
Use AllNonDistinct in assert_arrays_eq and implement it for variant types (#8546) @robert3005
Refactor DictEncoder to not use ArrayAccessors (#7759) @robert3005
Remove LEGACY_SESSION from benchmarks and back compat tests (#8554) @robert3005
Bump codspeed to 5.0 (#8553) @robert3005
Remove usages of LEGACY_SESSION from tests (#8547) @robert3005
assert_arrays_eq and assert_nth_scalar require ExecutionCtx (#8509) @robert3005
Update Rust crate memmap2 to v0.9.11 [SECURITY] (#8545) @renovate[bot]
Update lance benchmark dependencies (major) (#8525) @renovate[bot]
Update actions/setup-java digest to ad2b381 (#8521) @renovate[bot]
Update anthropics/claude-code-action digest to 2fee155 (#8522) @renovate[bot]
Update taiki-e/install-action digest to 9e1e580 (#8523) @renovate[bot]
Update actions/checkout action to v7 (#8524) @renovate[bot]
Lock file maintenance JS lock file maintenance (#8536) @renovate[bot]
Lock file maintenance Rust lock file maintenance (#8537) @renovate[bot]
Make sure both pushdown features are always enabled for DF benchmarks (#8507) @AdamGS
Generalize SIMD take operations to Copy (#8496) @connortsui20
Change ALIGNMENT_TO_HOST_COPY to Alignment::HOST_COPY (#8488) @AdamGS
Touch up some of the interfaces and docs in vortex-datafusion (#8485) @AdamGS
Add benchmarks for take on runend array (#8469) @robert3005
Include vortex-compute in codspeed benchmarks (#8477) @robert3005
Upgrade pyO3 to 0.29.0 (#8462) @robert3005
Update dependency starlette to v1.3.1 [SECURITY] (#8435) @renovate[bot]
fmt: normalize_comments = true (#8428) @joseph-isaacs
Update taiki-e/install-action digest to 7a79fe8 (#8409) @renovate[bot]
Update gradle/actions digest to 3f131e8 (#8408) @renovate[bot]
Update anthropics/claude-code-action digest to d5726de (#8407) @renovate[bot]
Update dependency duckdb to v1.4.2 [SECURITY] (#8431) @renovate[bot]
Update dependency pip to v26.1.2 [SECURITY] (#8432) @renovate[bot]
Update plugin com.palantir.java-format to v2.93.0 (#8416) @renovate[bot]