
ref(vortex-io): unify cloud object storage api #8259

Open
m7kss1 wants to merge 7 commits into vortex-data:develop from m7kss1:feat-cloud-object-store

Conversation

@m7kss1
Contributor

@m7kss1 m7kss1 commented Jun 4, 2026

Summary

Cloud object store construction was duplicated across vortex-jni, vortex-python, and
vortex-duckdb, each with different scheme coverage and credential behavior.

New API

// Resolve any URL or path
let vxf = session.open_options().open_url("s3://bucket/key/file.vortex").await?;

// Typed result for callers that need the store directly
let (store, path) = FileLocation::resolve(url)?.into_remote()?;
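To illustrate the kind of dispatch that a call like `FileLocation::resolve` implies, here is a minimal standalone sketch using only the standard library. It is not the PR's implementation: the `Location` enum, the `resolve` function, the `REMOTE_SCHEMES` list, and the return shape of `into_remote` are all hypothetical names and assumptions chosen for illustration, and the real types in vortex-io may differ.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the PR's `FileLocation`; the real type may differ.
#[derive(Debug, PartialEq)]
pub enum Location {
    /// A file on the local filesystem.
    Local(PathBuf),
    /// A remote object: URL scheme, bucket (or container/host), and object key.
    Remote { scheme: String, bucket: String, key: String },
}

/// Schemes treated as remote in this sketch (an assumption, not the PR's list).
const REMOTE_SCHEMES: &[&str] = &["s3", "gs", "az", "http", "https"];

/// Classify an input string as a local path or a remote object location.
pub fn resolve(input: &str) -> Result<Location, String> {
    // No "://" separator: treat the input as a plain filesystem path.
    let Some((scheme, rest)) = input.split_once("://") else {
        return Ok(Location::Local(PathBuf::from(input)));
    };
    match scheme {
        "file" => Ok(Location::Local(PathBuf::from(rest))),
        s if REMOTE_SCHEMES.contains(&s) => {
            // Everything before the first '/' is the bucket; the rest is the key.
            let (bucket, key) = rest.split_once('/').unwrap_or((rest, ""));
            if bucket.is_empty() {
                return Err(format!("missing bucket or host in {input}"));
            }
            Ok(Location::Remote {
                scheme: s.to_string(),
                bucket: bucket.to_string(),
                key: key.to_string(),
            })
        }
        other => Err(format!("unsupported scheme: {other}")),
    }
}

impl Location {
    /// Return (store root URL, object key) for remote locations, or an error for local ones.
    pub fn into_remote(self) -> Result<(String, String), String> {
        match self {
            Location::Remote { scheme, bucket, key } => Ok((format!("{scheme}://{bucket}"), key)),
            Location::Local(path) => Err(format!("{} is a local path", path.display())),
        }
    }
}
```

In this sketch, `resolve("s3://bucket/key/file.vortex")?.into_remote()?` yields the store root and the key separately, which mirrors the `(store, path)` pair in the snippet above; a real implementation would return a configured object store client rather than a string.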

CLI: remote files now work out of the box:

$ AWS_REGION=us-east-1 vx query s3://bucket/hits.vortex --sql "select count(*) from hits"
$ vx tree gs://bucket/hits.vortex
$ vx browse az://bucket/hits.vortex

Closes: #000

Testing

AI disclosure: XXX

@m7kss1 m7kss1 force-pushed the feat-cloud-object-store branch from 22a64e7 to bd6efc0 Compare June 4, 2026 22:48
m7kss1 added 4 commits June 5, 2026 06:58
Consolidate cloud storage builders and resolvers from multiple integrations
(JNI, DuckDB, datafusion-bench) into a single canonical core module.

New:
- vortex-io/src/object_store/{cloud,registry}.rs with Registry and resolve_url
- VortexOpenOptions::open_url dispatcher for CLI and library users
- 'cloud' umbrella feature gating S3, GCS, Azure, HTTP support

Migrate:
- JNI, DuckDB, datafusion-bench to use core make_object_store
- CLI commands to accept remote URLs (s3://, gs://, az://)

Improvements:
- Consistent credential/endpoint handling across integrations
- First-class remote file support in vortex query/tree/browse/segments
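The commit message names a `Registry` but does not show how it works. One common shape for such a component is a cache keyed by scheme and bucket, so that repeated opens against the same bucket reuse a single configured client. The sketch below assumes that design; `Store`, `Registry::get_or_create`, and the key layout are hypothetical and are not taken from the PR.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical placeholder for a configured object store client.
#[derive(Debug)]
pub struct Store {
    pub root: String,
}

/// Caches one store per (scheme, bucket) so repeated opens reuse a client.
#[derive(Default)]
pub struct Registry {
    stores: Mutex<HashMap<(String, String), Arc<Store>>>,
}

impl Registry {
    /// Return the cached store for this scheme and bucket, building it on first use.
    pub fn get_or_create(&self, scheme: &str, bucket: &str) -> Arc<Store> {
        let mut stores = self.stores.lock().expect("registry lock poisoned");
        stores
            .entry((scheme.to_string(), bucket.to_string()))
            .or_insert_with(|| {
                // A real builder would read credentials and endpoint settings here.
                Arc::new(Store { root: format!("{scheme}://{bucket}") })
            })
            .clone()
    }
}
```

Keeping the construction step in one place like this is what would let integrations share the same credential and endpoint handling instead of each building its own client.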

Signed-off-by: Maxim Dergousov <dergousovmaxim99@gmail.com>
Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
@m7kss1 m7kss1 force-pushed the feat-cloud-object-store branch from bd6efc0 to 70f771d Compare June 5, 2026 06:59
m7kss1 added 2 commits June 5, 2026 07:06
Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
@m7kss1 m7kss1 changed the title [WIP] ref(vortex-io): unify cloud object storage api ref(vortex-io): unify cloud object storage api Jun 5, 2026
@m7kss1
Contributor Author

m7kss1 commented Jun 5, 2026

Note: verified only on S3-compatible storage; Azure and GCS remain untested for now.

@AdamGS AdamGS self-assigned this Jun 5, 2026
@AdamGS
Contributor

AdamGS commented Jun 5, 2026

Hi @m7kss1, thanks for your PR! I'll review it next week.

Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
@m7kss1 m7kss1 requested a review from a team June 5, 2026 18:15
2 participants