feat: auto-detect input format from file extension with -I override #164
Merged
- Add input_format_explicit flag to track when -I is explicitly set
- When -I is set, it overrides file extension auto-detection for all files
- When -I is not set, auto-detect from .csv/.tsv/.json/.ndjson/.xml extensions
- Ambiguous extensions (.txt, .dat) default to CSV
- Stdin always uses -I value (no filename to inspect)
- Add 8 integration tests (157a-157h) covering auto-detection and override
- Update fixture test 14 to use file auto-detection instead of stdin + -I
- Document auto-detection and -I override in README, man page, and --help

Closes #158
…hecks

Compute effective_input_format from per-file auto-detection when a file argument is present, and use it for:
- --columns, --validate, --sample mode dispatch (Issue A)
- --json-path validation (Issue B)
- --xml-root/--xml-row name validation (Issue C)

Previously these paths used the global input_format (default CSV or explicit -I value), causing auto-detected .tsv/.json/.xml files to be parsed as CSV in special modes and valid --json-path invocations to be rejected.
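The fix described in this commit can be illustrated with a small Python model. This is a sketch under stated assumptions, not the project's Zig code: the function names, the string format tags, and the exact validation rules (for example, that --json-path is accepted only for json input) are hypothetical and may differ from what src/args.zig actually does.

```python
from pathlib import PurePath
from typing import List, Optional

# Extension list taken from this PR; the string tags are illustrative.
EXTENSION_FORMATS = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".json": "json",
    ".ndjson": "ndjson",
    ".xml": "xml",
}


def effective_input_format(global_format: str, explicit: bool,
                           file_path: Optional[str]) -> str:
    """Format that special modes and option checks should consult."""
    # An explicit -I wins; stdin (no path) has no extension to inspect.
    if explicit or file_path is None:
        return global_format
    # Unrecognized extensions fall back to CSV.
    return EXTENSION_FORMATS.get(PurePath(file_path).suffix, "csv")


def check_format_options(fmt: str,
                         json_path: Optional[str] = None,
                         xml_root: Optional[str] = None,
                         xml_row: Optional[str] = None) -> List[str]:
    """Return error messages for options that do not match the format.

    Assumption: --json-path requires json input and --xml-root/--xml-row
    require xml input. The real rules may be broader.
    """
    errors = []
    if json_path is not None and fmt != "json":
        errors.append("--json-path requires JSON input")
    if (xml_root is not None or xml_row is not None) and fmt != "xml":
        errors.append("--xml-root/--xml-row require XML input")
    return errors


# Before the fix: checks ran against the global default (csv) and
# rejected a valid invocation on an auto-detected .json file.
old_errors = check_format_options("csv", json_path="items")

# After the fix: checks run against the per-file effective format.
fmt = effective_input_format("csv", explicit=False, file_path="data.json")
new_errors = check_format_options(fmt, json_path="items")
```

Here "items" is only a placeholder value for --json-path. The point of the sketch is that validation and mode dispatch both read the same effective format, so they cannot disagree with what the parser will actually use.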
Closes #158
What
When a file argument has a recognizable extension (.csv, .tsv, .json, .ndjson, .xml), the input format is auto-detected, so no -I flag is needed. The -I flag still works as an explicit override for all files (e.g. when a TSV file has a .txt extension).
Changes
- src/args.zig: Track whether -I/--input-format was explicitly set (input_format_explicit). When set, it takes precedence over extension-based detection for all file arguments. When not set, each file's extension is inspected via InputFormat.fromExtension(), falling back to CSV for unrecognized extensions.
- build.zig: 8 new integration tests (157a–157h) covering auto-detection for JSON/NDJSON/XML, -I override (short, long, = syntax), and ambiguous extensions (.txt); fixture test 14 updated to use auto-detection instead of piping.
- docs/sql-pipe.1.scd: Document auto-detection behavior, -I/-O options, and new examples.
- README.md: Add JSON auto-detection example, -I override example, update options table and limitations section.
Acceptance Criteria
- .csv, .tsv, .json, .ndjson, .xml auto-set input format
- -I flag still works as explicit override
- Ambiguous extensions (.txt, .dat) default to CSV
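
The precedence rule and the three override spellings can be modeled in a few lines of Python. This is a sketch of the described behavior, not the code in src/args.zig: parse_args and format_for are hypothetical names, the exact option spellings (-I VALUE, --input-format VALUE, --input-format=VALUE) are an assumption based on the test description, and every other argument is treated as a file, which ignores the real CLI's remaining arguments.

```python
from pathlib import PurePath
from typing import List, Optional, Tuple

# Extension list taken from this PR; the string tags are illustrative.
EXTENSION_FORMATS = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".json": "json",
    ".ndjson": "ndjson",
    ".xml": "xml",
}


def parse_args(argv: List[str]) -> Tuple[str, bool, List[str]]:
    """Return (input_format, input_format_explicit, files)."""
    input_format, explicit, files = "csv", False, []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ("-I", "--input-format"):
            if i + 1 >= len(argv):
                raise ValueError(f"{arg} requires a value")
            # Record that the user chose the format themselves.
            input_format, explicit = argv[i + 1], True
            i += 2
        elif arg.startswith("--input-format="):
            input_format, explicit = arg.split("=", 1)[1], True
            i += 1
        else:
            files.append(arg)  # simplification: anything else is a file
            i += 1
    return input_format, explicit, files


def format_for(path: Optional[str], input_format: str, explicit: bool) -> str:
    """Resolve the format for one input; path=None stands for stdin."""
    if explicit or path is None:
        return input_format  # explicit -I, or stdin with nothing to inspect
    return EXTENSION_FORMATS.get(PurePath(path).suffix, "csv")


def resolve(argv: List[str]) -> List[Tuple[str, str]]:
    """Pair every file argument with the format it would be read as."""
    input_format, explicit, files = parse_args(argv)
    return [(f, format_for(f, input_format, explicit)) for f in files]
```

Keeping the explicit flag separate from the format value is what lets an explicit -I csv override a .json extension, a case the default value alone could not distinguish from "no flag given".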