feat(waterdata): get_queryables + queryables monitor + passthrough enablement #333
Draft
thodson-usgs wants to merge 2 commits into
Add `waterdata.get_queryables(collection)`, returning the OGC queryable properties of a Water Data collection (`daily`, `continuous`, `monitoring-locations`, ...) as a tidy `(DataFrame, BaseMetadata)`: one row per filterable property with its type, title, and description.

Add `tests/waterdata_queryables_test.py`: offline parsing / error tests plus a live monitor that compares each collection's advertised queryables against a committed snapshot (`tests/data/waterdata_queryables.json`). The monitor fails when the upstream API adds / removes / renames a queryable, which is the signal to regenerate the snapshot and enable any new queryables on the matching getter.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Sjb14HkwuCydKSKMsaXsgd
The OGC data getters (`get_daily`, `get_continuous`, `get_peaks`, ...) exposed ~11 of each collection's ~50 queryables as named params; the rest, mostly the shared monitoring-location attributes (`state_name`, `county_code`, `site_type`, `altitude`, ...) now filterable on the data endpoints, were reachable only via the raw `filter` CQL.

Accept any queryable as a passthrough kwarg: each OGC getter gains `**queryables`, and the shared `_get_args` flattens it so an extra filter such as `state_name="Wisconsin"` is normalized and sent exactly like a named param. The service itself validates names (an unknown one returns HTTP 400 → typed error), so no client-side queryable list is bundled.

The passthrough is provisional (see the PR description for the trade-off vs. explicit per-property keyword arguments).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Sjb14HkwuCydKSKMsaXsgd
46100eb to 2060c6c
Summary
Three related changes around the Water Data OGC API's queryables (the
properties each collection can be filtered on):
1. `waterdata.get_queryables(collection)`: returns a collection's queryable properties as a tidy `(DataFrame, BaseMetadata)`, one row per property with its `type`, `title`, and `description`. Lets callers discover the available filters programmatically.
2. A live monitoring test: `tests/waterdata_queryables_test.py` compares each collection's advertised queryables against a committed snapshot (`tests/data/waterdata_queryables.json`, 489 properties across 11 collections). It fails when the upstream API adds / removes / renames a queryable, which is the signal to regenerate the snapshot and enable anything new.
3. Passthrough enablement: the OGC data getters exposed ~11 of each collection's ~50 queryables as named params; the rest (mostly the shared monitoring-location attributes such as `state_name`, `county_code`, `site_type`, `altitude`, …, now filterable on the data endpoints) were reachable only via the raw `filter` CQL. Each OGC getter now accepts `**queryables`, so any queryable can be passed as a filter.
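To make the first two items concrete, here is a minimal sketch, not the code in this PR. It assumes the queryables document is JSON Schema-shaped with a top-level `properties` mapping (the usual form for OGC API queryables resources) and that the monitor's check reduces to set differences against the snapshot. The helper names `queryables_to_rows` and `diff_queryables` and the sample payload are hypothetical.

```python
from typing import Any


def queryables_to_rows(schema: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten a JSON Schema-style queryables document into one row per property."""
    rows = []
    for name, spec in schema.get("properties", {}).items():
        rows.append(
            {
                "name": name,
                "type": spec.get("type"),
                "title": spec.get("title"),
                "description": spec.get("description"),  # None when absent
            }
        )
    return rows


def diff_queryables(advertised: set[str], snapshot: set[str]) -> dict[str, list[str]]:
    """Report properties that appeared or disappeared relative to the snapshot."""
    return {
        "added": sorted(advertised - snapshot),
        "removed": sorted(snapshot - advertised),
    }


# Made-up payload for illustration; real responses carry many more properties.
example_schema = {
    "type": "object",
    "properties": {
        "state_name": {"type": "string", "title": "State name"},
        "county_code": {"type": "string", "title": "County code"},
    },
}

rows = queryables_to_rows(example_schema)
advertised = {row["name"] for row in rows}

# A renamed property shows up as one "removed" entry plus one "added" entry.
drift = diff_queryables(advertised, snapshot={"state_name", "site_type"})
print(drift)
```

A list of row dicts like `rows` is the kind of structure a DataFrame can be built from; a non-empty `added` or `removed` list is what would make a snapshot comparison fail.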
How the passthrough works
`get_daily`, `get_continuous`, `get_latest_continuous`, `get_latest_daily`, `get_field_measurements`, `get_field_measurements_metadata`, `get_peaks`, `get_channel`, `get_monitoring_locations`, `get_time_series_metadata`, and `get_combined_metadata` each gain `**queryables`. The shared `waterdata.utils._get_args` flattens that kwargs dict into the request args, so a passthrough filter is normalized (iterables → comma-joined, etc.) and sent exactly like a named param. `get_cql` (the raw-CQL escape hatch) is intentionally excluded.
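The flattening described above can be pictured roughly as follows. This is an illustrative sketch under stated assumptions, not the actual `_get_args`: `flatten_args` and `get_daily_sketch` are hypothetical stand-ins, and dropping `None` values is an assumption about how unset params are treated.

```python
from collections.abc import Iterable
from typing import Any


def flatten_args(named: dict[str, Any], queryables: dict[str, Any]) -> dict[str, str]:
    """Merge named params and passthrough queryables into one request-args dict."""
    merged = {**named, **queryables}
    args: dict[str, str] = {}
    for key, value in merged.items():
        if value is None:
            continue  # assumption: unset params are simply not sent
        if isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
            args[key] = ",".join(str(v) for v in value)  # iterables -> comma-joined
        else:
            args[key] = str(value)
    return args


def get_daily_sketch(parameter_code: Any = None, **queryables: Any) -> dict[str, str]:
    """Hypothetical getter: returns the args it would send instead of calling the API."""
    return flatten_args({"parameter_code": parameter_code}, queryables)


# The passthrough filter ends up in the same dict as the named param.
args = get_daily_sketch(parameter_code="00060", state_name=["Wisconsin", "Iowa"])
print(args)  # {'parameter_code': '00060', 'state_name': 'Wisconsin,Iowa'}
```

Because named and passthrough filters share one code path, a single change in the shared helper is enough to cover every getter.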
No client-side queryable list is bundled: the service validates names itself, and an unknown queryable returns HTTP 400, surfaced as the typed `DataRetrievalError`. (The committed snapshot is used only by the monitoring test, not for runtime validation, so a stale snapshot cannot change the package's behavior.)
Provisional — passthrough now, explicit named params later?
This PR uses a passthrough (`**queryables`). That decision is deliberate but not final:
Why passthrough now
~400-param explosion of mostly-shared location attributes).
it and it's already usable, with no per-getter code change to expose it.
`_get_args` change enables every getter uniformly.
Why we may switch to explicit named params
`help(get_daily)`; `**queryables` hides them.
type hint instead of one generic note.
`TypeError` at the call site; a misspelled passthrough queryable is only caught at runtime as an HTTP 400.
collection supports rather than "anything the service accepts."
The natural future step is to generate explicit params (with docstrings)
from the queryables snapshot, getting discoverability without hand-maintaining
~400 params. Until then, the passthrough unblocks the capability with minimal
surface area.
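One possible shape for that future step, sketched with the standard library only. It assumes the snapshot can be reduced to a list of property names that are valid Python identifiers; `build_signature` is a hypothetical helper and nothing here is part of this PR.

```python
import inspect


def build_signature(named: list[str], queryables: list[str]) -> inspect.Signature:
    """Build a keyword-only signature that lists every queryable explicitly."""
    names = list(dict.fromkeys([*named, *queryables]))  # de-duplicate, keep order
    params = [
        inspect.Parameter(name, inspect.Parameter.KEYWORD_ONLY, default=None)
        for name in names
    ]
    return inspect.Signature(params)


sig = build_signature(["parameter_code"], ["state_name", "county_code"])
print(sig)

# With explicit params, a misspelled name is rejected locally with a TypeError
# instead of travelling to the service and coming back as an HTTP 400.
try:
    sig.bind(stat_name="Wisconsin")  # misspelled on purpose
except TypeError as exc:
    print(f"rejected before any request: {exc}")
```

A generated signature like this could be attached to a getter (for example via `__signature__`) so the queryables show up in `help()`, though how docstrings and type hints would be generated is left open here.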
Verification
`tests/waterdata_queryables_test.py`: offline `get_queryables` parsing / error tests, offline passthrough tests (the filter reaches the `/items` request, lists comma-joined), and the 11-collection live monitor. All pass.
`ruff check` / `ruff format` / `mypy --strict` clean across the package.
`get_daily` is unchanged by the `**queryables` addition; a passthrough `state_name=` filter is accepted by the service (no 400).
Before merge (once upstream is ready)
round-trip in the data.
`NEWS.md` entry.
🤖 Generated with Claude Code