Bug fixes and enhancements#139
Merged
Merged
Conversation
…l-data paths. Persistence paths that relied on CurrentCulture silently corrupted numeric data on non-US locales: a USGS daily value of "1234.56" parsed by double.TryParse on de-DE returned 123456.0 (treating "." as a group separator) rather than failing. USGS, GHCN, BOM, and the embedded Sobol resource always emit US-format numbers, so any non-US-locale user importing them today silently received NaN or wrong values. - TimeSeriesDownload.cs: add NumberStyles + InvariantCulture / DateTimeStyles to USGS daily, USGS peak, GHCN, and BOM parsing - TimeSeries.cs: fallback DateTime.TryParse in the XElement constructor now uses InvariantCulture (matches the "o" round-trip primary path) - SobolSequence.cs, LinkController.cs: int.TryParse now uses InvariantCulture for embedded-resource and XML-attribute reads - MCMCDiagnostics.cs: replace the wasteful decimal.Parse(N.ToString()) / 100M round-trip with direct double math, eliminating both the string conversion and a latent CurrentCulture dependency Display code (Summary tables, ParameterToString, DisplayLabel, exception messages) intentionally continues to use CurrentCulture so users see numbers in their locale's format. XML/JSON serialization layers were already correct and unchanged. Added a new method in MCMCResults so that parameter results can be recomputed if the credible interval width changes.
…silently treating it as 0. A user reported that running peaks-over-threshold analysis in RMC-BestFit on GHCN precipitation data with missing days produced wrong results when smoothing was set to a multi-day moving sum: missing days were silently treated as zero precipitation, producing spurious sub-threshold exceedances and biased annual sums. The defect spanned MovingSum, MovingAverage, and the Sum/Average paths in CalendarYearSeries and CustomYearSeries, all of which masked NaN with 0 in the running aggregation. The defect was inconsistent within the library too: MonthlySeries and QuarterlySeries already propagated NaN naively, so the same input produced different answers depending on which block method was called. This change adopts the convention used by every major data-analysis package (pandas Series.rolling min_periods, bottleneck move_sum min_count, xarray rolling, MATLAB movsum/movmean nanflag, R zoo rollsum/rollmean fallback): the default is strict NaN propagation — any NaN in the window or block produces NaN out. MovingSum and MovingAverage gain an optional minValidCount parameter (default = period) that maps directly to pandas's min_periods, letting power users opt into skip-NaN aggregation; in that mode MovingAverage divides by the count of observed values rather than the window size, fixing the prior low-bias. Block methods (CalendarYearSeries, CustomYearSeries, MonthlySeries, QuarterlySeries) now consistently propagate NaN for Sum and Average, and guard against empty blocks so series with sparse months no longer throw on .Last(). Difference is unchanged — IEEE-754 subtraction already propagates NaN. Adds NaN-coverage tests for moving-window, block, and POT pipelines that were previously absent.
…x offset-bearing timestamp parsing, and stop silently masking BOM API drift. A user reported that BOM integration tests were "all passing" only because BomAvailable() was returning false on a single transient KiWIS 500/DatasourceError, which the test-side _serviceAvailability cache then memoized as "down" for the rest of the run, silently short-circuiting every BOM test. The same hazards lived in CHMN, USGS, and GHCN — none had retry on transient 5xx, all four ran every download through an IsConnectedToInternet() probe that hit USGS specifically (so a USGS slowdown looked like "no internet" to BOM/CHMN/GHCN callers), and the connectivity probe didn't even check IsSuccessStatusCode. Two correctness bugs were also lurking: BOM and CHMN real-time timestamps include explicit offsets like "+10:00" or "-08:00", and DateTime.TryParse with DateTimeStyles.None converts those to local time per Microsoft's documented behavior, shifting daily series by a calendar day on any non-AEST/non-PST machine; and the USGS peak-data parser checked segments.Count() >= 5 before indexing segments[6] for PeakStage, which would IndexOutOfRange on short historical rows. This change adds a single GetWithRetryAsync helper (three attempts, 0.5s/1s/2s backoff, retries 5xx/408/429/network errors and KiWIS DatasourceError bodies) used by all four providers, switches the global connectivity check to https://www.google.com/generate_204 with a success-status check, replaces DateTime.TryParse with DateTimeOffset.TryParse for BOM and CHMN real-time timestamps, deletes the BOM silent ts_id fallback that was quietly returning hourly data when the caller asked for daily, fixes the USGS peak bounds check, URL-encodes the spaces in the CHMN real-time URL, and gives the test-side AvailableAsync cache a 30-second TTL on negative results so a single transient flake no longer disables a provider's whole test suite.
…function calls and likelihod calls in the base classes. Improving tests to catch the sentinal returns. Improving NUTS sampler to handle double.NegativeInfinity in the gradient.
…ootstrap method, the pivotal bootstrap. Added tests for each.
…ap methods. Refactored pivotal bootstrap classes to align with existing architecture.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR adds new statistical functionality, improves numerical robustness, and tightens CI/test reliability across Numerics.
Changes
double.NegativeInfinityinstead ofdouble.MinValuesentinel values in likelihood-related paths, with added edge-case coverage.Validation
dotnet build Test_Numerics\Test_Numerics.csproj -c Release --no-restore0 Warning(s), 0 Error(s)