✨ catalogd graphql shift to file-based cache#2732
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Pull request overview
This PR shifts catalogd’s GraphQL implementation from building/caching schemas off in-memory catalog metas to a file-based approach: schema metadata is discovered during Store() and persisted to disk, while query execution loads requested objects from catalog.jsonl on-demand using byte offsets from index.json. It also introduces a query complexity validation helper and updates handlers/tests to use the new GraphQL service interface.
Changes:
- Persist GraphQL schema metadata to
graphql-schema.jsonduring catalog storage and load it from disk for schema building. - Switch query execution to disk-backed object loading using index-based byte offsets (reducing RSS growth from caching parsed objects).
- Add a GraphQL query complexity validation helper and expand concurrency/singleflight tests around cache misses/builds.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/catalogd/storage/localdir.go | Writes graphql-schema.json on store, adds disk-backed object loader, and adjusts GraphQL pre-warm behavior. |
| internal/catalogd/storage/index.go | Exposes schema-section byte ranges for disk-backed GraphQL pagination. |
| internal/catalogd/service/graphql_service.go | Refactors service to use a storage-provided data provider and adds query validation/timeout. |
| internal/catalogd/service/graphql_service_test.go | Updates tests to use the new provider-based GraphQL service and adds singleflight coverage via provider counting. |
| internal/catalogd/server/handlers.go | Updates GraphQL handler to execute queries without needing a catalog FS. |
| internal/catalogd/server/handlers_test.go | Updates mocks/tests for the new GraphQL service interface and error behavior. |
| internal/catalogd/graphql/validation.go | Adds AST-based query complexity validation (depth/aliases/fields). |
| internal/catalogd/graphql/graphql.go | Adds schema serialization/deserialization and switches to loader-based query-time object retrieval. |
| hack/demo/graphql-demo-server/main.go | Updates demo server to build schema once and serve GraphQL directly from the generated schema. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2732 +/- ##
==========================================
+ Coverage 66.95% 67.96% +1.01%
==========================================
Files 149 150 +1
Lines 11341 11602 +261
==========================================
+ Hits 7593 7885 +292
+ Misses 3191 3144 -47
- Partials 557 573 +16
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
ec01fe5 to
374e69e
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
f704ad1 to
1abb2b6
Compare
Signed-off-by: grokspawn <jordan@nimblewidget.com>
1abb2b6 to
04a2bc0
Compare
| for _, sec := range sections { | ||
| buf := make([]byte, sec.Length) | ||
| if _, err := f.ReadAt(buf, sec.Offset); err != nil { | ||
| return nil, fmt.Errorf("error reading section at offset %d: %w", sec.Offset, err) | ||
| } | ||
|
|
||
| var obj map[string]interface{} | ||
| if err := json.Unmarshal(buf, &obj); err != nil { | ||
| continue | ||
| } | ||
| results = append(results, obj) | ||
| } |
| return func(schemaName string, offset, limit int) ([]map[string]interface{}, error) { | ||
| sections := idx.GetSchemaSections(schemaName) | ||
| if sections == nil { | ||
| return nil, nil | ||
| } | ||
|
|
||
| if offset >= len(sections) { | ||
| return nil, nil | ||
| } | ||
| sections = sections[offset:] | ||
|
|
||
| if limit < len(sections) { | ||
| sections = sections[:limit] | ||
| } | ||
|
|
||
| f, err := os.Open(catalogPath) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("error opening catalog file: %w", err) | ||
| } | ||
| defer f.Close() |
| case *ast.InlineFragment: | ||
| if err := c.walkSelectionSet(s.SelectionSet, depth+1); err != nil { | ||
| return err | ||
| } | ||
| case *ast.FragmentSpread: | ||
| name := s.Name.Value | ||
| if c.visited[name] { | ||
| continue | ||
| } | ||
| c.visited[name] = true | ||
| if frag, ok := c.fragments[name]; ok { | ||
| if err := c.walkSelectionSet(frag.SelectionSet, depth+1); err != nil { | ||
| return err | ||
| } | ||
| } |
Description
This PR proposes a shift in the graphql service endpoint from in-memory to on-disk caching which leverages common architectural conventions like fan-out catalog index generation (graphql-schema.json) which is then accessed to fulfill queries to eliminate RSS increases from the feature initial implementation.
Included is a structured graphql validation package as a focus for future limits enforcement and adds several cold-cache/cache-miss concurrency tests.
Reviewer Checklist