Skip to content

Adding copy_store convenience method for Group#3612

Open
melonora wants to merge 36 commits into
zarr-developers:mainfrom
melonora:copy_store
Open

Adding copy_store convenience method for Group#3612
melonora wants to merge 36 commits into
zarr-developers:mainfrom
melonora:copy_store

Conversation

@melonora

@melonora melonora commented Dec 3, 2025

Copy link
Copy Markdown
Contributor

[Description of PR]
This PR adds a convenience method for copying the store of a Group to a destination store.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.md
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@d-v-b

@github-actions github-actions Bot added the needs release notes Automatically applied to PRs which haven't added release notes label Dec 3, 2025
@codecov

codecov Bot commented Dec 3, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 94.73684% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.52%. Comparing base (be62f5a) to head (41e3e64).

Files with missing lines Patch % Lines
src/zarr/core/group.py 94.73% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3612   +/-   ##
=======================================
  Coverage   93.52%   93.52%           
=======================================
  Files          90       90           
  Lines       11926    11962   +36     
=======================================
+ Hits        11154    11188   +34     
- Misses        772      774    +2     
Files with missing lines Coverage Δ
src/zarr/testing/strategies.py 94.73% <ø> (ø)
src/zarr/core/group.py 95.17% <94.73%> (-0.03%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@melonora

melonora commented Dec 3, 2025

Copy link
Copy Markdown
Contributor Author

@d-v-b I know right now this won't work when zarr v2 is used. I can add that functionality as well or we say the convenience function is not supported for zarr v2.

@melonora

melonora commented Dec 3, 2025

Copy link
Copy Markdown
Contributor Author

added zarr v2 support. If in general the code is approved, I will add documentation.

Comment thread src/zarr/core/group.py Outdated
Comment thread src/zarr/core/group.py Outdated
Comment thread src/zarr/core/group.py Outdated
store: StoreLike,
*,
overwrite: bool = False,
consolidate_metadata: bool | None = None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this here? IMO it's simpler if copying is just copying. If people want to add consolidated metadata, they should do that after copying.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adjusted it. It now consolidates if the source store was consolidated, but this does not require an argument. One more thing, I only take into account when consolidation happened at the root level. It could be (although the user should not do this) that consolidation happened at the subgroup level. This could be accounted for by checking if member. metadata.consolidated_metadata when looping through members. WDYT should I add this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would copy the original group and its children exactly, including their consolidated metadata.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the purpose of copying over stale consolidated metadata?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it keeps the scope of this function simple -- it just copies the original group exactly. If this tool produces a different group, then it's not really copying.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regarding the from_store method, I would open a separate PR for this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consolidated metadata is already in the group metadata document for zarr v3 if "kind" is set to "inline".

👍 but doing something other (more or less) than just copying would require interpreting that which is too far IMO.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @joshmoore the method under discussion here is a method on the Group class, not on stores. maybe this is a point of confusion.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the only consolidated metadata-related parameter we need is whether iterating over the child groups uses their consolidated metadata (if present). See the relevant kwarg in the signature of members.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added both, only thing remaining is probably adding consolidated_metadata in the async group call.

Comment thread src/zarr/core/group.py Outdated
@github-actions github-actions Bot removed the needs release notes Automatically applied to PRs which haven't added release notes label Dec 5, 2025
@melonora

melonora commented Dec 5, 2025

Copy link
Copy Markdown
Contributor Author

@d-v-b @joshmoore I think I have now addressed all comments and I also added documentation, so I think this would be ready for a final round of review. @d-v-b thanks for your patience.

@melonora

Copy link
Copy Markdown
Contributor Author

@d-v-b @joshmoore , just bumping in case you have some time to see whether you approve the current state:)

Comment thread src/zarr/core/group.py Outdated
Comment thread src/zarr/core/group.py
@melonora

Copy link
Copy Markdown
Contributor Author

@d-v-b When the last test passes and if you are happy with the code changes I think this is good to go. Thanks for reviewing!

@d-v-b

d-v-b commented Feb 11, 2026

Copy link
Copy Markdown
Contributor

@melonora I just realized that this implementation is decoding and re-encoding array chunks. Was there a particular reason for that choice? Because it would be far faster to copy the bytes directly

@melonora

Copy link
Copy Markdown
Contributor Author

Hmm, mostly a lack of familiarity with getting the raw bytes, but looking at the documentation, I should be able to do something like store.get to get them, correct?
I will change it

@d-v-b

d-v-b commented Feb 11, 2026

Copy link
Copy Markdown
Contributor

you don't have to! I already did in a local branch. If it's OK with you, I would push those changes here.

@melonora

Copy link
Copy Markdown
Contributor Author

Ah ok, thanks! Yes I am ok with it.

@melonora

melonora commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

@d-v-b is this still of interest? I could resolve the conflicts or otherwise close it.

@d-v-b

d-v-b commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

it is, sorry for the delay in dealing with things here. lmk if you run into any trouble with the merge conflicts

@melonora

Copy link
Copy Markdown
Contributor Author

will do, thanks!

@melonora

Copy link
Copy Markdown
Contributor Author

@d-v-b if you are ok with the current state, it can be merged

@d-v-b

d-v-b commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

@melonora I'm going to give this a careful look today. I might identify some things in this PR that would be better off implemented in separate parts of the zarr codebase, do you mind if I push to your branch? some changes would come in that way, others might be spun off into separate PRs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants