Context
Depends on: #742 (annotation pipeline migrated to the Allele model), #747 (existing MappedVariant data backfilled to MappingRecord + Allele)
Epic #742 moved the annotation jobs onto the new Allele-based parallel-tables model behind an explicit "frozen old tables" invariant: new score sets write only to the new tables, while the old tables are read by serving for existing data and never written for new data.
That invariant was always meant to be temporary. Once the new tables exist (all 5 steps of #742) and existing data has been migrated into them (#747), the old tables are dead weight: they confuse the data model, double the surface a reader has to reason about, and keep the serving layer reading from a representation we no longer write. This issue is the deliberate teardown.
Note: #747's Phase 1 describes "repoint the gnomad_variants/clinical_controls M2M associations to allele_id." The design diverged after #747 was written — the new model uses dedicated ValidTime link tables (gnomad_allele_links, clinvar_allele_links, vep_allele_consequences), not repointed M2M associations. The backfill must populate those link tables; this cleanup then drops the old associations. Worth reconciling #747's wording when it's picked up.
Goal
After backfill, retire the frozen old annotation tables/columns and cut the read path over to the new tables, so the new Allele-based representation is the single source of truth.
Scope
1. Backfill the new annotation link tables from old data
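One-time, re-runnable migration populating the new tables from existing MappedVariant-era data (depends on #747 having created the Allele/MappingRecord rows to link to):
gnomad_allele_links ← gnomad_variants_mapped_variants
clinvar_allele_links ← mapped_variants_clinical_controls
vep_allele_consequences ← MappedVariant.vep_functional_consequence / vep_access_date
Verify Allele.hgvs_g/c/p and Allele.clingen_allele_id are populated for migrated alleles (these replace the retired HGVS and variant-translation jobs).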
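One way to make the backfill safe to re-run is to insert only the links that do not exist yet. The sketch below shows that pattern with the standard-library sqlite3 module, not the project's real migration tooling. The column names, the `mapped_variant_alleles` lookup table, and the helper functions are all hypothetical, and the ValidTime columns of the real link tables are left out.

```python
import sqlite3

# Hypothetical, simplified schema: real MaveDB column names are not given in this issue.
SCHEMA = """
CREATE TABLE gnomad_variants_mapped_variants (
    gnomad_variant_id INTEGER NOT NULL,
    mapped_variant_id INTEGER NOT NULL
);
-- Assumed lookup from an old MappedVariant to its migrated Allele (output of #747).
CREATE TABLE mapped_variant_alleles (
    mapped_variant_id INTEGER PRIMARY KEY,
    allele_id INTEGER NOT NULL
);
CREATE TABLE gnomad_allele_links (
    allele_id INTEGER NOT NULL,
    gnomad_variant_id INTEGER NOT NULL,
    UNIQUE (allele_id, gnomad_variant_id)
);
"""

# Insert only (allele, gnomAD variant) pairs that are not linked yet,
# so running the statement a second time adds nothing.
BACKFILL = """
INSERT INTO gnomad_allele_links (allele_id, gnomad_variant_id)
SELECT DISTINCT mva.allele_id, old.gnomad_variant_id
FROM gnomad_variants_mapped_variants AS old
JOIN mapped_variant_alleles AS mva
  ON mva.mapped_variant_id = old.mapped_variant_id
WHERE NOT EXISTS (
    SELECT 1 FROM gnomad_allele_links AS new
    WHERE new.allele_id = mva.allele_id
      AND new.gnomad_variant_id = old.gnomad_variant_id
)
"""


def count_links(conn):
    return conn.execute("SELECT COUNT(*) FROM gnomad_allele_links").fetchone()[0]


def backfill_gnomad_links(conn):
    """Run the backfill in one transaction and return how many links it added."""
    before = count_links(conn)
    with conn:  # commits on success, rolls back on error
        conn.execute(BACKFILL)
    return count_links(conn) - before


def orphaned_old_rows(conn):
    """Count old association rows whose MappedVariant has no migrated Allele."""
    return conn.execute(
        """
        SELECT COUNT(*)
        FROM gnomad_variants_mapped_variants AS old
        LEFT JOIN mapped_variant_alleles AS mva
          ON mva.mapped_variant_id = old.mapped_variant_id
        WHERE mva.allele_id IS NULL
        """
    ).fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO gnomad_variants_mapped_variants VALUES (?, ?)",
        [(1, 10), (2, 10), (1, 11)],
    )
    # Mapped variants 10 and 11 resolve to the same allele, so links are deduplicated.
    conn.executemany(
        "INSERT INTO mapped_variant_alleles VALUES (?, ?)",
        [(10, 100), (11, 100)],
    )
    first = backfill_gnomad_links(conn)
    second = backfill_gnomad_links(conn)
    return first, second, orphaned_old_rows(conn)
```

A non-zero result from `orphaned_old_rows` would flag old associations that cannot be carried over yet, which is one possible way to check the "no data loss" requirement before anything is dropped.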
2. Read-cutover
Update v_variant_annotations (api/src/mavedb/models/variant_annotation_view.py) to project from Allele + the new link tables instead of MappedVariant.hgvs_g/c/p and the old annotation columns/associations.
Audit any other serving queries / endpoints that read the old annotation tables and repoint them.
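Because serving reads go through a view, the cutover can in principle be a change of view definition with a parity check around it. The following is a minimal sketch of that idea using sqlite3. Every table layout and column name here is an assumption made for illustration, and the real v_variant_annotations projects far more than two columns.

```python
import sqlite3

# Hypothetical, heavily simplified tables and columns.
SCHEMA = """
CREATE TABLE mapped_variants (
    variant_id INTEGER PRIMARY KEY,
    hgvs_g TEXT,
    vep_functional_consequence TEXT
);
CREATE TABLE alleles (id INTEGER PRIMARY KEY, hgvs_g TEXT);
CREATE TABLE mapping_records (
    variant_id INTEGER PRIMARY KEY,
    allele_id INTEGER NOT NULL
);
CREATE TABLE vep_allele_consequences (
    allele_id INTEGER NOT NULL,
    consequence TEXT
);
"""

# Old definition: reads the frozen MappedVariant columns.
OLD_VIEW = """
CREATE VIEW v_variant_annotations AS
SELECT variant_id, hgvs_g, vep_functional_consequence AS consequence
FROM mapped_variants
"""

# New definition: same output columns, projected from Allele + link tables.
NEW_VIEW = """
CREATE VIEW v_variant_annotations AS
SELECT mr.variant_id, a.hgvs_g, vep.consequence
FROM mapping_records AS mr
JOIN alleles AS a ON a.id = mr.allele_id
LEFT JOIN vep_allele_consequences AS vep ON vep.allele_id = a.id
"""


def read_annotations(conn):
    """Serving-side query: it names only the view, never the base tables."""
    return conn.execute(
        "SELECT variant_id, hgvs_g, consequence "
        "FROM v_variant_annotations ORDER BY variant_id"
    ).fetchall()


def cut_over_and_check(conn):
    """Swap the view definition, then raise if the served rows changed."""
    before = read_annotations(conn)
    conn.execute("DROP VIEW v_variant_annotations")
    conn.execute(NEW_VIEW)
    after = read_annotations(conn)
    if before != after:
        raise RuntimeError("view cutover changed the served rows")
    return after


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(OLD_VIEW)
    conn.executemany(
        "INSERT INTO mapped_variants VALUES (?, ?, ?)",
        [
            (1, "NC_000001.11:g.100A>T", "missense_variant"),
            (2, "NC_000001.11:g.200C>G", None),
        ],
    )
    # Backfilled equivalents of the two rows above.
    conn.executemany(
        "INSERT INTO alleles VALUES (?, ?)",
        [(100, "NC_000001.11:g.100A>T"), (200, "NC_000001.11:g.200C>G")],
    )
    conn.executemany("INSERT INTO mapping_records VALUES (?, ?)", [(1, 100), (2, 200)])
    conn.execute("INSERT INTO vep_allele_consequences VALUES (100, 'missense_variant')")
    conn.commit()
    return cut_over_and_check(conn)
```

The swap here is not wrapped in a transaction, so a failed check leaves the new definition in place. A real migration would likely make the swap and the check atomic.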
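3. Drop the old tables/columns (after backfill + cutover verified in prod)
gnomad_variants_mapped_variants + GnomADVariant.mapped_variants relationship
mapped_variants_clinical_controls + MappedVariant.clinical_controls relationship
MappedVariant.vep_functional_consequence, MappedVariant.vep_access_date
variant_translations + api/src/mavedb/lib/variant_translations.py (confirm no remaining readers; the RT allele-equivalence space replaces it; see #742 Step 5)
clinical_controls table (renamed to clinvar_controls for new code in #742 Step 3; the old name stays only for serving until cutover)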
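A drop migration is only reversible if its downgrade recreates what the upgrade removed. The sketch below illustrates an upgrade/downgrade pair and a round-trip check for one retired association table. It uses plain sqlite3 in place of the project's migration framework so that it runs on its own, and the column layout is hypothetical.

```python
import sqlite3

# Hypothetical column layout for one of the retired association tables.
OLD_TABLE_DDL = """
CREATE TABLE gnomad_variants_mapped_variants (
    gnomad_variant_id INTEGER NOT NULL,
    mapped_variant_id INTEGER NOT NULL,
    PRIMARY KEY (gnomad_variant_id, mapped_variant_id)
)
"""


def table_names(conn):
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def upgrade(conn):
    """Drop the frozen association table."""
    conn.execute("DROP TABLE gnomad_variants_mapped_variants")


def downgrade(conn):
    """Recreate the table. This restores the schema only, not the dropped rows."""
    conn.execute(OLD_TABLE_DDL)


def round_trip(conn):
    """Apply upgrade then downgrade and report the table list at each stage."""
    start = table_names(conn)
    upgrade(conn)
    dropped = table_names(conn)
    downgrade(conn)
    return start, dropped, table_names(conn)
```

Since the downgrade cannot bring the data back, it matters that the drop happens only after the backfill and cutover have been verified, as the step heading says.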
Acceptance Criteria
Backfill migration populates gnomad_allele_links, clinvar_allele_links, and vep_allele_consequences from old data with no data loss; idempotent on re-run.
v_variant_annotations and all serving reads resolve annotation data through the new tables; no serving path reads a dropped table.
The frozen old tables/columns listed above are dropped in a migration with a tested downgrade().
Pipeline tracking / validation (mavedb.scripts.pipeline_tracking) still reports correctly against the new representation.
No reference to a dropped table/column/relationship remains in src/ (grep-clean).
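The grep-clean criterion could be automated with a small script that scans a source tree for the retired names. This is a sketch under stated assumptions: the name list is taken from the tables and columns this issue proposes to drop, while the function name and the choice to scan only `*.py` files are illustrative. Short names such as `clinical_controls` are left out because the new code may legitimately contain similar identifiers.

```python
import re
from pathlib import Path

# Retired identifiers named in this issue; extend as the drop list is finalized.
RETIRED_NAMES = (
    "gnomad_variants_mapped_variants",
    "mapped_variants_clinical_controls",
    "vep_functional_consequence",
    "vep_access_date",
    "variant_translations",
)

# Word boundaries keep a retired name from matching inside a longer identifier.
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RETIRED_NAMES)) + r")\b")


def find_stale_references(root):
    """Return (relative path, line number, name) for every retired name under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in PATTERN.finditer(line):
                rel = path.relative_to(root).as_posix()
                hits.append((rel, lineno, match.group(1)))
    return hits
```

An empty list from `find_stale_references("src")` would correspond to the grep-clean state. Migration files that intentionally name the dropped tables would need to live outside the scanned root or be excluded.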
Explicitly out of scope
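The AnnotationEvent unified event log (api/docs/design/allele-annotation-status.md): separate later epic; this issue keeps VariantAnnotationStatus as-is.
mapped_variants itself: tied to the broader mapping-data retirement under #747, not the annotation cleanup.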