CLVM enhancements and fixes #12617
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #12617 +/- ##
=============================================
- Coverage 17.90% 3.68% -14.23%
=============================================
Files 5938 454 -5484
Lines 532864 38798 -494066
Branches 65192 7151 -58041
=============================================
- Hits 95392 1428 -93964
+ Misses 426793 37183 -389610
+ Partials 10679 187 -10492
5fc1f12 to 9e03f4b
@blueorangutan package
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16801
    UserVmVO vm = userVmDao.findById(vmId);
    String cantHandleLog = String.format("Default VM snapshot cannot handle VM snapshot for [%s]", vm);

    if (isRunningVMVolumeOnCLVMStorage(vm, cantHandleLog)) {
@Pearl1594
What's the image format on CLVM? RAW or QCOW2?
a08e7a5 to df61d6f
df61d6f to 43e9384
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #12617 +/- ##
============================================
+ Coverage 18.75% 18.89% +0.13%
- Complexity 17966 18220 +254
============================================
Files 6160 6170 +10
Lines 552578 554920 +2342
Branches 67348 67736 +388
============================================
+ Hits 103660 104844 +1184
- Misses 437512 438555 +1043
- Partials 11406 11521 +115
3f900e8 to c9dd7ed
@blueorangutan package
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16875
@blueorangutan package
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16877
@blueorangutan test
@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests
LGTM
QA Report: Modernised CLVM Support
Date: June 1, 2026
Tester: Rositsa Kyuchukova
Summary
| Result | Count |
|---|---|
| Pass | 95 |
| Blocked | 8 |
| Total | 103 |
Note on Blocked tests: All blocked test cases relate to incremental snapshot functionality (CLVM_NG dirty bitmaps). This feature has been moved out of this PR and will be handled in a dedicated incremental snapshots PR.
Test Results
Storage Pool Management
| Test Case | Priority | Result |
|---|---|---|
| CLVM primary storage pool reaches Up state and shows correct type, IP and path | High | Pass |
| CLVM_NG primary storage pool reaches Up state | High | Pass |
| CLVM pool remains Up after agent restart | High | Pass |
| CLVM_NG pool remains Up after agent restart | Medium | Pass |
| CLVM_NG visible as selectable option in protocol list | High | Pass |
| VG Name field appears when CLVM or CLVM_NG is selected | Medium | Pass |
| VG Name field not shown for non-CLVM pool types | Medium | Pass |
| Pool creation rejected when VG does not exist on hosts | Medium | Pass |
| Pool creation rejected when VG name is empty | Medium | Pass |
| Duplicate CLVM/CLVM_NG pool creation with same VG name is rejected | High | Pass |
| CLVM pool capacity updates after volume creation | Medium | Pass |
| CLVM pool reported capacity matches underlying VG | High | Pass |
| CS does not overprovision CLVM/CLVM_NG volumes beyond VG capacity | Medium | Pass |
Volume Lifecycle
| Test Case | Priority | Result |
|---|---|---|
| Create CLVM data volume - LV provisioned with correct name and RAW format | High | Pass |
| CLVM volume size on LV matches the requested size | High | Pass |
| Create CLVM_NG data volume - LV provisioned and formatted as QCOW2 | High | Pass |
| CLVM_NG volume disk format reported as QCOW2 in CloudStack | High | Pass |
| CLVM_NG LV is larger than requested size to accommodate QCOW2 overhead | Medium | Pass |
| CLVM_NG LV size is rounded up to actual VG PE boundary | Medium | Pass |
| Data volume created while VM is running is provisioned on the VM's host | High | Pass |
| lvcreate completes without hanging when LV space has an existing DOS signature | High | Pass |
| Attach CLVM volume to running VM - lock transferred to VM host | High | Pass |
| Attach CLVM volume to stopped VM: lock acquired on correct host at VM start | High | Pass |
| Detach CLVM volume from running VM - lock released cleanly | High | Pass |
| Delete CLVM volume with zero-fill disabled - fast deletion with no zeroing | High | Pass |
| Delete CLVM volume with zero-fill enabled - blkdiscard issued before lvremove | High | Pass |
| Zero-fill falls back to dd when blkdiscard is not supported | Medium | Pass |
| CLVM_NG delete volume - zero-fill OFF | High | Pass |
| CLVM_NG delete volume - zero-fill ON | High | Pass |
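
The deletion cases above (zero-fill off, zero-fill on with blkdiscard, and the dd fallback) can be summarised as a small command planner. This is only a sketch under stated assumptions, not the code in this PR: the class name `ClvmDeletePlan`, the `blkdiscardSupported` flag, and the exact command-line flags (`blkdiscard -z`, the `dd` options, `lvremove -f`) are illustrative choices, and nothing is executed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: plans (does not run) the commands for deleting one LV.
class ClvmDeletePlan {

    /**
     * Returns the ordered command lines for removing vg/lv.
     * blkdiscardSupported is an assumed input; a real agent would more likely
     * detect support by running blkdiscard and checking whether it fails.
     */
    static List<List<String>> plan(String vg, String lv, boolean zeroFill, boolean blkdiscardSupported) {
        String device = "/dev/" + vg + "/" + lv;
        List<List<String>> commands = new ArrayList<>();
        if (zeroFill) {
            if (blkdiscardSupported) {
                // Assumed flag: -z asks blkdiscard to zero-fill rather than discard.
                commands.add(List.of("blkdiscard", "-z", device));
            } else {
                // Fallback: overwrite the whole device with zeros using dd.
                commands.add(List.of("dd", "if=/dev/zero", "of=" + device, "bs=1M", "oflag=direct"));
            }
        }
        // The LV is always removed last, after any zeroing step.
        commands.add(List.of("lvremove", "-f", vg + "/" + lv));
        return commands;
    }
}
```

With zero-fill disabled the plan contains only the `lvremove` step, which matches the "fast deletion with no zeroing" case.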
Volume Resize
| Test Case | Priority | Result |
|---|---|---|
| CLVM volume resize while VM stopped | High | Pass |
| CLVM volume resize while VM running | High | Pass |
| Root volume resize on CLVM_NG | High | Pass |
| CLVM_NG volume resize while VM stopped | High | Pass |
| CLVM_NG volume resize while VM running | High | Pass |
| Shrink rejected on CLVM_NG volume | High | Pass |
| CLVM_NG resize - LV rounded to PE boundary | High | Pass |
| CLVM_NG resize - LV larger than requested size (overhead verified) | High | Pass |
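
The two CLVM_NG sizing checks above (LV larger than the requested size, and LV rounded up to a PE boundary) amount to simple arithmetic. The sketch below is illustrative only: `ClvmNgSizing` and its method names are hypothetical, the QCOW2 overhead is passed in as a parameter because the formula used in the PR is not shown here, and 4 MiB is the LVM default extent size, which a given VG may override.

```java
// Hypothetical helper showing the rounding arithmetic only.
class ClvmNgSizing {

    // LVM's default physical extent size; a VG can be created with a different value.
    static final long DEFAULT_PE_BYTES = 4L * 1024 * 1024;

    /** Rounds bytes up to the next multiple of peBytes. */
    static long roundUpToExtent(long bytes, long peBytes) {
        if (bytes < 0 || peBytes <= 0) {
            throw new IllegalArgumentException("bytes must be >= 0 and peBytes must be > 0");
        }
        long extents = (bytes + peBytes - 1) / peBytes; // ceiling division
        return Math.multiplyExact(extents, peBytes);
    }

    /**
     * LV size for a QCOW2-on-LV volume: virtual size plus an assumed metadata
     * overhead, rounded up to a whole number of physical extents.
     */
    static long lvSizeFor(long virtualSizeBytes, long qcow2OverheadBytes, long peBytes) {
        return roundUpToExtent(Math.addExact(virtualSizeBytes, qcow2OverheadBytes), peBytes);
    }
}
```

For example, a 10 GiB volume is exactly 2560 extents of 4 MiB, so adding even 1 MiB of overhead pushes the LV to 2561 extents.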
Zero-Fill / Secure Erase
| Test Case | Priority | Result |
|---|---|---|
| Zero-fill config change has no effect until agent reconnect | Medium | Pass |
| Zero-fill enabled on one pool and disabled on another - pools behave independently | High | Pass |
VM Operations
| Test Case | Priority | Result |
|---|---|---|
| VM started on different host triggers lock transfer to new host | High | Pass |
| clvmLockHostId updated in volume_details after lock transfer on VM start | High | Pass |
| Expunge VM with zero-fill enabled - root volume zeroed and LV removed; data volumes detached only | High | Pass |
| Expunge VM with zero-fill disabled - root volume LV removed quickly with no zeroing; data volume detached only | High | Pass |
| Expunge VM with CLVM_NG volumes and zero-fill enabled - LVs zeroed and removed | High | Pass |
| Expunge VM with CLVM_NG volumes and zero-fill disabled - LVs removed and metadata cleaned | High | Pass |
Live Migration
| Test Case | Priority | Result |
|---|---|---|
| Live migrate VM from CLVM to NFS - VM accessible after migration | High | Pass |
| Live migrate VM from NFS to CLVM - VM accessible after migration | High | Pass |
| Live migrate VM from CLVM to CLVM_NG - volume converted to QCOW2 | High | Pass |
| Live migrate VM from CLVM_NG to CLVM - volume converted to RAW | High | Pass |
| Live migrate VM from CLVM_NG to NFS - QCOW2 format preserved on destination | High | Pass |
| Live migrate VM from NFS to CLVM_NG - QCOW2-backed LV created correctly | High | Pass |
| Pre-migration command transitions CLVM volume lock from exclusive to shared on source | High | Pass |
| Post-migration command transitions CLVM volume lock to exclusive on destination | High | Pass |
| clvmLockHostId updated to destination host after successful migration | High | Pass |
| Migration failure - lock reverted to exclusive on source host | High | Pass |
| VM remains accessible on source host after migration failure | High | Pass |
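
The lock-related migration cases above describe a small state machine: exclusive on the source, shared while the migration runs, then exclusive on the destination, or back to exclusive on the source if the migration fails. A minimal in-memory model is sketched below. The class `ClvmMigrationLocks` and its methods are hypothetical, and activating the destination in shared mode during pre-migration is an assumption; the test list only states what happens on the source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the per-volume lock mode on each host.
class ClvmMigrationLocks {

    enum Mode { NONE, SHARED, EXCLUSIVE }

    private final Map<Long, Mode> modeByHost = new HashMap<>();
    private long lockHostId; // mirrors the clvmLockHostId detail

    ClvmMigrationLocks(long sourceHostId) {
        this.lockHostId = sourceHostId;
        modeByHost.put(sourceHostId, Mode.EXCLUSIVE);
    }

    /** Pre-migration: the source drops from exclusive to shared. */
    void preMigration(long sourceHostId, long destHostId) {
        modeByHost.put(sourceHostId, Mode.SHARED);
        modeByHost.put(destHostId, Mode.SHARED); // assumption, see lead-in
    }

    /** Post-migration: the destination becomes the exclusive holder. */
    void postMigration(long sourceHostId, long destHostId) {
        modeByHost.put(sourceHostId, Mode.NONE);
        modeByHost.put(destHostId, Mode.EXCLUSIVE);
        lockHostId = destHostId;
    }

    /** Failure: the source takes the exclusive lock back; lockHostId is unchanged. */
    void migrationFailed(long sourceHostId, long destHostId) {
        modeByHost.put(destHostId, Mode.NONE);
        modeByHost.put(sourceHostId, Mode.EXCLUSIVE);
    }

    Mode modeOn(long hostId) {
        return modeByHost.getOrDefault(hostId, Mode.NONE);
    }

    long lockHostId() {
        return lockHostId;
    }
}
```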
Lock Management and Fan-out
| Test Case | Priority | Result |
|---|---|---|
| Volume operation succeeds when clvmLockHostId points to wrong host - fan-out fallback corrects DB | High | Pass |
| Volume operation succeeds when clvmLockHostId host is down - fan-out finds actual lock holder | High | Pass |
| CLVM lightweight migration volume attach | High | Pass |
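
The fan-out fallback tested above can be read as: try the host recorded in clvmLockHostId first, and if it is down or does not actually hold the lock, ask the remaining Up hosts until one reports that it does. The sketch below illustrates that lookup only. `ClvmLockLocator`, `findLockHolder`, and the `holdsLock` probe are hypothetical names; in practice the probe would be an agent command, and the caller would write the returned host ID back to volume_details.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.LongPredicate;

// Hypothetical lookup: recorded host first, then fan out to the other Up hosts.
class ClvmLockLocator {

    /**
     * @param recordedHostId value of clvmLockHostId, or null if none is tracked
     * @param upHostIds      hosts currently Up and able to answer the probe
     * @param holdsLock      assumed probe that reports whether a host holds the lock
     * @return the host that holds the lock, or empty if no Up host does
     */
    static Optional<Long> findLockHolder(Long recordedHostId, List<Long> upHostIds, LongPredicate holdsLock) {
        // Fast path: the recorded host is Up and confirms it holds the lock.
        if (recordedHostId != null && upHostIds.contains(recordedHostId) && holdsLock.test(recordedHostId)) {
            return Optional.of(recordedHostId);
        }
        // Fan-out: probe every other Up host.
        for (Long hostId : upHostIds) {
            if (!hostId.equals(recordedHostId) && holdsLock.test(hostId)) {
                return Optional.of(hostId);
            }
        }
        return Optional.empty();
    }
}
```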
Snapshots
| Test Case | Priority | Result |
|---|---|---|
| CLVM snapshot uploaded to secondary storage after creation | High | Pass |
| Snapshot LV removed from primary storage after successful backup | High | Pass |
| Snapshot of attached CLVM volume on running VM dispatched to lock host | High | Pass |
| Snapshot of detached CLVM volume dispatched to correct host | High | Pass |
| Snapshot of unattached volume created from snapshot completes without host not found error | High | Pass |
| No orphaned LVs remain on VG after snapshot deletion | High | Blocked |
| CLVM_NG snapshot is full and initialises dirty bitmap | High | Pass |
| Second CLVM_NG snapshot is incremental and smaller than full | High | Pass |
| Incremental CLVM_NG snapshot can be extracted - converted to full snapshot for download | High | Pass |
| Disabling kvm.incremental.snapshot causes next CLVM_NG snapshot to be full | Medium | Pass |
| Disabling kvm.incremental.snapshot causes CLVM_NG to fall back to full snapshots | Medium | Blocked |
| Re-enabling kvm.incremental.snapshot - first snapshot is full then incremental resumes | Medium | Blocked |
| CLVM_NG: Dirty bitmap removed from LV metadata after live migration of VM to another host | High | Blocked |
| CLVM_NG: First snapshot after migration is full; subsequent snapshots resume as incremental | High | Blocked |
| Incremental snapshot of stopped CLVM_NG VM | High | Blocked |
| VM snapshot on CLVM volume rejected | High | Blocked |
| VM snapshot on CLVM_NG volume rejected with expected error message | High | Blocked |
| CLVM_NG snapshot artifact cleaned after backup | High | Pass |
| No orphaned artifacts after CLVM_NG snapshot deletion | High | Pass |
Snapshot - Revert and Create From
| Test Case | Priority | Result |
|---|---|---|
| Revert CLVM volume to snapshot - data reflects snapshot point in time | High | Pass |
| VM resumes cleanly after CLVM volume revert | High | Pass |
| Volume created from CLVM snapshot contains correct data | High | Pass |
| Volume created from CLVM snapshot has clvmLockHostId initialised | High | Pass |
| Volume created from CLVM snapshot has clvmLockHostId set in volume_details | High | Pass |
| Template created from CLVM snapshot is deployable | High | Pass |
| Revert CLVM_NG volume to snapshot | High | Pass |
| VM resumes cleanly after CLVM_NG root volume revert | High | Pass |
| Volume from CLVM_NG snapshot - correct data | High | Pass |
| Volume from CLVM_NG snapshot - clvmLockHostId set | High | Pass |
| Template from CLVM_NG snapshot is deployable | High | Pass |
CLVM_NG Templates and Backing Files
| Test Case | Priority | Result |
|---|---|---|
| VM deployed from CLVM_NG template uses template LV as QCOW2 backing file | High | Pass |
| Template LV on CLVM_NG pool accessible on multiple hosts simultaneously - exclusive activation blocked by sanlock | High | Pass |
| RAW template on CLVM_NG creates QCOW2 disk (regression) | High | Pass |
API
| Test Case | Priority | Result |
|---|---|---|
| createStoragePool API with CLVM url scheme creates pool in Up state | High | Pass |
| createStoragePool API with CLVM_NG url scheme creates pool in Up state | High | Pass |
| listStoragePools API returns correct type field for CLVM and CLVM_NG pools | Medium | Pass |
| createVolume API on CLVM pool sets clvmLockHostId in volume_details | Medium | Pass |
| attachVolume and detachVolume API on CLVM volume triggers lock transfer | High | Pass |
| createSnapshot API on CLVM volume dispatches to lock host | High | Pass |
| Management server log confirms lightweight migration path taken - no data copy | High | Pass |
PR Quality Checks
| Test Case | Priority | Result |
|---|---|---|
| Code coverage criteria | Medium | Pass |
| CloudStack Style | Medium | Pass |
| Documentation | Medium | Pass |
| PR Mergeable | Medium | Pass |

[SF] Trillian test result (tid-16254)
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.
@blueorangutan package
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18195
@blueorangutan test
@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests
    // For CLVM volumes, route to the host holding the exclusive lock
    if (volume.getHypervisorType() == Hypervisor.HypervisorType.KVM) {
        DataStore store = volume.getDataStore();
        if (store.getRole() == DataStoreRole.Primary) {
            StoragePoolVO pool = _storagePoolDao.findById(store.getId());
            if (pool != null && ClvmPoolManager.isClvmPoolType(pool.getPoolType())) {
                Long lockHostId = getClvmLockHostId(volume);
                if (lockHostId != null) {
                    logger.info("Routing CLVM volume {} deletion to lock holder host {}",
                            volume.getUuid(), lockHostId);
                    EndPoint ep = getEndPointFromHostId(lockHostId);
                    if (ep != null) {
                        return ep;
                    }
                    logger.warn("Could not get endpoint for CLVM lock host {}, falling back to default selection",
                            lockHostId);
                } else {
                    logger.debug("No CLVM lock host tracked for volume {}, using default endpoint selection",
                            volume.getUuid());
                }
            }
        }
    }
Similar logic is repeated many times throughout this file. We could extract it into a single method instead.
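
One possible shape for such a shared method is sketched below. It is not taken from the PR: `ClvmEndpointRouting`, `VolumeView`, and `lockHostEndpoint` are hypothetical names, and `VolumeView` is a stand-in for the checks the hunk performs on the real volume, data store, and pool objects. The helper returns an empty `Optional` in every fall-back case, so each call site can keep its own default endpoint selection.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical refactoring sketch; the real code works on CloudStack's own types.
class ClvmEndpointRouting {

    /** Stand-in for the facts the routing check reads from volume, store and pool. */
    static final class VolumeView {
        final boolean kvm;
        final boolean onPrimaryStore;
        final boolean clvmPool;
        final Long lockHostId; // null when no lock host is tracked

        VolumeView(boolean kvm, boolean onPrimaryStore, boolean clvmPool, Long lockHostId) {
            this.kvm = kvm;
            this.onPrimaryStore = onPrimaryStore;
            this.clvmPool = clvmPool;
            this.lockHostId = lockHostId;
        }
    }

    /**
     * Resolves the endpoint of the host holding the CLVM lock, if there is one.
     * endpointForHost plays the role of getEndPointFromHostId and may return null.
     */
    static <E> Optional<E> lockHostEndpoint(VolumeView volume, Function<Long, E> endpointForHost) {
        if (!volume.kvm || !volume.onPrimaryStore || !volume.clvmPool) {
            return Optional.empty(); // not a CLVM volume on KVM primary storage
        }
        if (volume.lockHostId == null) {
            return Optional.empty(); // no lock host tracked for this volume
        }
        // A null endpoint (for example an unreachable host) also yields empty.
        return Optional.ofNullable(endpointForHost.apply(volume.lockHostId));
    }
}
```

Each caller could then reduce to a single `lockHostEndpoint(...)` call followed by `orElseGet` with its existing default selection, keeping the operation-specific log message at the call site.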
@blueorangutan package
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18198
@blueorangutan test
@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests


Description
This PR enhances the existing CLVM implementation, which was built on the deprecated CLVM technology based on corosync/pacemaker. With RHEL 7 having reached EOL, CLVM appears to be broken. CLVM supports RAW volumes on LVM, whereas CLVM_NG supports QCOW2 on LVM.
Further details: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Modernized+CLVM%3A+Enhancements+and+CLVM_NG+support
NOTE: During testing, incremental snapshots for CLVM_NG were found not to work as expected, so they have been removed from the scope of this PR for now. CLVM and CLVM_NG therefore support only full snapshots.