Cloning a data volume using smart-cloning

Smart-cloning is a built-in feature of Red Hat OpenShift Data Foundation. Smart-cloning is faster and more efficient than host-assisted cloning.

You do not need to perform any action to enable smart-cloning, but you need to ensure your storage environment is compatible with smart-cloning to use this feature.

When you create a data volume with a persistent volume claim (PVC) source, you automatically initiate the cloning process. You always receive a clone of the data volume, whether or not your environment supports smart-cloning. However, you receive the performance benefits of smart-cloning only if your storage provider supports it.

About data volumes

DataVolume objects are custom resources that are provided by the Containerized Data Importer (CDI) project. Data volumes orchestrate import, clone, and upload operations that are associated with an underlying persistent volume claim (PVC). You can create a data volume as either a standalone resource or by using the dataVolumeTemplate field in the virtual machine (VM) specification.

VM disk PVCs that are prepared by using standalone data volumes maintain an independent lifecycle from the VM. If you use the dataVolumeTemplate field in the VM specification to prepare the PVC, the PVC shares the same lifecycle as the VM.
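To illustrate the second case, the following is a minimal sketch of a VM manifest that embeds a data volume definition, so that the resulting PVC is created and deleted together with the VM. It assumes the KubeVirt `kubevirt.io/v1` VirtualMachine API, in which the field is spelled `dataVolumeTemplates`. The names `example-vm` and `example-vm-disk`, the memory request, and the disk size are hypothetical values chosen for illustration.

```yaml
# Sketch only: names and sizes are hypothetical.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: example-vm
spec:
  running: false
  dataVolumeTemplates:          # data volume defined inside the VM specification
  - metadata:
      name: example-vm-disk     # the PVC that is created takes this name
    spec:
      source:
        pvc:
          namespace: "<source-namespace>"
          name: "<my-favorite-vm-disk>"
      storage:
        resources:
          requests:
            storage: 2Gi
  template:
    spec:
      domain:
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: virtio
        resources:
          requests:
            memory: 1Gi
      volumes:
      - name: rootdisk
        dataVolume:
          name: example-vm-disk # must match the template name above
```

A standalone data volume, by contrast, is created as its own object and is only referenced by name from the `volumes` section of the VM.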

After a PVC is populated, the data volume that you used to create the PVC is no longer needed. OKD Virtualization enables automatic garbage collection of completed data volumes by default. Standalone data volumes, and data volumes created by using the dataVolumeTemplate field, are automatically garbage collected after completion.

About smart-cloning

When a data volume is smart-cloned, the following occurs:

A snapshot of the source persistent volume claim (PVC) is created.
A PVC is created from the snapshot.
The snapshot is deleted.
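The sequence above is carried out for you by CDI. As a rough mental model only, it resembles creating the two standard Kubernetes objects sketched below: a VolumeSnapshot of the source PVC, followed by a PVC restored from that snapshot. The object names, the snapshot class name, and the access mode are hypothetical, and the objects that CDI actually creates may differ in naming and detail.

```yaml
# Conceptual sketch; CDI creates and removes equivalent objects automatically.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: example-clone-snapshot            # hypothetical name
  namespace: <source-namespace>
spec:
  volumeSnapshotClassName: example-snapclass   # hypothetical snapshot class
  source:
    persistentVolumeClaimName: <my-favorite-vm-disk>
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-cloned-disk               # hypothetical name
  namespace: <source-namespace>           # a PVC is restored in the snapshot's namespace
spec:
  dataSource:                             # restore the volume contents from the snapshot
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: example-clone-snapshot
  accessModes:
  - ReadWriteOnce                         # assumed access mode
  resources:
    requests:
      storage: 2Gi
```

Once the restored PVC is bound, the intermediate snapshot serves no further purpose, which is why it is deleted in the final step.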

Cloning a data volume

Prerequisites

For smart-cloning to occur, the following conditions are required:

Your storage provider must support snapshots.
The source and target PVCs must use the same storage class.
The source and target PVCs must share the same volumeMode.
The VolumeSnapshotClass object must reference the storage class defined to both the source and target PVCs.
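As a hedged illustration of the last condition: in the Kubernetes snapshot API, a VolumeSnapshotClass has no field that names a storage class directly, so the pairing is usually expressed through the CSI driver, with the `driver` of the snapshot class matching the `provisioner` of the storage class. The sketch below assumes that this is how the match is made in your environment. The class names and the driver name `example.csi.driver.io` are hypothetical.

```yaml
# Sketch only: replace the hypothetical driver with your CSI driver name.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-storage-class      # used by both the source and target PVCs
provisioner: example.csi.driver.io
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: example-snapclass
driver: example.csi.driver.io      # same driver as the storage class provisioner
deletionPolicy: Delete
```

If no snapshot class corresponds to the storage class of the PVCs, the clone still completes, but without the performance benefits of smart-cloning.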

Procedure

To initiate cloning of a data volume:

Create a YAML file for a DataVolume object that specifies the name of the new data volume and the name and namespace of the source PVC. Because this example uses the storage API, you do not need to specify accessModes or volumeMode. The optimal values are calculated automatically.
```
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: <cloner-datavolume> # (1)
spec:
  source:
    pvc:
      namespace: "<source-namespace>" # (2)
      name: "<my-favorite-vm-disk>" # (3)
  storage: # (4)
    resources:
      requests:
        storage: <2Gi> # (5)
```
(1) The name of the new data volume.
(2) The namespace where the source PVC exists.
(3) The name of the source PVC.
(4) Specifies allocation with the storage API.
(5) The size of the new data volume.
Start cloning the PVC by creating the data volume:
```
$ oc create -f <cloner-datavolume>.yaml
```
Data volumes prevent a virtual machine from starting before the PVC is prepared, so you can create a virtual machine that references the new data volume while the PVC clones.
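For example, a VM that uses the clone as its disk could look like the following minimal sketch, which can be created while the clone is still in progress. It assumes the KubeVirt `kubevirt.io/v1` VirtualMachine API. The VM name `example-cloned-vm`, the disk name, and the memory request are hypothetical; `<cloner-datavolume>` is the data volume created in the procedure above.

```yaml
# Sketch only: VM name, disk name, and memory size are hypothetical.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: example-cloned-vm
spec:
  running: true                     # the VM does not start until the PVC is prepared
  template:
    spec:
      domain:
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: virtio
        resources:
          requests:
            memory: 1Gi
      volumes:
      - name: rootdisk
        dataVolume:
          name: <cloner-datavolume> # standalone data volume from the procedure
```

Because the data volume here is standalone, the cloned PVC keeps its own lifecycle and is not removed when the VM is deleted.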

Additional resources

Cloning the persistent volume claim of a virtual machine disk into a new data volume
Configuring preallocation mode to improve write performance for data volume operations
Customizing the storage profile
