Troubleshooting OpenEBS - cStor

General guidelines for troubleshooting

One of the cStorVolumeReplicas (CVR) has its status set to Invalid after the corresponding pool pod is recreated

cStor volume goes into read-only state

One of the cStorVolumeReplicas (CVR) has its status set to Invalid after the corresponding pool pod is recreated

When a user deletes a cStor pool pod, there is a high chance that the CVRs belonging to that pool go into the Invalid state. The following is sample output of kubectl get cvr -n openebs:

NAME                                                         USED   ALLOCATED   STATUS    AGE
pvc-738f76c0-b553-11e9-858e-54e1ad4a9dd4-cstor-sparse-p8yp   6K     6K          Invalid   6m

Troubleshooting

Sample logs of cstor-pool-mgmt when the issue occurs:

rm /usr/local/bin/zrepl
exec /usr/local/bin/cstor-pool-mgmt start
I0802 18:35:13.814623 6 common.go:205] CStorPool CRD found
I0802 18:35:13.822382 6 common.go:223] CStorVolumeReplica CRD found
I0802 18:35:13.824957 6 new_pool_controller.go:103] Setting up event handlers
I0802 18:35:13.827058 6 new_pool_controller.go:105] Setting up event handlers for CSP
I0802 18:35:13.829547 6 new_replica_controller.go:118] will set up informer event handlers for cvr
I0802 18:35:13.830341 6 new_backup_controller.go:104] Setting up event handlers for backup
I0802 18:35:13.837775 6 new_restore_controller.go:103] Setting up event handlers for restore
I0802 18:35:13.845333 6 run_pool_controller.go:38] Starting CStorPool controller
I0802 18:35:13.845388 6 run_pool_controller.go:41] Waiting for informer caches to sync
I0802 18:35:13.847407 6 run_pool_controller.go:38] Starting CStorPool controller
I0802 18:35:13.847458 6 run_pool_controller.go:41] Waiting for informer caches to sync
I0802 18:35:13.856572 6 new_pool_controller.go:124] cStorPool Added event : cstor-sparse-p8yp, 48d3b2ba-b553-11e9-858e-54e1ad4a9dd4
I0802 18:35:13.857226 6 event.go:221] Event(v1.ObjectReference{Kind:"CStorPool", Namespace:"", Name:"cstor-sparse-p8yp", UID:"48d3b2ba-b553-11e9-858e-54e1ad4a9dd4", APIVersion:"openebs.io/v1alpha1", ResourceVersion:"1998", FieldPath:""}): type: 'Normal' reason: 'Synced' Received Resource create event
I0802 18:35:13.867953 6 common.go:262] CStorPool found
I0802 18:35:13.868007 6 run_restore_controller.go:38] Starting CStorRestore controller
I0802 18:35:13.868019 6 run_restore_controller.go:41] Waiting for informer caches to sync
I0802 18:35:13.868022 6 run_replica_controller.go:39] Starting CStorVolumeReplica controller
I0802 18:35:13.868061 6 run_replica_controller.go:42] Waiting for informer caches to sync
I0802 18:35:13.868098 6 run_backup_controller.go:38] Starting CStorBackup controller
I0802 18:35:13.868117 6 run_backup_controller.go:41] Waiting for informer caches to sync
I0802 18:35:13.946730 6 run_pool_controller.go:45] Starting CStorPool workers
I0802 18:35:13.946931 6 run_pool_controller.go:51] Started CStorPool workers
I0802 18:35:13.968344 6 run_replica_controller.go:47] Starting CStorVolumeReplica workers
I0802 18:35:13.968441 6 run_replica_controller.go:54] Started CStorVolumeReplica workers
I0802 18:35:13.968490 6 run_restore_controller.go:46] Starting CStorRestore workers
I0802 18:35:13.968538 6 run_restore_controller.go:53] Started CStorRestore workers
I0802 18:35:13.968602 6 run_backup_controller.go:46] Starting CStorBackup workers
I0802 18:35:13.968689 6 run_backup_controller.go:53] Started CStorBackup workers
I0802 18:35:43.869876 6 handler.go:456] cStorPool pending: 48d3b2ba-b553-11e9-858e-54e1ad4a9dd4
I0802 18:35:43.869961 6 new_pool_controller.go:160] cStorPool Modify event : cstor-sparse-p8yp, 48d3b2ba-b553-11e9-858e-54e1ad4a9dd4
I0802 18:35:43.870552 6 event.go:221] Event(v1.ObjectReference{Kind:"CStorPool", Namespace:"", Name:"cstor-sparse-p8yp", UID:"48d3b2ba-b553-11e9-858e-54e1ad4a9dd4", APIVersion:"openebs.io/v1alpha1", ResourceVersion:"2070", FieldPath:""}): type: 'Normal' reason: 'Synced' Received Resource modify event
I0802 18:35:44.905633 6 pool.go:93] Import command successful with true dontimport: false importattr: [import -c /tmp/pool1.cache -o cachefile=/tmp/pool1.cache cstor-48d3b2ba-b553-11e9-858e-54e1ad4a9dd4] out:
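These logs can be collected from the cstor-pool-mgmt container of the affected pool pod; replace the placeholder below with the actual pool pod name in your cluster:

  kubectl logs <cstor_pool_pod_name> -n openebs -c cstor-pool-mgmt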

From the above logs we can confirm that cstor-pool-mgmt in the new pod is communicating with cstor-pool in the old pod: the CStorPool found line shows that the pool was detected, and the subsequent Import command successful line shows that the pool was actually imported.

Possible Reason:

When a cStor pool pod is deleted, there is a high chance that two cStor pool pods of the same pool are present at the same time: the old pool pod is in Terminating state (i.e. not all of its containers have terminated yet), while the new pool pod is already in Running state (possibly with only some of its containers running). In this scenario the cstor-pool-mgmt container in the new pool pod communicates with the cstor-pool container in the old pool pod. This can cause the CVR resource to be set to Invalid.
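This overlap can be verified by listing the pool pods while the old pod is still terminating. A minimal check is sketched below; the app=cstor-pool label selector is an assumption about how the cStor pool pods are labelled and may differ in your installation:

  kubectl get pods -n openebs -l app=cstor-pool -o wide

If both an old pod in Terminating state and a new pod in Running state show up for the same pool, this is the scenario described above.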

Note: This issue has been observed in all OpenEBS versions up to 1.2.

Resolution:

Edit the Phase of the cStorVolumeReplica (CVR) from Invalid to Offline. After a few seconds the CVR will move to the Healthy or Degraded state, depending on the rebuilding progress.
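For example, the phase can be changed with kubectl edit, or with a merge patch as sketched below. The CVR name is taken from the sample output above; the status.phase field path is an assumption and should be verified against the actual CVR object before patching:

  kubectl edit cvr pvc-738f76c0-b553-11e9-858e-54e1ad4a9dd4-cstor-sparse-p8yp -n openebs

  # assuming the phase is stored under status.phase
  kubectl patch cvr pvc-738f76c0-b553-11e9-858e-54e1ad4a9dd4-cstor-sparse-p8yp -n openebs \
    --type merge -p '{"status":{"phase":"Offline"}}'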

cStor volume goes into read-only state

The application mount point running on a cStor volume went into read-only state.

Possible Reason:

If the cStorVolume is Offline, or the corresponding target pod is unavailable for more than 120 seconds (the iSCSI timeout), then the PV will be remounted as a read-only filesystem. For understanding the different states of a cStor volume, more details can be found here.
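To check whether the target pod of the volume is unavailable, it can be listed directly. The label selector below is an assumption about the labels applied to cStor target pods and may need adjusting for your installation:

  kubectl get pods -n <openebs_installed_namespace> -l openebs.io/target=cstor-target,openebs.io/persistent-volume=<PV_NAME>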

Troubleshooting

Check the status of the corresponding cStor volume using the following command:

  kubectl get cstorvolume -n <openebs_installed_namespace> -l openebs.io/persistent-volume=<PV_NAME>

If the cStor volume is in Healthy or Degraded state, then restarting the application pod alone will bring the cStor volume back to RW mode. If the cStor volume is in Offline state, reach out to the OpenEBS Community for assistance.
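A restart of the application pod can be triggered, for example, by deleting it so that its controller recreates it, or by restarting its Deployment; the resource names below are placeholders:

  kubectl delete pod <application_pod_name> -n <application_namespace>

  kubectl rollout restart deployment <application_deployment_name> -n <application_namespace>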

See Also:

FAQs

Seek support or help

Latest release notes