Change Summary v1.2
- CSI Driver for cStor Volumes (currently in Alpha) has added support for volume resizing and expansion. For further details, please refer to the documentation.
  Note: The CSI Driver works with the new cStor schema and also requires some alpha feature gates to be enabled on the Kubernetes cluster.
- A new version of the cStor schema has been introduced to address user feedback on ease of use for cStor provisioning, and to pave the way in the schema for performing Day 2 operations using GitOps. The public design proposal is available here. Note that existing StoragePoolClaim pools will continue to function as-is; users need not migrate to the new schema yet.
  Note: We recommend users try out the new schema on greenfield clusters and provide feedback.
  The new cStor pool provisioning introduces the following:
  - Support for new CRDs called `CStorPoolCluster` (CSPC) and `CStorPoolInstance` (CSPI), which are enhanced versions of the current SPC and CSP respectively. CSPCs are defined within a namespace.
  - Support for cStor pool expansion and deletion via CSPC. An admin/user makes changes to (or specifies intent in) the CSPC YAML to carry out the operations, and CSPC create and update requests are validated by an admission webhook server.
  - To support this new schema, the cstor-operator functionality has been separated from the maya-apiserver and runs in its own deployment.
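  A minimal CSPC manifest might look like the sketch below. The apiVersion, field names, node label value, and blockdevice name are assumptions based on the schema described above; verify them against the design proposal before use.

  ```yaml
  # Hypothetical CSPC sketch; field names may differ in your release.
  apiVersion: openebs.io/v1alpha1
  kind: CStorPoolCluster
  metadata:
    name: cstor-pool-cluster
    namespace: openebs                        # CSPC is a namespaced resource
  spec:
    pools:
      - nodeSelector:
          kubernetes.io/hostname: worker-node-1   # placeholder node
        raidGroups:
          - type: stripe
            blockDevices:
              - blockDeviceName: blockdevice-xyz  # placeholder BD
        poolConfig:
          defaultRaidGroupType: stripe
  ```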
- Enhanced Jiva internal snapshot deletion: when the number of internal snapshots exceeds 10, they are deleted automatically, one by one. Note: Since snapshot deletion involves merging data from a snapshot's child into itself, this may impact ongoing IOs. https://github.com/openebs/jiva/pull/236
- Enhanced velero-plugin to support backup/restore for OpenEBS installed in a namespace other than `openebs`. You can also optionally provide the OpenEBS installation namespace in the `VolumeSnapshotLocation` spec with the parameter `spec.config.namespace`. https://github.com/openebs/velero-plugin/pull/36
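  A `VolumeSnapshotLocation` using this parameter might look like the sketch below; the resource name, provider string, and bucket are placeholders for illustration, not values confirmed by these notes.

  ```yaml
  # Hypothetical example; only spec.config.namespace is described above.
  apiVersion: velero.io/v1
  kind: VolumeSnapshotLocation
  metadata:
    name: cstor-snapshot-location
    namespace: velero
  spec:
    provider: openebs.io/cstor-blockstore   # assumed plugin provider name
    config:
      bucket: velero-backups                # placeholder
      namespace: custom-openebs-ns          # namespace where OpenEBS is installed
  ```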
- Enhanced NDM to include NodeAttributes such as hostname and node name (the name in the Kubernetes node resource) in BD and BDC. https://github.com/openebs/maya/pull/1408, https://github.com/openebs/node-disk-manager/pull/298, https://github.com/openebs/node-disk-manager/pull/296
- Enhanced BlockDevice CRD by adding the node name to the printer columns. This shows the name of the node to which the BD is attached when running `kubectl get bd -n <openebs_installed_namespace>`. https://github.com/openebs/node-disk-manager/pull/300
- Added support for customizing the default hostpath used in the Jiva and Local PV default storage classes. This allows users to customize the default path by passing ENV variables to the API Server. https://github.com/openebs/maya/pull/1433
  The following ENV variables need to be added to the maya-apiserver deployment spec: `OPENEBS_IO_JIVA_POOL_DIR` and `OPENEBS_IO_LOCALPV_HOSTPATH_DIR`.
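  For example, the two variables could be wired into the maya-apiserver Deployment like this (the paths are placeholders; only the variable names come from these notes):

  ```yaml
  # Excerpt of a maya-apiserver Deployment spec with the new variables.
  spec:
    template:
      spec:
        containers:
          - name: maya-apiserver
            env:
              - name: OPENEBS_IO_JIVA_POOL_DIR
                value: "/mnt/jiva"            # custom Jiva hostpath (placeholder)
              - name: OPENEBS_IO_LOCALPV_HOSTPATH_DIR
                value: "/mnt/local-hostpath"  # custom Local PV hostpath (placeholder)
  ```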
- Enhanced the maya-apiserver installer to skip re-applying the default SPC and SC resources if they were already installed by an older version of maya or prior to a maya-apiserver restart. https://github.com/openebs/maya/pull/1412
- Fixes a bug where cStor Storage Pool management created a new pool when a pool import failed because a disk was momentarily inaccessible at the time of import; the cStor storage pool stays in a pending state when this scenario occurs. With this fix, the pool is created only when `Status.Phase` is `Init` or `PoolCreationFailed`; if `Status.Phase` is any other string, cStor Storage Pool management will try to import the pool. This can impact the current workflow for ephemeral disks, which works as of now because NDM cannot detect the replacement as a different disk and recognizes it as the previous disk. https://github.com/openebs/maya/pull/1401
- Fixes a bug in the snapshot controller where snapshot operations did not throw any error for an invalid `cas-type`. This fix adds `cas-type` validation before triggering the snapshot operations; the valid `cas-type` values are cStor and Jiva. https://github.com/openebs/external-storage/pull/100
- Fixes a bug where more BlockDeviceClaims than required were created for a requested SPC in the auto pool method. https://github.com/openebs/maya/pull/1388
- Fixed an issue where Local PV failed to provision in clusters where `nodename` and `hostname` are different. https://github.com/openebs/maya/pull/1414, https://github.com/openebs/maya/pull/1417
- Fixes an issue where Jiva replicas failed to schedule on nodes whose `hostname` is not the same as their `nodename`. The fix sets the nodeSelector in the deployment and clean-up job after converting the nodename into the hostname. https://github.com/openebs/maya/pull/1392, https://github.com/openebs/maya/pull/1389
- Fixes a bug in NDM where all devices on a node were excluded when the `os-disk-exclude-filter` failed to find the device where the OS is installed. https://github.com/openebs/node-disk-manager/pull/304
- Fixes a bug where a cleanup job was not canceled when the state of a BlockDevice changed from `Active` to another state during the cleanup operation. The fix ensures that cleanup jobs are started only on Active BDs, that a cleanup job is canceled when the BlockDevice state changes during the operation, and that the BlockDevice state remains unchanged. https://github.com/openebs/node-disk-manager/pull/295
- Fixes a bug in NDM where cleanup jobs remained in a pending state on OpenShift clusters. The fix adds a service account to cleanup jobs so that the clean-up job pods acquire the privileged access needed to perform the action. https://github.com/openebs/node-disk-manager/pull/293
From 1.0.0: None.
For releases prior to 1.0, please refer to the respective release notes and upgrade steps.
From 1.2, if the status of a CSP is either `Init` or `PoolCreationFailed`, the `cstor-pool-mgmt` container in the pool pod will attempt to create the pool. So, when there is a need to re-create the pool for the same CSP due to ephemeral storage, the CSP CR related to this pool needs to be edited to set `status.phase` to `Init`. As part of reconciliation, the `cstor-pool-mgmt` container of the pool pod will then attempt to recreate the pool. Refer to https://github.com/openebs/maya/pull/1401 for more details.
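One way to make that edit, assuming `kubectl patch` can modify the status field of this CRD in your cluster (otherwise use `kubectl edit csp <csp-name>`):

```shell
# Hypothetical example; <csp-name> is a placeholder for your CSP.
kubectl patch csp <csp-name> --type merge -p '{"status":{"phase":"Init"}}'
```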
Note: As part of OpenEBS upgrade or installation, the `maya-apiserver` pod will restart if the NDM blockdevice CRDs were not created before the creation of maya-apiserver. https://github.com/openebs/maya/pull/1381
The recommended steps to uninstall are:
- delete all the OpenEBS PVCs that were created
- delete all the SPCs (in case of cStor)
- ensure that no volume or pool pods are pending in terminating state: `kubectl get pods -n <openebs namespace>`
- ensure that no OpenEBS cStor volume custom resources are present: `kubectl get cvr -n <openebs namespace>`
- delete all OpenEBS-related StorageClasses
- delete OpenEBS itself, either via `helm purge` or `kubectl delete`
Uninstalling OpenEBS doesn't automatically delete the CRDs that were created. If you would like to remove CRDs and the associated objects completely, run the following commands:
kubectl delete crd castemplates.openebs.io
kubectl delete crd cstorpools.openebs.io
kubectl delete crd cstorvolumereplicas.openebs.io
kubectl delete crd cstorvolumes.openebs.io
kubectl delete crd runtasks.openebs.io
kubectl delete crd upgradetasks.openebs.io
kubectl delete crd storagepoolclaims.openebs.io
kubectl delete crd storagepools.openebs.io
kubectl delete crd volumesnapshotdatas.volumesnapshot.external-storage.k8s.io
kubectl delete crd volumesnapshots.volumesnapshot.external-storage.k8s.io
kubectl delete crd disks.openebs.io
kubectl delete crd blockdevices.openebs.io
kubectl delete crd blockdeviceclaims.openebs.io
kubectl delete crd cstorbackups.openebs.io
kubectl delete crd cstorrestores.openebs.io
kubectl delete crd cstorcompletedbackups.openebs.io
kubectl delete crd cstorvolumeclaims.openebs.io
kubectl delete crd cstorpoolclusters.openebs.io
kubectl delete crd cstorpoolinstances.openebs.io
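The individual deletions above can also be scripted. A sketch, wrapped in a function so an alternate command (e.g. `echo`) can be passed for a dry run; the function name is illustrative:

```shell
# delete_openebs_crds [cmd] - runs "<cmd> delete crd <name>" for each
# OpenEBS CRD listed above; cmd defaults to kubectl.
delete_openebs_crds() {
  cmd="${1:-kubectl}"
  for crd in \
      castemplates.openebs.io cstorpools.openebs.io \
      cstorvolumereplicas.openebs.io cstorvolumes.openebs.io \
      runtasks.openebs.io upgradetasks.openebs.io \
      storagepoolclaims.openebs.io storagepools.openebs.io \
      volumesnapshotdatas.volumesnapshot.external-storage.k8s.io \
      volumesnapshots.volumesnapshot.external-storage.k8s.io \
      disks.openebs.io blockdevices.openebs.io \
      blockdeviceclaims.openebs.io cstorbackups.openebs.io \
      cstorrestores.openebs.io cstorcompletedbackups.openebs.io \
      cstorvolumeclaims.openebs.io cstorpoolclusters.openebs.io \
      cstorpoolinstances.openebs.io; do
    $cmd delete crd "$crd"
  done
}
```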
Note: As part of deleting Jiva volumes, OpenEBS launches scrub jobs to clear data from the nodes. The completed jobs need to be cleared using the following command:
kubectl delete jobs -l openebs.io/cas-type=jiva -n <namespace>
- The current version of OpenEBS volumes is not optimized for performance-sensitive applications.
- A cStor volume will go offline when the number of healthy replicas is not greater than 50% of the `ReplicationFactor`; the volume will not come back online automatically, nor reconstruct data onto recovered replicas on its own. Manual steps are required to bring the volume online; once it is online, data is reconstructed onto the replaced replicas from a healthy replica.
- If a pending PVC related to the `openebs-device` StorageClass is deleted, there is a chance of stale BDCs being left behind, which end up consuming BDs. You have to manually delete the BDC to reclaim the BD.
- In OpenShift 3.10 or above, NDM daemon set pods and NDM operators will not be upgraded if the NDM daemon set's DESIRED count is not equal to its CURRENT count. This may happen if nodeSelectors have been used to deploy OpenEBS-related pods, or if master/other nodes have been tainted in the k8s cluster.
- Jiva controller and replica pods get stuck in the `Terminating` state when there is instability with the node or network; the only way to remove those containers is by using `docker rm -f` on the node. https://github.com/openebs/openebs/issues/2675
- cStor target or pool pods can at times be stuck in a `Terminating` state. They will need to be manually cleaned up using kubectl delete with a 0-second grace period. Example: `kubectl delete deploy -n openebs --force --grace-period=0`
- cStor pool pods can consume more memory when there is continuous load. This can cross the memory limit and cause pod evictions. It is recommended that you create cStor pools by setting the Memory limits and requests.
- Jiva Volumes are not recommended if your use case requires snapshots and clone capabilities.
- Jiva replicas use a sparse file to store the data. When the application causes too many fragments (extents) to be created in the sparse file, a replica restart can cause the replica to take a long time to get attached to the target. This issue was seen when 31K fragments had been created.
- Volume Snapshots are dependent on the functionality provided by Kubernetes. The support is currently alpha. The only operations supported are:
- Create Snapshot, Delete Snapshot and Clone from a Snapshot. Creation of the Snapshot uses a reconciliation loop, which would mean that a Create Snapshot operation will be retried on failure until the Snapshot has been successfully created. This may not be a desirable option in cases where Point in Time snapshots are expected.
- If you are using K8s version earlier than 1.12, in certain cases, it will be observed that when the node hosting the target pod is offline, the target pod can take more than 120 seconds to get rescheduled. This is because target pods are configured with Tolerations based on the Node Condition, and TaintNodesByCondition is available only from K8s 1.12. If running an earlier version, you may have to enable the alpha gate for TaintNodesByCondition. If there is an active load on the volume when the target pod goes offline, the volume will be marked as read-only.
- If you are using K8s version 1.13 or later, that includes the checks on ephemeral storage limits on the Pods, there is a chance that OpenEBS cStor and Jiva pods can get evicted - because there are no ephemeral requests specified. To avoid this issue, you can specify the ephemeral storage requests in the storage class or storage pool claim. (https://github.com/openebs/openebs/issues/2294)
- When the disks used by a cStor Pool are detached and reattached, the cStor Pool may miss detecting this event in certain scenarios. Manual intervention may be required to bring the cStor Pool online. (https://github.com/openebs/openebs/issues/2363)
- When the underlying disks used by cStor or Jiva volumes are under disk pressure due to heavy IO load, and if the Replicas take longer than 60 seconds to process the IO, the Volumes will get into Read-Only state. In 0.8.1, logs have been added to the cStor and Jiva replicas to indicate if IO has longer latency. (https://github.com/openebs/openebs/issues/2337)