
[Feature]: Datanode support to manage both SSD and HDD disks #3239

Open
OpenPie-DTXLab opened this issue Mar 12, 2024 · 4 comments


Contact Details

[email protected]

Is there an existing issue for this?

  • I have searched all the existing issues

Is your feature request related to a problem? Please describe.

In our use case, there are only a few machines available to deploy a cfs cluster, e.g. 3 or 5, and each machine is equipped with both SSD and HDD disks. For some cold data, we want to migrate it from SSD to HDD.
As far as I know, the Hybrid-cloud branch under development supports a data cool-down feature, but it requires deploying at least two zones, and the nodes in each zone are limited to a single disk type; when migrating, cold data is transferred from the SSD zone to the HDD zone.
The current solution is not friendly for small cfs clusters, e.g. fewer than 6 nodes. Also, for machines with both SSD and HDD disks, the migration process cannot leverage locality to improve performance.

Describe the solution you'd like.

As a solution, I think the datanode should be able to manage both SSD and HDD disks, which seems more reasonable. As a result, even a cluster with 3 nodes (each with both SSD and HDD, one zone) can use the data cool-down feature. Meanwhile, CubeFS can optimize cold-data migration performance: when migrating cold data, prefer HDD directories (the destination) on the same node as the source data, which reduces network traffic and greatly improves migration performance.
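For illustration only, one way to express this is a per-disk media type in the datanode config. The `:SSD` / `:HDD` suffix below is a hypothetical sketch, not the current CubeFS config format (today each `disks` entry is just `path:reservedSpace`):

```json
{
  "role": "datanode",
  "listen": "17310",
  "disks": [
    "/cfs/ssd1:21474836480:SSD",
    "/cfs/hdd1:107374182400:HDD"
  ]
}
```

With something like this, a single datanode (and therefore a single zone) could expose both media types to the master for data-partition placement.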

Describe an alternate solution.

As an alternative solution, we can deploy cfs in containers, so we can deploy two zones on a three-node cluster, with the nodes in each zone managing a single type of disk. Configuring secondary IPs on the nodes can also help.
But this cannot use locality to reduce the network traffic caused by data migration.

Anything else? (Additional Context)

No response

OpenPie-DTXLab added the enhancement (New feature or request) label Mar 12, 2024

OpenPie-DTXLab (Author) commented Mar 12, 2024

We have a rough idea:

  1. A datanode can be configured with different types of disks, so a zone can also have multiple media types.

  2. When creating a volume, if two storage classes are specified, prefer to create the paired SSD and HDD data partitions on the same datanode.

  3. When lcNode triggers a migration task, first try to transition cold data from an SSD srcDp to an HDD dstDp located on the same datanode as the SSD srcDp (see the sketch after this list).
    a. get the extent list of the migrating file
    b. group the extents by data partition (srcDp)
    c. for each srcDp, select a dstDp that is co-located with the srcDp, then build an extentsLocalTransition request that contains the local transition context and will be sent to the datanode
    d. send the request to the target datanode where the first dp replica is located (repl protocol)
    e. all datanodes holding dp replicas perform the local transition, reading the extents from srcDp and writing them to extents on dstDp; the first-replica datanode returns the inode's migrated extent list to lcNode
    f. lcNode batch-updates the inode metadata of the migrating file

    (attached diagram: local transition workflow)

  4. If the local transition fails, fall back to the original workflow and migrate the data across nodes.
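To make step 3 more concrete, here is a minimal Go sketch of the locality decision in steps (b) and (c): group the file's extents by source data partition and pick an HDD dstDp whose replicas live on the same datanodes, falling back to cross-node migration when no such dp exists. The type names and helpers are illustrative assumptions, not the actual code in the draft.

```go
package main

import "fmt"

// Extent is a simplified view of one extent of the migrating file.
type Extent struct {
	PartitionID uint64 // srcDp currently holding this extent on SSD
	ExtentID    uint64
}

// DataPartition is a simplified view of a data partition and its replicas.
type DataPartition struct {
	PartitionID uint64
	Hosts       []string // replica datanodes; Hosts[0] is the first replica
	MediaType   string   // "SSD" or "HDD"
}

// sameHosts reports whether two partitions keep replicas on the same datanodes,
// i.e. whether a transition between them can stay node-local.
func sameHosts(a, b *DataPartition) bool {
	if len(a.Hosts) != len(b.Hosts) {
		return false
	}
	set := map[string]bool{}
	for _, h := range a.Hosts {
		set[h] = true
	}
	for _, h := range b.Hosts {
		if !set[h] {
			return false
		}
	}
	return true
}

// pickCoLocatedHDD returns an HDD partition co-located with src, or nil if none
// exists (the caller then falls back to the original cross-node workflow).
func pickCoLocatedHDD(src *DataPartition, all []*DataPartition) *DataPartition {
	for _, dp := range all {
		if dp.MediaType == "HDD" && sameHosts(src, dp) {
			return dp
		}
	}
	return nil
}

func main() {
	// toy cluster view: one SSD dp and one HDD dp sharing the same three hosts
	ssd := &DataPartition{PartitionID: 1, Hosts: []string{"node1", "node2", "node3"}, MediaType: "SSD"}
	hdd := &DataPartition{PartitionID: 9, Hosts: []string{"node1", "node2", "node3"}, MediaType: "HDD"}
	dpView := map[uint64]*DataPartition{ssd.PartitionID: ssd, hdd.PartitionID: hdd}

	// extents of the migrating file, all currently on the SSD dp
	exts := []Extent{{PartitionID: 1, ExtentID: 100}, {PartitionID: 1, ExtentID: 101}}

	// step (b): group extents by source data partition
	bySrc := map[uint64][]Extent{}
	for _, e := range exts {
		bySrc[e.PartitionID] = append(bySrc[e.PartitionID], e)
	}

	// step (c): choose a co-located HDD dp per group
	for srcID, group := range bySrc {
		dst := pickCoLocatedHDD(dpView[srcID], []*DataPartition{ssd, hdd})
		if dst == nil {
			fmt.Printf("srcDp %d: no co-located HDD dp, use cross-node migration\n", srcID)
			continue
		}
		fmt.Printf("srcDp %d -> dstDp %d: %d extents in the local transition request\n", srcID, dst.PartitionID, len(group))
	}
}
```

The per-group result would then go into the extentsLocalTransition request of step (d).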

OpenPie-DTXLab (Author) commented:

We have roughly implemented a draft version to verify the idea: #3243

The local transition process described above is faster than the original workflow and even faster than the file upload path. In my environment (4 cores, 8 GB RAM, 1000 Mbit network), local migration of a 4 GB file takes a little over 10 seconds (faster than an S3 put of the same file, which takes about 30 seconds), while the original cross-node migration takes about 90 seconds.
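As a rough sanity check on these numbers (my own back-of-the-envelope, not measured in the draft): a 1000 Mbit/s link moves at most about 125 MB/s, so pushing 4 GB over the network takes at least roughly 33 seconds per remote copy, which lines up with the ~90 second cross-node figure once replication and protocol overhead are included; the local SSD-to-HDD path skips the network entirely, so ~10 seconds (about 400 MB/s disk-to-disk) is plausible.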

true1064 (Contributor) commented:

This solution requires significant changes, and we can only consider whether to incorporate this approach after the completion of the first phase of the HybridCloud project.
