
[Feature]: 2024 plan #3064

Open
1 task done
leonrayang opened this issue Feb 1, 2024 · 5 comments

leonrayang commented Feb 1, 2024

Contact Details

No response

Is there an existing issue for this?

  • I have searched all the existing issues

Is your feature request related to a problem? Please describe.

2024 plan

Describe the solution you'd like.

| Feature | Type | Version | Status | Branch | Release Date |
| --- | --- | --- | --- | --- | --- |
| Automatic migration | Stability | Release-3.4.0 | QA Testing | develop-v3.4.0 | JUNE |
| Snapshot | Feature | Release-3.4.0 | QA Testing | develop-v3.4.0 | JUNE |
| Hybrid Cloud automatic data hierarchy | Cost optimization | Release-3.5.0 | QA Testing | develop-hybridcloudlifecycle | JULY |
| Distributed Cache | Feature | Release-3.6.0 | Self-Testing/Unit Test | flash_cache | AUG |
| Metanode persist with RocksDB | Cost optimization | Release-3.6.0 | Self-Testing/Unit Test | metanode_rocksdb_dev | AUG |
| RDMA | Performance | Release-3.6.0 | Self-Testing/Unit Test | cubefs-rdma | AUG |
| Kernel FileSystem Client and GPU Direct Storage | Performance | Release-3.7.0 | Self-Testing/Unit Test | cubefs-kernel-rdma | OCT |
| Call Chain | Feature | Release-3.7.0 | Self-Testing/Unit Test | blobstore-tracelog | OCT |

Architecture refactoring (high priority)

  1. Storage engine refactoring: rebuild the storage engine as an append-only file system, giving data reads and writes lower latency and higher throughput (see the sketch after this list).
  2. Hybrid cloud: the hybrid cloud project supports a unified namespace, allows multiple storage systems to be used together, and exposes S3 and HDFS interfaces externally. Lifecycle policies drive data flow between different media, storage types, and on/off the cloud, reducing cost and increasing efficiency. The first release will be available soon.
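
As a rough illustration of the append-only direction, here is a minimal Go sketch of an append-only segment store. The `AppendStore` type and its methods are hypothetical, not part of CubeFS's actual storage engine: writes only ever land at the tail, and readers address records by offset.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// AppendStore is a hypothetical append-only segment: writes only ever go to
// the tail, and readers address records by the offset returned at write time.
type AppendStore struct {
	f    *os.File
	size int64
}

func Open(path string) (*AppendStore, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	st, err := f.Stat()
	if err != nil {
		return nil, err
	}
	return &AppendStore{f: f, size: st.Size()}, nil
}

// Append writes a length-prefixed record at the tail and returns its offset.
func (s *AppendStore) Append(data []byte) (int64, error) {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(data)))
	offset := s.size
	if _, err := s.f.Write(append(hdr[:], data...)); err != nil {
		return 0, err
	}
	s.size += int64(len(hdr) + len(data))
	return offset, nil
}

// ReadAt reads back the record that starts at offset.
func (s *AppendStore) ReadAt(offset int64) ([]byte, error) {
	var hdr [4]byte
	if _, err := s.f.ReadAt(hdr[:], offset); err != nil {
		return nil, err
	}
	buf := make([]byte, binary.BigEndian.Uint32(hdr[:]))
	_, err := s.f.ReadAt(buf, offset+4)
	return buf, err
}

func main() {
	s, _ := Open("segment.dat")
	off, _ := s.Append([]byte("hello"))
	rec, _ := s.ReadAt(off)
	fmt.Println(off, string(rec))
}
```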

Improved stability and reliability

  1. Disk CRC enhancement: strengthen CRC checking in scenarios such as master-slave synchronization and random writes (a minimal sketch follows this list).
  2. Automatic disk migration: reduce metadata atomicity problems during migration and raise the level of operational automation.
  3. Strengthened monitoring and alerting for system modules to improve observability.
  4. The data node gains learner replicas and supports multi-active deployment within the same city.
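
For item 1, a minimal sketch of per-block CRC verification between two replicas, built on Go's standard `hash/crc32`. The helper names are illustrative only and this is not the actual datanode implementation.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// blockCRC computes a CRC32 (Castagnoli) checksum for one data block.
func blockCRC(block []byte) uint32 {
	return crc32.Checksum(block, crc32.MakeTable(crc32.Castagnoli))
}

// verifyReplica compares leader and follower block checksums and returns the
// indexes of blocks that disagree and would need repair or re-sync, e.g.
// after random overwrites on one replica.
func verifyReplica(leader, follower [][]byte) (bad []int) {
	for i := range leader {
		if i >= len(follower) || blockCRC(leader[i]) != blockCRC(follower[i]) {
			bad = append(bad, i)
		}
	}
	return bad
}

func main() {
	leader := [][]byte{[]byte("block-0"), []byte("block-1")}
	follower := [][]byte{[]byte("block-0"), []byte("BLOCK-1")} // simulated corruption
	fmt.Println("blocks needing repair:", verifyReplica(leader, follower))
}
```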

Performance improvements

  1. Full-link acceleration to better support scenarios such as storage-compute separation for databases and AI training acceleration.
  2. Client: provide a kernel-space client with GDS (GPU Direct Storage) and RDMA support to reduce I/O latency and CPU overhead.
  3. Server: rebuild the communication layer on RDMA to reduce overall read/write latency and improve throughput.
  4. Distributed cache: further optimize the multi-level distributed cache architecture to support cross-data-center and cross-cloud read/write acceleration for AI training workloads.
  5. Optimize the read and write paths of the existing TCP-based transport.
  6. Optimize the performance of the client-side local (level-one) cache; a read-through sketch follows this list.
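
For the cache-related items, a minimal read-through sketch of a level-one (local, in-memory) cache sitting in front of a slower tier. The `Tier` interface and all type names here are hypothetical and do not reflect CubeFS's client or flash cache APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// Tier is a hypothetical read interface; the lower tier could stand in for a
// distributed cache layer or the backing volume.
type Tier interface {
	Get(key string) ([]byte, bool)
}

// LocalCache is a level-one, in-memory cache placed in front of a slower tier.
type LocalCache struct {
	mu    sync.Mutex
	data  map[string][]byte
	lower Tier
}

func NewLocalCache(lower Tier) *LocalCache {
	return &LocalCache{data: make(map[string][]byte), lower: lower}
}

// Get serves hits from memory and reads through to the lower tier on a miss,
// populating the local copy so repeated reads (e.g. training epochs) are fast.
func (c *LocalCache) Get(key string) ([]byte, bool) {
	c.mu.Lock()
	if v, ok := c.data[key]; ok {
		c.mu.Unlock()
		return v, true
	}
	c.mu.Unlock()
	v, ok := c.lower.Get(key)
	if ok {
		c.mu.Lock()
		c.data[key] = v
		c.mu.Unlock()
	}
	return v, ok
}

// remoteTier simulates a cross-data-center or cross-cloud tier for the example.
type remoteTier struct{ data map[string][]byte }

func (r remoteTier) Get(key string) ([]byte, bool) { v, ok := r.data[key]; return v, ok }

func main() {
	remote := remoteTier{data: map[string][]byte{"sample": []byte("training shard")}}
	cache := NewLocalCache(remote)
	v, _ := cache.Get("sample") // first read: local miss, fetched from the remote tier
	v, _ = cache.Get("sample")  // second read: served from the level-one cache
	fmt.Println(string(v))
}
```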

Features

  1. Metadata storage backed by RocksDB, with the full in-memory metadata cache replaced by on-demand caching to reduce memory overhead (see the sketch after this list).
  2. The erasure coding subsystem removes the Kafka dependency and provides an SDK for direct client access, shortening the data transmission path.
  3. Event notification, S3 API QoS, ObjectNode audit logging, cross-region replication, and QPS/bandwidth metering and billing capabilities.
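
For item 1, a minimal sketch of on-demand metadata caching: an LRU cache in front of a persistent key-value store. The `KVStore` interface is a stand-in for RocksDB (no real binding is shown), and every name here is illustrative rather than actual metanode code.

```go
package main

import (
	"container/list"
	"fmt"
)

// KVStore is a hypothetical persistent store interface; in the roadmap the
// real backend would be RocksDB, but any key-value engine fits the sketch.
type KVStore interface {
	Get(key string) (string, bool)
}

// memStore is an in-memory stand-in for the persistent engine.
type memStore map[string]string

func (m memStore) Get(key string) (string, bool) { v, ok := m[key]; return v, ok }

// MetaCache keeps only the hottest entries in memory (LRU) and falls back to
// the persistent store on a miss, instead of caching all metadata up front.
type MetaCache struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element holding [2]string{key, value}
	store    KVStore
}

func NewMetaCache(capacity int, store KVStore) *MetaCache {
	return &MetaCache{capacity: capacity, order: list.New(), items: make(map[string]*list.Element), store: store}
}

func (c *MetaCache) Get(key string) (string, bool) {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.([2]string)[1], true
	}
	v, ok := c.store.Get(key)
	if !ok {
		return "", false
	}
	c.items[key] = c.order.PushFront([2]string{key, v})
	if c.order.Len() > c.capacity { // evict the coldest entry
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.([2]string)[0])
	}
	return v, true
}

func main() {
	store := memStore{"/a/inode": "1001", "/b/inode": "1002", "/c/inode": "1003"}
	cache := NewMetaCache(2, store)
	fmt.Println(cache.Get("/a/inode"))
	fmt.Println(cache.Get("/b/inode"))
	fmt.Println(cache.Get("/c/inode")) // evicts the least recently used entry
	fmt.Println(cache.Get("/a/inode")) // re-read from the store after eviction
}
```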

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

@leonrayang added the enhancement (New feature or request) label Feb 1, 2024
@leonrayang self-assigned this Feb 1, 2024
@xiaochunhe (Contributor) commented:

It is recommended to update this to roadmap.md with a more concise description


bladehliu commented Feb 2, 2024

The top 2 things in 2024:

  • Move CubeFS forward to be more cloud-native, so it can manage multiple data sources: its own private cloud storage and/or S3-like public cloud storage
  • Run more analytical/search databases on top of CubeFS to enable separation of storage and compute

@sejust pinned this issue Feb 2, 2024

bladehliu commented Feb 3, 2024

Let's make CubeFS the best solution for separation of storage and computing.

  • WAL optimization as a high priority

@guohao-rosicky commented:

Hi @leonrayang, thanks for working on this. I'm interested in these two features; is there a design document for them?

Hybrid cloud: Hybrid cloud projects support a unified namespace, provide the ability to use multiple storage systems in a mixed manner, and provide external S3 and HDFS capabilities. Support life cycle driven data flow between different media, storage types, and on and off the cloud, reducing costs and increasing efficiency. The first issue will be released soon.

Distributed cache: further optimize the distributed multi-level cache architecture to support cross-computer room and cross-cloud read and write acceleration capabilities to support AI training acceleration needs.

@tengallonhead-lv commented:

Let's make CubeFS the best solution for separation of storage and computing.

  • WAL optimization as high priority

Hi @bladehliu, thanks for working on this. I'm interested in this work; is there a design document for it?

@leonrayang unpinned this issue Apr 28, 2024