CuckooFS

CuckooFS is a high-performance distributed file system (DFS) designed for AI workloads. It addresses the challenge of handling a huge number of small files through a high-performance distributed metadata engine. CuckooFS aims to provide extremely high I/O performance by leveraging near-compute DRAM and SSDs, and to offer elasticity and cost efficiency by integrating remote cloud object stores, making it well suited to modern AI applications. CuckooFS has been deployed in Huawei AI clusters with nearly 10,000 NPUs to accelerate data access during training data production and model training for the Huawei Qiankun advanced driving solution (ADS).

Documents

Performance

Test Environment Configuration:

CPU: 2 x Intel Xeon Gold 5317 3.00 GHz, 12 cores
Memory: 16 x DDR4 2933 MHz 16GB
Storage: 2 x Samsung PM9A3 NVMe SSD 960 GB
Network: 2 x ConnectX-5 Single-Port 100GbE
OS: Ubuntu 20.04 Server 64-bit

Note: This experiment uses an optimized Linux FUSE module. The relevant code will be open-sourced later.

We conduct the experiments in a cluster of 13 dual-socket machines, whose configuration is shown above. To better simulate large-scale deployment in data centers, we use the following setup:

First, to expand the test scale, we abstract each machine into two nodes, with each node bound to one socket, one SSD, and one NIC, scaling the testbed up to 26 nodes.
Second, to simulate the resource ratio in real deployments, we limit server resources to 4 cores per node, so that we can:
- generate sufficient load to stress the servers with only a few client nodes.
- correctly simulate the 4:1 ratio between CPU cores and NVMe SSDs in typical real deployments.

In the experiments below, we run 4 metadata nodes and 12 data nodes for each DFS instance and saturate them with 10 client nodes. No DFS has metadata or data replication enabled.
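The node counts above can be checked with a few lines of shell arithmetic. This is only a sketch of the bookkeeping; the variable names are illustrative and are not taken from the CuckooFS scripts.

```shell
#!/bin/sh
# Testbed arithmetic from the setup described above.
# All variable names are illustrative, not part of the repository.
machines=13
nodes_per_machine=2          # one logical node per CPU socket
total_nodes=$((machines * nodes_per_machine))

metadata_nodes=4
data_nodes=12
client_nodes=10
assigned_nodes=$((metadata_nodes + data_nodes + client_nodes))

cores_per_node=4             # server cores after the per-node limit
ssds_per_node=1
server_nodes=$((metadata_nodes + data_nodes))
server_cores=$((server_nodes * cores_per_node))

echo "logical nodes: $total_nodes (assigned: $assigned_nodes)"
echo "server cores: $server_cores, cores per SSD: $((cores_per_node / ssds_per_node))"
```

The 4 metadata, 12 data, and 10 client nodes together account for all 26 logical nodes, and each server node keeps the 4:1 core-to-SSD ratio.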

Compared Systems:

CephFS 12.2.13.
JuiceFS 1.2.1, with TiKV 1.16.1 as the metadata engine and data store.
Lustre 2.15.6.

Throughput of file data I/O.
We evaluate the performance of accessing small files of different sizes. In the following figures, the Y-axis is the throughput normalized to that of CuckooFS. Thanks to its higher metadata performance, CuckooFS outperforms the other DFSs in small-file access. For files no larger than 64 KB, CuckooFS achieves 7.35--21.23x the throughput of CephFS, 0.86--24.87x that of JuiceFS, and 1.12--1.85x that of Lustre. For files larger than 256 KiB, the performance of CuckooFS is bounded by the aggregate SSD bandwidth.
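To relate the reported ratios to the normalized Y-axis: if CuckooFS delivers s times the throughput of another DFS, that DFS sits at 1/s on an axis where CuckooFS is 1.0. A minimal sketch of the conversion follows; the `normalized` helper is hypothetical and uses only the ratios quoted above.

```shell
#!/bin/sh
# Hypothetical helper: convert "CuckooFS is s times faster" into the other
# system's throughput normalized to CuckooFS (CuckooFS = 1.000).
normalized() {
    awk -v s="$1" 'BEGIN { printf "%.3f", 1 / s }'
}

echo "CephFS range:  $(normalized 21.23) to $(normalized 7.35)"
echo "JuiceFS range: $(normalized 24.87) to $(normalized 0.86)"
echo "Lustre range:  $(normalized 1.85) to $(normalized 1.12)"
```

A ratio below 1, such as the 0.86x lower bound for JuiceFS, maps to a normalized value above 1.0, meaning JuiceFS is ahead of CuckooFS at that data point.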

MLPerf ResNet-50 Training Storage Benchmark.
We simulate training a ResNet-50 model on a dataset of 10 million files, each containing one 131 KB object, which is a typical scenario for deep learning model training in production. MLPerf has been modified so that it does not merge small files into large ones, which reflects real-world business scenarios and avoids the overhead of merge and copy operations. The CuckooFS client uses an optimized FUSE module to minimize overhead; the module will be open-sourced in the near future. Taking 90% accelerator utilization as the threshold, CuckooFS supports up to 80 accelerators, while Lustre supports only 32 accelerators on the experiment hardware.
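For a sense of scale, the figures in this paragraph can be combined with simple arithmetic. The sketch below assumes 1 KB = 1000 bytes, which the text does not specify, and the variable names are illustrative.

```shell
#!/bin/sh
# Back-of-the-envelope numbers for the benchmark described above.
# Assumption: decimal units (1 KB = 1000 bytes, 1 GB = 1000000 KB).
files=10000000
object_kb=131
dataset_kb=$((files * object_kb))
dataset_gb=$((dataset_kb / 1000000))

cuckoofs_accels=80
lustre_accels=32
ratio=$(awk -v a="$cuckoofs_accels" -v b="$lustre_accels" 'BEGIN { printf "%.1f", a / b }')

echo "dataset size: about $dataset_gb GB"
echo "accelerators supported at the 90% threshold: ${ratio}x more than Lustre"
```

Under that unit assumption the dataset is roughly 1.3 TB, and the accelerator counts differ by a factor of 2.5.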

Build

Assuming the working directory is ~/code:

git clone https://github.com/hw-fsi/cuckoofs.git
cd cuckoofs
git submodule update --init --recursive # fetch submodules, including postgresql
./patches/apply.sh
docker run -it --rm -v `pwd`/..:/root/code -w /root/code/cuckoofs ghcr.io/hw-fsi/cuckoofs-dev /bin/zsh
./build.sh
ln -s /root/code/cuckoofs/cuckoo/build/compile_commands.json . # used by clangd

Test

./build.sh test

Clean

cd cuckoofs
./build.sh clean

Incremental build and clean

cd cuckoofs
./build.sh build pg # only build pg
./build.sh clean pg # only clean pg
./build.sh build cuckoo # only build cuckoofs
./build.sh clean cuckoo # only clean cuckoofs
./build.sh build cuckoo --debug # build cuckoofs with debug
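The incremental commands above can be chained into a small wrapper. The sketch below is hypothetical: `run` and `rebuild_and_test` are not part of the repository, and the only real commands used are the `./build.sh` invocations listed in this section. Setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical wrapper around the build.sh commands shown above.
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rebuild_and_test: clean and rebuild only cuckoofs in debug mode,
# then run the tests. Stops at the first failing step.
rebuild_and_test() {
    run ./build.sh clean cuckoo &&
    run ./build.sh build cuckoo --debug &&
    run ./build.sh test
}
```

Run `DRY_RUN=1; rebuild_and_test` from the cuckoofs directory to preview the sequence, or call `rebuild_and_test` with `DRY_RUN` unset to execute it inside the development container.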

Authors

Junbin Kang
Lu Zhang
Mingyu Liu
Shaohong Guo
Ziyan Qiu
Anqi Yu
Jingwei Xu
