CephFS best practices

This guide provides recommendations for best results when deploying CephFS.

For the actual configuration guide for CephFS, please see the instructions at Ceph File System.

Which Ceph version?

Use at least the Jewel (v10.2.0) release of Ceph. This is the first release to include stable CephFS code and fsck/repair tools. Make sure you are using the latest point release to get bug fixes.
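As a rough illustration of the version check, the sketch below compares a version string against the Jewel minimum. It assumes the string looks like the output of `ceph --version` (for example `ceph version 10.2.11 (...)`). The names `MIN_CEPH_VERSION`, `parse_ceph_version` and `meets_minimum` are hypothetical helpers, not part of Ceph.

```python
import re

# Minimum recommended release from the text above: Jewel, v10.2.0.
MIN_CEPH_VERSION = (10, 2, 0)


def parse_ceph_version(text):
    """Extract (major, minor, patch) from text such as 'ceph version 10.2.11 (...)'.

    Assumption: the first dotted triple of digits in the text is the version.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(text, minimum=MIN_CEPH_VERSION):
    """Return True if the version found in `text` is at least `minimum`."""
    return parse_ceph_version(text) >= minimum


if __name__ == "__main__":
    # Example strings only; on a real node, pass in the output of `ceph --version`.
    for sample in ("ceph version 10.2.11 (...)", "ceph version 0.94.10 (...)"):
        verdict = "ok" if meets_minimum(sample) else "older than Jewel"
        print(f"{sample}: {verdict}")
```

This only checks the minimum release. It does not tell you whether you are on the latest point release, which still has to be checked against the published release notes.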

Note that Ceph releases do not include a kernel; the kernel is versioned and released separately. See below for guidance on choosing an appropriate kernel version if you are using the kernel client for CephFS.

Most stable configuration

Some features in CephFS are still experimental. See Experimental Features for guidance on these.

For the best chance of a happy, healthy file system, use a single active MDS and do not use snapshots. Both of these are the default.

Note that creating multiple MDS daemons is fine, as these will simply be used as standbys. However, for best stability you should avoid adjusting max_mds upwards, as this would cause multiple MDS daemons to be active at once.

Which client?

The FUSE client is the most accessible and the easiest to upgrade to the version of Ceph used by the storage cluster, while the kernel client will often give better performance.

The clients do not always provide equivalent functionality; for example, the FUSE client supports client-enforced quotas while the kernel client does not.

When encountering bugs or performance issues, it is often instructive to try using the other client, in order to find out whether the bug was client-specific or not (and then to let the developers know).

Which kernel version?

Because the kernel client is distributed as part of the Linux kernel (not as part of packaged Ceph releases), you will need to consider which kernel version to use on your client nodes. Older kernels are known to include buggy Ceph clients, and may not support features that more recent Ceph clusters support.

Remember that the “latest” kernel in a stable Linux distribution is likely to be years behind the latest upstream Linux kernel where Ceph development takes place (including bug fixes).

As a rough guide, as of Ceph 10.x (Jewel), you should be using at least a 4.x kernel. If you absolutely have to use an older kernel, you should use the FUSE client instead of the kernel client.

This advice does not apply if you are using a Linux distribution that includes CephFS support, as in this case the distributor will be responsible for backporting fixes to their stable kernel: check with your vendor.
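The rough kernel guidance above can be sketched as a small check on the running kernel's release string. This is a minimal illustration, assuming a release string of the form reported by `uname -r` (for example `4.4.0-21-generic`). The names `MIN_KERNEL_MAJOR`, `kernel_major` and `kernel_client_advisable` are hypothetical. The check knows nothing about vendor backports, so a distribution kernel with CephFS support may be fine even when it returns False.

```python
import platform
import re

# "At least a 4.x kernel", per the rough guide for Ceph 10.x (Jewel).
MIN_KERNEL_MAJOR = 4


def kernel_major(release):
    """Return the leading major version number of a kernel release string."""
    match = re.match(r"(\d+)\.", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1))


def kernel_client_advisable(release):
    """Return True if the upstream rule of thumb allows the kernel client.

    False means the rule of thumb suggests the FUSE client instead, unless
    the vendor backports CephFS fixes to this kernel.
    """
    return kernel_major(release) >= MIN_KERNEL_MAJOR


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r` on Linux
    if kernel_client_advisable(release):
        print(f"{release}: kernel client is a reasonable choice")
    else:
        print(f"{release}: prefer the FUSE client, or check with your vendor")
```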

Reporting issues

If you have identified a specific issue, please report it with as much information as possible. The following information is especially important:

  • Ceph versions installed on client and server

  • Whether you are using the kernel or fuse client

  • If you are using the kernel client, what kernel version?

  • How many clients are in play, doing what kind of workload?

  • If a system is ‘stuck’, is that affecting all clients or just one?

  • Any ceph health messages

  • Any backtraces in the ceph logs from crashes
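Several of the items above can be gathered mechanically. The sketch below is one possible way to do that on a client node, assuming the `ceph` command-line tool is installed and can reach the cluster. It uses the `ceph --version` and `ceph health detail` commands. The helper names (`run_or_note`, `collect_report`, `format_report`) and the report layout are hypothetical. Workload details, stuck-client observations and backtraces from the logs still have to be added by hand.

```python
import platform
import shutil
import subprocess


def run_or_note(command):
    """Return the command's output, or a short note if it cannot be run."""
    if shutil.which(command[0]) is None:
        return f"({command[0]} not found)"
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"({' '.join(command)} failed: {exc})"
    # Fall back to stderr so that error messages are not silently lost.
    return (result.stdout or result.stderr).strip()


def collect_report(client_type, client_count, workload):
    """Gather (label, value) pairs; the arguments are supplied by the reporter."""
    return [
        ("Ceph version (this node)", run_or_note(["ceph", "--version"])),
        ("Client type", client_type),
        ("Kernel version", platform.release()),
        ("Clients and workload", f"{client_count} client(s), {workload}"),
        ("Cluster health", run_or_note(["ceph", "health", "detail"])),
    ]


def format_report(fields):
    """Render (label, value) pairs as plain 'label: value' lines."""
    return "\n".join(f"{label}: {value}" for label, value in fields)


if __name__ == "__main__":
    # Example values only; replace with a description of your own setup.
    print(format_report(collect_report("fuse", 3, "parallel rsync")))
```

Run it on a client and, separately, note the Ceph version on the servers, since the two can differ.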

If you are satisfied that you have found a bug, please file it on the tracker. For more general queries, please write to the ceph-users mailing list.