File layouts

The layout of a file controls how its contents are mapped to Ceph RADOS objects. You can read and write a file’s layout using virtual extended attributes (xattrs).

The name of the layout xattrs depends on whether a file is a regular file or a directory. Regular files’ layout xattrs are called ceph.file.layout, whereas directories’ layout xattrs are called ceph.dir.layout. Where subsequent examples refer to ceph.file.layout, substitute dir as appropriate when dealing with directories.

Tip

Your Linux distribution may not ship with commands for manipulating xattrs by default; the required package is usually called attr.

Layout fields

  • pool
    String, giving ID or name. The RADOS pool in which a file’s data objects will be stored.

  • pool_namespace
    String. The RADOS namespace, within the data pool, to which the objects will be written. Empty by default (i.e. default namespace).

  • stripe_unit
    Integer in bytes. The size of a block of data used in the RAID 0 distribution of a file. All stripe units for a file have equal size. The last stripe unit is typically incomplete, i.e. it represents the data at the end of the file as well as unused “space” beyond it up to the end of the fixed stripe unit size.

  • stripe_count
    Integer. The number of consecutive stripe units that constitute a RAID 0 “stripe” of file data.

  • object_size
    Integer in bytes. File data is chunked into RADOS objects of this size.

Tip

RADOS enforces a configurable limit on object sizes: if you increase CephFS object sizes beyond that limit then writes may not succeed. The OSD setting is osd_max_object_size, which is 128MB by default. Very large RADOS objects may prevent smooth operation of the cluster, so increasing the object size limit past the default is not recommended.
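
To make the interplay of stripe_unit, stripe_count and object_size concrete, the following sketch maps a byte offset in a file to an object index and an offset inside that object, following the RAID 0 description above. The helper name locate_offset is hypothetical and not part of any Ceph tool. The arithmetic is an illustration of how the fields combine, assuming object_size is a whole multiple of stripe_unit; it is not taken from this page.

```shell
# Map a byte offset in a file to "<object index> <offset inside that object>".
# Arguments: offset stripe_unit stripe_count object_size
# Hypothetical helper for illustration only.
locate_offset() (
    offset=$1 su=$2 sc=$3 os=$4

    stripes_per_object=$(( os / su ))                  # stripe units stacked in one object
    blockno=$(( offset / su ))                         # index of the stripe unit holding the offset
    stripeno=$(( blockno / sc ))                       # which stripe (row) that unit is in
    stripepos=$(( blockno % sc ))                      # position inside the stripe (column)
    objectsetno=$(( stripeno / stripes_per_object ))   # group of stripe_count objects
    objectno=$(( objectsetno * sc + stripepos ))
    object_off=$(( (stripeno % stripes_per_object) * su + offset % su ))

    printf '%s %s\n' "$objectno" "$object_off"
)

# stripe_unit=1 MiB, stripe_count=8, object_size=10 MiB
locate_offset 0        1048576 8 10485760   # prints "0 0"
locate_offset 1048576  1048576 8 10485760   # prints "1 0"
locate_offset 8388608  1048576 8 10485760   # prints "0 1048576"
locate_offset 83886080 1048576 8 10485760   # prints "8 0"
```

With stripe_count=1 and stripe_unit equal to object_size, as in the default values shown in the examples on this page, the mapping reduces to simple chunking: object index is offset / object_size.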

Reading layouts with getfattr

Read the layout information as a single string:

  $ touch file
  $ getfattr -n ceph.file.layout file
  # file: file
  ceph.file.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data"

Read individual layout fields:

  $ getfattr -n ceph.file.layout.pool file
  # file: file
  ceph.file.layout.pool="cephfs_data"
  $ getfattr -n ceph.file.layout.stripe_unit file
  # file: file
  ceph.file.layout.stripe_unit="4194304"
  $ getfattr -n ceph.file.layout.stripe_count file
  # file: file
  ceph.file.layout.stripe_count="1"
  $ getfattr -n ceph.file.layout.object_size file
  # file: file
  ceph.file.layout.object_size="4194304"
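
When a script needs several fields, it may be simpler to read the combined layout string once and split it locally. The sketch below assumes the space-separated key=value format shown in the examples on this page; layout_field is a hypothetical helper name.

```shell
# Print the value of one key from a layout string such as
# "stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data".
# Exits non-zero if the key is absent. Hypothetical helper for illustration.
layout_field() (
    layout=$1 key=$2
    set -f                          # no glob expansion while splitting on whitespace
    for pair in $layout; do
        case $pair in
            "$key"=*) printf '%s\n' "${pair#*=}"; exit 0 ;;
        esac
    done
    exit 1
)
```

A possible use, relying on the --only-values option of getfattr to obtain the raw attribute value: layout_field "$(getfattr --only-values -n ceph.file.layout file)" pool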

Note

When reading layouts, the pool will usually be indicated by name. However, in rare cases when pools have only just been created, the ID may be output instead.

Directories do not have an explicit layout until it is customized. Attempts to read the layout will fail if it has never been modified: this indicates that the layout of the nearest ancestor directory with an explicit layout will be used.

  $ mkdir dir
  $ getfattr -n ceph.dir.layout dir
  dir: ceph.dir.layout: No such attribute
  $ setfattr -n ceph.dir.layout.stripe_count -v 2 dir
  $ getfattr -n ceph.dir.layout dir
  # file: dir
  ceph.dir.layout="stripe_unit=4194304 stripe_count=2 object_size=4194304 pool=cephfs_data"
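
Because a directory without an explicit layout uses that of an ancestor, finding the layout that actually applies means walking up the tree. The following is a minimal sketch of that walk; effective_dir_layout is a hypothetical helper, and it assumes that getfattr exits non-zero when the attribute is absent, as in the "No such attribute" case above.

```shell
# Print "<directory> <layout>" for the nearest ancestor (or the directory
# itself) that carries an explicit ceph.dir.layout.
# Exits non-zero if none is found up to "/". Hypothetical helper for illustration.
effective_dir_layout() (
    dir=$(cd "$1" && pwd -P) || exit 1      # work with an absolute path
    while :; do
        if layout=$(getfattr --only-values -n ceph.dir.layout "$dir" 2>/dev/null); then
            printf '%s %s\n' "$dir" "$layout"
            exit 0
        fi
        [ "$dir" = / ] && exit 1            # reached the root without a match
        dir=$(dirname "$dir")
    done
)
```

For example, effective_dir_layout dir/childdir would report the layout of dir if childdir has none of its own.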

Writing layouts with setfattr

Layout fields are modified using setfattr:

  $ ceph osd lspools
  0 rbd
  1 cephfs_data
  2 cephfs_metadata

  $ setfattr -n ceph.file.layout.stripe_unit -v 1048576 file2
  $ setfattr -n ceph.file.layout.stripe_count -v 8 file2
  $ setfattr -n ceph.file.layout.object_size -v 10485760 file2
  $ setfattr -n ceph.file.layout.pool -v 1 file2 # Setting pool by ID
  $ setfattr -n ceph.file.layout.pool -v cephfs_data file2 # Setting pool by name
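
Not every combination of values is accepted. The sketch below pre-checks a combination before calling setfattr. The rules it encodes are assumptions that this page does not state: that stripe_unit and object_size must be non-zero multiples of 64 KiB, that object_size must be a multiple of stripe_unit, and that stripe_count must be non-zero. Verify them against your Ceph release; check_layout is a hypothetical helper name.

```shell
# Check a (stripe_unit, stripe_count, object_size) combination.
# ASSUMPTION: the 64 KiB granularity and the multiple-of rules below are not
# stated in this document; the server is the authority on what is valid.
check_layout() (
    su=$1 sc=$2 os=$3
    min=65536

    fail() { echo "invalid layout: $1" >&2; exit 1; }

    [ "$su" -gt 0 ]           || fail "stripe_unit must be positive"
    [ $(( su % min )) -eq 0 ] || fail "stripe_unit must be a multiple of $min"
    [ "$sc" -gt 0 ]           || fail "stripe_count must be positive"
    [ "$os" -gt 0 ]           || fail "object_size must be positive"
    [ $(( os % min )) -eq 0 ] || fail "object_size must be a multiple of $min"
    [ $(( os % su )) -eq 0 ]  || fail "object_size must be a multiple of stripe_unit"
)
```

The values used in the example above (stripe_unit 1048576, stripe_count 8, object_size 10485760) pass this check.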

Note

A file’s layout fields can only be modified with setfattr while the file is empty; otherwise an error occurs.

  # touch an empty file
  $ touch file1
  # modify layout field successfully
  $ setfattr -n ceph.file.layout.stripe_count -v 3 file1

  # write something to file1
  $ echo "hello world" > file1
  $ setfattr -n ceph.file.layout.stripe_count -v 4 file1
  setfattr: file1: Directory not empty
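
A script can avoid the error above by checking the file size first. This is a minimal sketch with a hypothetical helper name, set_file_layout_field. It performs only a local size check, so the server may still reject the change.

```shell
# Set one ceph.file.layout.<field> on a file, refusing locally if the
# file already contains data. Hypothetical helper for illustration.
# Usage: set_file_layout_field <file> <field> <value>
set_file_layout_field() (
    file=$1 field=$2 value=$3
    if [ -s "$file" ]; then                 # -s: file exists and is larger than zero bytes
        echo "refusing to change layout: $file is not empty" >&2
        exit 1
    fi
    setfattr -n "ceph.file.layout.$field" -v "$value" "$file"
)
```

For example, set_file_layout_field file1 stripe_count 3 would run the same setfattr command as above, but only while file1 is still empty.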

Clearing layouts

If you wish to remove an explicit layout from a directory, so that it reverts to inheriting the layout of its ancestor, you can do so:

  $ setfattr -x ceph.dir.layout mydir

Similarly, if you have set the pool_namespace attribute and wish to modify the layout to use the default namespace instead:

  # Create a dir and set a namespace on it
  $ mkdir mydir
  $ setfattr -n ceph.dir.layout.pool_namespace -v foons mydir
  $ getfattr -n ceph.dir.layout mydir
  ceph.dir.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data_a pool_namespace=foons"

  # Clear the namespace from the directory's layout
  $ setfattr -x ceph.dir.layout.pool_namespace mydir
  $ getfattr -n ceph.dir.layout mydir
  ceph.dir.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data_a"

Inheritance of layouts

Files inherit the layout of their parent directory at creation time. However, subsequent changes to the parent directory’s layout do not affect children.

  $ getfattr -n ceph.dir.layout dir
  # file: dir
  ceph.dir.layout="stripe_unit=4194304 stripe_count=2 object_size=4194304 pool=cephfs_data"

  # Demonstrate file1 inheriting its parent's layout
  $ touch dir/file1
  $ getfattr -n ceph.file.layout dir/file1
  # file: dir/file1
  ceph.file.layout="stripe_unit=4194304 stripe_count=2 object_size=4194304 pool=cephfs_data"

  # Now update the layout of the directory before creating a second file
  $ setfattr -n ceph.dir.layout.stripe_count -v 4 dir
  $ touch dir/file2

  # Demonstrate that file1's layout is unchanged
  $ getfattr -n ceph.file.layout dir/file1
  # file: dir/file1
  ceph.file.layout="stripe_unit=4194304 stripe_count=2 object_size=4194304 pool=cephfs_data"

  # ...while file2 has the parent directory's new layout
  $ getfattr -n ceph.file.layout dir/file2
  # file: dir/file2
  ceph.file.layout="stripe_unit=4194304 stripe_count=4 object_size=4194304 pool=cephfs_data"
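
Since existing files keep the layout they were created with, a directory can end up holding files with mixed layouts. The sketch below lists regular files directly inside a directory whose layout string differs from the directory's current one. files_differing_from_dir_layout is a hypothetical helper name; the sketch assumes the directory has an explicit layout and that file and directory layout strings share the format shown above.

```shell
# List files directly inside <dir> whose ceph.file.layout differs from the
# directory's current explicit ceph.dir.layout.
# Exits non-zero if the directory has no explicit layout.
# Hypothetical helper for illustration.
files_differing_from_dir_layout() (
    dir=$1
    want=$(getfattr --only-values -n ceph.dir.layout "$dir" 2>/dev/null) || exit 1
    find "$dir" -maxdepth 1 -type f | while IFS= read -r f; do
        have=$(getfattr --only-values -n ceph.file.layout "$f" 2>/dev/null) || continue
        [ "$have" = "$want" ] || printf '%s\n' "$f"
    done
)
```

Run against dir at the end of the example above, this would be expected to report dir/file1 only.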

Files created as descendants of the directory also inherit the layout, if the intermediate directories do not have layouts set:

  $ getfattr -n ceph.dir.layout dir
  # file: dir
  ceph.dir.layout="stripe_unit=4194304 stripe_count=4 object_size=4194304 pool=cephfs_data"
  $ mkdir dir/childdir
  $ getfattr -n ceph.dir.layout dir/childdir
  dir/childdir: ceph.dir.layout: No such attribute
  $ touch dir/childdir/grandchild
  $ getfattr -n ceph.file.layout dir/childdir/grandchild
  # file: dir/childdir/grandchild
  ceph.file.layout="stripe_unit=4194304 stripe_count=4 object_size=4194304 pool=cephfs_data"

Adding a data pool to the MDS

Before you can use a pool with CephFS you have to add it to the Metadata Servers.

  $ ceph fs add_data_pool cephfs cephfs_data_ssd
  $ ceph fs ls # Pool should now show up
  .... data pools: [cephfs_data cephfs_data_ssd ]

Make sure that your cephx keys allow the client to access this new pool.
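
A script that assigns pools through layouts may want to confirm that a pool is listed before using it. The sketch below checks one line of ceph fs ls output for a pool name. It assumes only the "data pools: [ ... ]" fragment shown above. Parsing human-readable output is fragile, so treat this as an illustration; fs_has_data_pool is a hypothetical helper name.

```shell
# Succeed if <pool> appears in the "data pools: [ ... ]" list of one
# line of "ceph fs ls" output. Hypothetical helper for illustration.
# Usage: fs_has_data_pool <line> <pool>
fs_has_data_pool() (
    line=$1 pool=$2
    case $line in
        *"data pools: ["*) ;;
        *) exit 1 ;;                          # no pool list on this line
    esac
    pools=${line#*"data pools: ["}            # drop everything up to the opening bracket
    pools=${pools%%"]"*}                      # drop the closing bracket and what follows
    set -f                                    # no glob expansion while splitting
    for p in $pools; do
        [ "$p" = "$pool" ] && exit 0
    done
    exit 1
)
```

One way to use it is to feed the output line by line: ceph fs ls | while IFS= read -r line; do fs_has_data_pool "$line" cephfs_data_ssd && echo "listed"; done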

You can then update the layout on a directory in CephFS to use the pool you added:

  $ mkdir /mnt/cephfs/myssddir
  $ setfattr -n ceph.dir.layout.pool -v cephfs_data_ssd /mnt/cephfs/myssddir

All new files created within that directory will now inherit its layout and place their data in your newly added pool.

You may notice that object counts in your primary data pool (the one passed to fs new) continue to increase, even if files are being created in the pool you added. This is normal: the file data is stored in the pool specified by the layout, but a small amount of metadata is kept in the primary data pool for all files.