Bcachefs, an introduction/exploration


Introduction & background information

NOTE: This content comes from an internal talk I gave, which is why it may read like a presentation.

So what is bcachefs?

bcachefs is a next-generation copy-on-write (COW) filesystem (FS), written by Kent Overstreet, that aims to provide features similar to Btrfs and ZFS.

Why the need for another FS?

According to Kent[1], paraphrased here

  1. https://www.patreon.com/bcachefs/about?l=en
  2. Current Fedora workstation default filesystem

Why bcachefs[1]

  1. Paraphrased https://www.patreon.com/bcachefs/about?l=en

Code size comparison

Lines of kernel code (counted with the tokei utility)

Is bcachefs really that small? It’s actually pretty good, but some features are not yet complete.

Feature Comparison

(FS residing directly on hardware)

FS            RAID  Encryption  Thin provision     De-dupe  Caching  Compression  Snapshots  Subvolumes  Send/Receive  Full checksum  Reflink
Bcachefs      Y     Y           N                  N        Y        Y            Y          Y           P             Y              Y
ZFS           Y     Y           Y(sparse volumes)  Y        Y        Y            Y          Y           Y             Y              Y
btrfs         Y     N           N                  N        N        Y            Y          Y           Y             Y              Y
XFS           N     N           N                  N        N        N            N          N           N             N              Y (if newer)
ext3/4        N     N           N                  N        N        N            N          N           N             N              N
Most non-COW  N     N           N                  N        N        N            N          N           N             N              Varies

P = Planned

Feature Comparison (layered)

FS                 RAID  Encryption   Thin provision     De-dupe  Caching  Compression  Snapshots  Subvolumes  Send/Receive    Full checksum
Bcachefs           Y     Y-native     N                  N        Y        Y            Y          Y           P               Y
ZFS                Y     Y-native     Y(sparse volumes)  ?        Y        Y            Y          Y           Y               Y
btrfs              Y     Y-native     L                  L        Y        Y            Y          Y           Y               Y
Stratis            P     Y(dm-crypt)  Y(dm-thin)         P(vdo)   Y        P(vdo)       Y(dm)      N           N               N(dm-integrity)
non-COW using LVM  Y     Y(dm-crypt)  Y(dm-thin)         Y(vdo)   Y        Y(vdo)       Y(LVM)     N           P(blk-archive)  N(dm-integrity)

P = Planned, L = layer on LVM/VDO, ? = should work layered on VDO (untested)

Full checksums

General observations

How helpful is help?

(--help & man pages)

What are user space errors like?

Some use cases and common questions

Create FS, mount at boot

Mount option for degraded

Automation

sysfs interface

debugfs interface

/sys/kernel/debug/bcachefs/<FS UUID>

There are many entries under this directory. I haven’t found complete documentation for them yet; understanding them would likely require examining the code.
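
One way to get an overview is to walk the directory and list what is there. The sketch below does that in Python. The function name `list_debugfs_entries` and its `root` parameter are hypothetical, the example UUID is the one from the `fs usage` output in this write-up, and reading debugfs normally requires root privileges and a mounted debugfs. It only lists names and makes no claim about what each entry means.

```python
import os


def list_debugfs_entries(fs_uuid, root="/sys/kernel/debug/bcachefs"):
    """Return sorted paths (relative to the FS directory) of every entry found.

    Returns an empty list if the directory is missing or unreadable.
    """
    base = os.path.join(root, fs_uuid)
    entries = []
    # os.walk skips directories it cannot read, so a missing path yields nothing.
    for dirpath, dirnames, filenames in os.walk(base):
        for name in dirnames + filenames:
            entries.append(os.path.relpath(os.path.join(dirpath, name), base))
    return sorted(entries)


if __name__ == "__main__":
    # UUID taken from the example filesystem; substitute your own.
    for entry in list_debugfs_entries("801d8d4e-5fb7-4697-a203-a2f7895f66aa"):
        print(entry)
```

Passing a different `root` makes it possible to try the function against a scratch directory before pointing it at the real debugfs tree.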

Information about a FS (root privileges not needed)

$ bcachefs fs usage /mnt/bcachefs/
Filesystem: 801d8d4e-5fb7-4697-a203-a2f7895f66aa
Size:                     1975685120
Used:                     1975685120
Online reserved:              843776

Data type       Required/total  Durability  Devices
btree:          1/1             1           [sdh]              9437184
user:           1/1             1           [sdh]           1945206784

(no label) (device 0):          sdh         rw
                                data     buckets  fragmented
  free:                    172490752         658
  sb:                        3149824          13      258048
  journal:                  16777216          64
  btree:                     9437184          36
  user:                   1945206784        7421      163840
  cached:                          0           0
  parity:                          0           0
  stripe:                          0           0
  need_gc_gens:                    0           0
  need_discard:                    0           0
  capacity:               2147483648        8192
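
The per-device figures in this output are internally consistent, which a few lines of Python can confirm. This is a minimal sketch that hard-codes the numbers from the output. It assumes that blank "fragmented" cells mean 0 and that "fragmented" counts allocated bucket space not holding live data, which is an interpretation of the output rather than a documented fact. The names `ROWS`, `bucket_size`, and `inconsistent_rows` are illustrative.

```python
# Figures copied from the `bcachefs fs usage` output above (device sdh).
CAPACITY_BYTES = 2147483648
CAPACITY_BUCKETS = 8192

# data type -> (data bytes, buckets, fragmented bytes)
# Blank "fragmented" cells are assumed to be 0; all-zero rows are omitted.
ROWS = {
    "free": (172490752, 658, 0),
    "sb": (3149824, 13, 258048),
    "journal": (16777216, 64, 0),
    "btree": (9437184, 36, 0),
    "user": (1945206784, 7421, 163840),
}


def bucket_size(capacity_bytes, capacity_buckets):
    """Bytes per bucket, assuming every bucket on the device has the same size."""
    return capacity_bytes // capacity_buckets


def inconsistent_rows(rows, size):
    """Names of rows where data + fragmented differs from buckets * size."""
    return [
        name
        for name, (data, buckets, fragmented) in rows.items()
        if data + fragmented != buckets * size
    ]


size = bucket_size(CAPACITY_BYTES, CAPACITY_BUCKETS)
total_buckets = sum(buckets for _, buckets, _ in ROWS.values())

print(size)                           # 262144 bytes, i.e. 256 KiB per bucket
print(total_buckets)                  # 8192, matching the capacity row
print(inconsistent_rows(ROWS, size))  # [] means every row adds up
```

Under those assumptions every row satisfies data + fragmented = buckets × 256 KiB, and the bucket counts sum to the device capacity.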

Fill a FS 100% and then enable compression

Use different physical sector sizes

Mixing sector sizes is not always possible; LVM and Stratis do not allow it.

Stack bcachefs on thin LV or vdo

(no manual fstrim support)

Can I use bcachefs on SMR & raw flash?

RAID

What about RAID 0, 10, 5, 6, etc.?

RAID0 experiment

RAID1

RAID10 (cannot do)

RAID 5/6 (experimental)

fs usage with erasure coding

# bcachefs fs usage /mnt/ec
Filesystem: 8f8b9a5e-afea-49c7-a245-86e132263bd4
Size:                    53343494144
Used:                      896942080
Online reserved:                   0

Data type       Required/total  Durability  Devices
btree:          1/3             3           [sdf sdg sdh]     15728640
user:           1/2             2           [sdf sdh]           524288
user:           1/2             2           [sdf sdg]          1458176
user:           2/3             3           [sdf sdg sdh]    277348352
parity:         2/3             3           [sdf sdg sdh]    138674176
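
The relationship between the striped user data and the parity line can be checked with simple arithmetic. This sketch assumes that "2/3" means two data blocks plus one parity block per stripe, and that the "1/2" lines are plain two-way replicas; both are interpretations of the Required/total column, not documented facts. The variable and function names are illustrative, and the byte counts are copied from the output above.

```python
from fractions import Fraction

# Byte counts copied from the `bcachefs fs usage /mnt/ec` output above.
USER_REPLICATED = 524288 + 1458176  # the two "user 1/2" lines (assumed replicas)
USER_STRIPED = 277348352            # the "user 2/3" line
PARITY = 138674176                  # the "parity 2/3" line


def parity_ratio(data_blocks, parity_blocks):
    """Expected parity bytes per data byte for a given stripe layout."""
    return Fraction(parity_blocks, data_blocks)


observed = Fraction(PARITY, USER_STRIPED)
expected = parity_ratio(data_blocks=2, parity_blocks=1)  # assumed 2+1 layout

print(observed)              # 1/2
print(observed == expected)  # True

# Share of user data that ended up in stripes rather than replicas.
striped_share = Fraction(USER_STRIPED, USER_STRIPED + USER_REPLICATED)
print(f"{float(striped_share):.1%}")  # 99.3%
```

The parity figure is exactly half the striped user data, which is what a 2+1 layout would produce.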

Misc. information, things tried

Why is my free space < expected?

multipath

Block device management

Error handling on CRC read error

Things not explored

Conclusion

Work in progress

Specific areas that could use some help

Revisions