Rich Freeman on 9 Dec 2011 20:57:56 -0800


Re: [PLUG] Lost gigabytes?

On Fri, Dec 9, 2011 at 10:44 PM, Lee H. Marzke <> wrote:
> With ZFS you Normally group your disks
> into sets of 2-disk mirrors, or 3-disk RaidZ vDevs and the zpool
> is then striping data across all of them.
> You then create folders and set quota.  Normally you only have
> a single zpool.  So you don't create data inside the RaidZ, you
> use many RaidZ mirror sets to form the pool and then allocate from that
> so you would never need to resize anything,  it's all done with quota.

So, my use case: I have three 1TB drives.  A year later I buy one 2TB drive.

If I had a Linux md raid5 striped across those three 1TB drives, I'd
just extend the array onto the new drive.  The end result is one
extra TB of usable space, since I've already paid the parity penalty.
I'd also have 1TB of wasted space on the 2TB drive, unless I wanted to
partition it out and use it as non-redundant storage.

If you can't reshape then your only options are to find someplace to
dump all that data while you re-create a new RAID, or be forced to buy
multiple drives at a time, and then only get usable storage from n-1
of them since you're creating two raid5s and pooling them.
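The capacity arithmetic works out like this (sizes in TB; `raid5_usable`
is just an illustrative helper for the math, not an mdadm interface):

```python
def raid5_usable(sizes_tb):
    # Usable capacity of one RAID5 array: each member is truncated to
    # the smallest drive, and one drive's worth of space holds parity.
    return (len(sizes_tb) - 1) * min(sizes_tb)

# Reshaping the existing array onto the new 2TB drive (only 1TB of it
# is used, since members are sized to the smallest drive):
print(raid5_usable([1, 1, 1]))     # 2 TB before
print(raid5_usable([1, 1, 1, 2]))  # 3 TB after - one extra usable TB

# Versus pooling: buying three more 1TB drives for a second RAID5
# pays the parity penalty a second time.
print(raid5_usable([1, 1, 1]) * 2)       # 4 TB from six drives, pooled
print(raid5_usable([1, 1, 1, 1, 1, 1]))  # 5 TB if one array could reshape
```

So pooling two small raid5s permanently costs a second parity drive
compared to growing one array in place.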

I suspect that there really isn't any fundamental reason why ZFS
couldn't do reshaping of a RAID - I just suspect that the sorts of
people who use it aren't so likely to need to add a single drive, or
maybe they just didn't get around to it.

I think btrfs plans to support reshaping, but of course they haven't
implemented raid5 at all yet.  You can add/remove drives at any time
with raid1, though obviously you can never use more than half the
total space on the drives, and if you mismatch them you might or
might not get less (since it isn't striped, you could in theory get
half the total space out of two 1TB drives and one 2TB drive - if the
filesystem is smart about where it places the second copies).
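To make the mismatched-drive case concrete, here's a sketch of the
two-copy capacity rule under chunk-level mirroring, assuming an ideal
allocator (`btrfs_raid1_usable` is my name for the calculation, not a
real tool):

```python
def btrfs_raid1_usable(sizes_tb):
    # Two copies of every chunk, each on a different drive: usable
    # space is half the total, but capped by how much can be paired
    # up opposite the largest drive.
    total = sum(sizes_tb)
    return min(total / 2, total - max(sizes_tb))

print(btrfs_raid1_usable([1, 1, 2]))  # 2.0 - half of 4TB, nothing wasted
print(btrfs_raid1_usable([1, 3]))     # 1.0 - 2TB of the big drive stranded
```

The 2x1TB + 1x2TB case comes out at the full half-of-total, because the
two small drives together can mirror everything on the big one.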

> ZFS uses 'ditto blocks' to save multiple copies of very important
> blocks, such as directory meta-data.  This is done even on a single
> disk without RAID.    RAIDZ on ZFS uses variable length stripes and
> is non-standard that way,  and also has 'birth-times' and checksums
> in the meta-data - so that is extremely similar to Btrfs.

Yup - from what I understand of both they are fairly comparable in
this regard.  Btrfs gives you the option of using any (implemented)
raid level for both metadata and data - the default is raid0 for data
and raid1 for metadata.  If you have a single drive it will still keep
two copies of the metadata by default.  You can actually tweak that at
the individual file level - so you could tell the filesystem to keep
six copies of some important file.  Some of the more critical
superblock-level stuff is always redundant.

The main issue with btrfs at this point is maturity - the
already-implemented feature list is probably enough, on paper, to make
the filesystem very usable.  The problem is things like a kernel
panic when you hit 70% full on a disk, or some other craziness.  If
you do want to experiment with it, get the development version from
git - this is one of those cases where bleeding edge really is better,
since the versions in released kernels have MANY serious known issues.

In the future, however, it would be really nice to have the Linux
default filesystem support writable snapshots and such - if you have
free disk space you could do worry-free upgrades.  And unlike LVM,
all the disk space is pooled, so the only time you can't take a
snapshot is when you don't have space to do anything else either.  I
think you can use quotas/etc to keep root from filling up and such -
I need to read up on that...

Philadelphia Linux Users Group         --