Rich Freeman on 25 Apr 2016 16:44:38 -0700
Re: [PLUG] [plug-announce] TOMORROW - Tue, Apr 19, 2016: PLUG North - "Linux Containers" by Jason Plum and Rich Freeman (*6:30 pm* at CoreDial in Blue Bell)
On Mon, Apr 25, 2016 at 5:45 PM, Keith C. Perry <kperry@daotechnologies.com> wrote:
> True but again, there is always a situation that can blow up a state
> machine. We've just gotten better at it, and at this point hard drives
> are considered very reliable. However, we still do backups because hard
> drives do fail. This has nothing to do with the filesystem, but it does
> have something to do with the entire process of successfully storing
> and retrieving data. People aren't going to stop doing backups because
> they are running BTRFS, at least I hope not :D

Sure, and perhaps that is also one of the other appeals of zfs/btrfs: they're much more efficient to back up. Both can efficiently generate a stream containing all the changes between two snapshots, and re-create the filesystem from that stream. You could use that to drive a replica of the filesystem offsite, or you could just store those streams in files on tape/etc and replay them to restore a backup. (There's a sketch of that workflow at the end of this message.)

Sure, you can generate incremental backups from any filesystem, but btrfs/zfs let you do it in a way that:

1. Is 100% reliable (that is, no changes will ever be missed).

2. Does not require reading/diffing/hashing all the data on the filesystem, or even reading every directory tree on the filesystem.

Software like rsync can achieve #1, but usually doesn't, because that requires reading every file to compare hashes, and it isn't the default. Rsync in its default mode gets close to #2 but still has to read every directory entry, and probably every inode (though I'm not 100% sure on the latter). (The last sketch at the end of this message shows the two rsync modes side by side.)

zfs/btrfs take advantage of the COW design of the filesystem to rapidly identify the parts of the filesystem that have diverged and back up only those, much as git can diff two commits without having to read every file or even every directory tree. In the case of btrfs, if the two snapshot roots have 9 of their 10 child nodes in common, then nothing under those shared nodes has changed, and at most 10% of the metadata needs to be examined to find the differences. At each level of the tree there is another opportunity to logarithmically eliminate portions of the search space. A balanced b-tree can store an incredible number of records accessible in only a few seeks: with a fanout of 100, a tree just four levels deep indexes 100^4 = 100 million records.

I don't dispute that btrfs is still fairly experimental. But we're talking about the Fedora team here as well, and as I said, I'm sure that if somebody handed them code for an LVM version they'd probably accept it.

--
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
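To make the send/receive workflow concrete, here is a minimal btrfs sketch. The subvolume, snapshot, and destination paths are invented for illustration:

    # Take read-only snapshots at two points in time (btrfs send
    # requires read-only snapshots):
    btrfs subvolume snapshot -r /data /data/.snap/monday
    btrfs subvolume snapshot -r /data /data/.snap/tuesday

    # Initial full backup:
    btrfs send /data/.snap/monday | btrfs receive /backup

    # Incremental backup: -p names the parent snapshot that already
    # exists on the receiving side, so the stream contains only the
    # changes between the two snapshots:
    btrfs send -p /data/.snap/monday /data/.snap/tuesday | btrfs receive /backup

    # Or keep the stream itself as a file for tape/offsite storage
    # and replay it later:
    btrfs send -p /data/.snap/monday /data/.snap/tuesday > tuesday.stream
    btrfs receive /backup < tuesday.stream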
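The zfs version has the same shape; the pool and dataset names here are placeholders:

    # Snapshots are cheap, and are named pool/dataset@label:
    zfs snapshot tank/data@monday
    zfs snapshot tank/data@tuesday

    # Full stream, then an incremental one; -i sends only the delta
    # between the two snapshots:
    zfs send tank/data@monday | zfs receive backup/data
    zfs send -i tank/data@monday tank/data@tuesday | zfs receive backup/data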
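And for contrast, the two rsync modes mentioned above; the paths are again invented:

    # Default "quick check": compares only size and mtime, so rsync
    # still walks every directory and stats every file, and it can
    # miss a change that happens to preserve both:
    rsync -a /data/ backuphost:/backup/data/

    # --checksum closes that gap, but only by reading every file on
    # both sides to compute checksums:
    rsync -a --checksum /data/ backuphost:/backup/data/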