Rich Freeman on 7 Aug 2012 08:18:08 -0700
[PLUG] UNIX File Equivalence
Here is a good general UNIX question that somebody might know the answer to.

On UNIX, or at least on Linux, how can I determine whether two files are the same, preferably very quickly (i.e. via a system/library call and not a page of Perl - though I'd still be interested in scripting/bash solutions as well)?

By "the same", I don't mean that their contents are the same - I mean that the files are actually the same file. For example, on my system right now the files /usr/tmp/xyz and /var/tmp/xyz are actually the same file (due to symlinks). Other mechanisms that could cause files to be the same would be hard links or bind mounts - either at the file level or anywhere in the path to them. I would consider reflinked files, copies of files, or snapshots of files to be different, even if at the moment their contents happen to be the same.

Ideally this should not depend on filesystem-specific details, though something that simply requires a filesystem to comply with some accepted standard is fine (it should work on the big Linux ones - ext2/3/4, xfs, btrfs, zfs, etc).

My use case is a C function which takes a file as input and needs to determine whether that file is one of the files in some list, regardless of which path leads to it. So, solutions that can also come up with a deterministic canonical path for a file (even with bind mounts) or some kind of unique hash for a file would be even better (without reading the file - again, I don't care about the content being the same, just the file being the same, and reading content is slow anyway).

I'd expect this function to be called EXTREMELY often, so it should run on the order of milliseconds with a search list containing tens of thousands of paths/filenames. If I had to exhaustively test the whole list, I could probably add the paths that match to the list so that future searches would go faster (with just a direct path/name hit).

Part of me is thinking there is some system call that should just do this; I just don't know what it is.
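To make the requirement concrete, here is a minimal sketch of one candidate approach. It assumes that the device and inode numbers reported by POSIX stat() are enough to identify a file on the filesystems in question. The names file_key, file_key_from_path and same_file are hypothetical, made up for illustration. stat() follows symlinks, and hard links share an inode, so both of those cases should compare equal. Whether this holds for every bind mount or snapshot setup is something I have not verified.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical key type: the (device, inode) pair that stat() reports. */
struct file_key {
    dev_t dev;
    ino_t ino;
};

/* Fill *key for path, following symlinks.
 * Returns 0 on success, -1 if stat() fails (e.g. the path does not exist). */
int file_key_from_path(const char *path, struct file_key *key)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    key->dev = st.st_dev;
    key->ino = st.st_ino;
    return 0;
}

/* Returns 1 if both paths resolve to the same file, 0 if they do not,
 * and -1 if either path could not be examined. */
int same_file(const char *a, const char *b)
{
    struct file_key ka, kb;

    if (file_key_from_path(a, &ka) != 0 || file_key_from_path(b, &kb) != 0)
        return -1;
    return ka.dev == kb.dev && ka.ino == kb.ino;
}
```

With something like this, the keys for the search list could be computed once and kept in a hash table, so each lookup would cost one stat() call plus a table probe instead of a scan of the whole list. One caveat: inode numbers can be reused after a file is deleted, so cached keys are only trustworthy while the listed files still exist.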
Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug