brent timothy saner on 20 Nov 2011 11:55:17 -0800

Re: [PLUG] How to Find Most Used Files

On 11/20/11 14:24, Casey Bralla wrote:
> I'd like to get records of which files are accessed/read/written to the
> most over a period of time.
> I'm not looking for which files are open at any one time, but rather
> which file has the most traffic to/from it over time.
> Anybody know how to do this?

Off the top of my head, the only way I can think of to do that would be
this:

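1. daily via cron (assuming /home/user/ exists; if you put the command
directly on a crontab line, write each % as \%, since cron treats a bare
% as a newline),

find /path/to/directory/ -atime -1 -print | sort > /home/user/`date '+%m%d%Y'`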

2. after a couple of days of this cron job running, run this:

cd /home/user/; for i in `ls -1A`; do cat "$i"; done > /tmp/filelist.common1; sort /tmp/filelist.common1 | uniq -c | sort -rn >

(or remove the very last redirection if you want it printed out.)
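
The two steps above could also be wrapped in a small script. This is only a rough sketch, not something I've run in production: the function names take_snapshot and report are made up, the directories are passed in as arguments, and the find test (-atime or -mtime) is a parameter so the same code can collect either kind of list. It assumes file names contain no newlines.

```shell
#!/bin/sh
# Sketch only: take_snapshot and report are hypothetical helper names.

# Write today's list of recently touched files into a snapshot directory.
# $1 = directory to scan
# $2 = directory that holds the daily lists (created if missing)
# $3 = find test to use, -atime or -mtime (defaults to -atime)
take_snapshot() {
    watch_dir=$1
    snapshot_dir=$2
    time_test=${3:--atime}
    mkdir -p "$snapshot_dir" || return 1
    find "$watch_dir" "$time_test" -1 -print | sort \
        > "$snapshot_dir/$(date '+%m%d%Y')"
}

# Count how many daily lists each path appears in, busiest first.
# The combined lists are sorted before uniq -c, because uniq only
# collapses lines that are adjacent.
# $1 = directory that holds the daily lists
report() {
    snapshot_dir=$1
    cat "$snapshot_dir"/* | sort | uniq -c | sort -rn
}
```

From cron you would call something like take_snapshot /path/to/directory /home/user once a day, and later run report /home/user by hand.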

a couple notes on this:

1. you'll want to keep the /home/user/ intact; the more files
you have in there the more accurate it will be.

2. output should be in the following format

10 ./directory/most.popular.file
8  ./second.most.popular.file
5  ./third.most.popular.file


the number at the beginning is a count of how often it occurs in the
daily lists.

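3. THIS IS IMPORTANT: the disk must be mounted so that atimes actually
get updated (i.e. NOT with the "noatime" option), otherwise you'll have
to rely only on modified time in the find command (-mtime). a lot of
systems are mounted noatime for performance, and on Linux kernels 2.6.30
and later the default is relatime, which only updates the atime when the
old atime is not newer than the mtime/ctime or is more than a day old, so
atime-based counts are approximate at best.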

4. you may want to run several versions of this so you have one set of
statistics for atime, one for mtime, etc.
just remember to have unique directory sets for them (i.e.
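
Since note 3 hinges on how the filesystem is mounted, here is a rough way to check before trusting any atime numbers. It's only a sketch: atime_mode is a made-up helper name, and it just classifies a comma-separated mount option string. "unspecified" means none of the three options was listed, which does not by itself tell you what the kernel is doing.

```shell
#!/bin/sh
# Sketch only: atime_mode is a hypothetical helper name.

# Classify a comma-separated mount option string by its atime setting.
# $1 = option string, e.g. "rw,noatime,data=ordered"
atime_mode() {
    # Wrap in commas so each option can be matched as a whole word
    # (this keeps "nodiratime" from being mistaken for "noatime").
    case ",$1," in
        *,noatime,*)     echo noatime ;;
        *,strictatime,*) echo strictatime ;;
        *,relatime,*)    echo relatime ;;
        *)               echo unspecified ;;
    esac
}
```

Assuming findmnt from util-linux is installed, you could feed it the live options with atime_mode "$(findmnt -n -o OPTIONS -T /path/to/directory)". If it prints noatime, stick to the -mtime lists.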

Philadelphia Linux Users Group