JP Vossen on 11 Apr 2012 19:44:04 -0700



Re: [PLUG] Sort a file tree by last modified time


Date: Wed, 11 Apr 2012 11:52:07 -0400
From: Paul Jungwirth<pj@illuminatedcomputing.com>

Does anyone know how to get a list of files (including within
subdirectories) sorted by last-modified time? I'm looking for output
roughly like this:

[Feb 15 13:12]  content/code/encc.md
[Feb 15 13:12]  content/code/solsystem
[Feb 15 13:12]  content/code/solsystem/solsystem.tar.gz
[Feb 15 13:14]  content/code/upload
[Feb 15 13:15]  content/code/upload.md
[Feb 29 10:42]  content/code/launch4j-maven-plugin.md
[Feb 29 10:44]  content/code
[Mar 23 16:08]  content/widget.html

I don't really care how the dates are formatted or whether they appear
first or last. Bonus points if I can filter by file type (e.g. show
just regular files and symlinks) and choose whether to reverse the
sorting or not (though that's easy enough by piping to tac).


Assuming GNU find, you can do it all with 'find' and 'sort' really easily.

find /path/to/files -printf '%TY-%Tm-%Td_%TH:%TM:%TS\t%y\t%s\t%p\n' | sort -r

That gives you everything you want, very simply.

Output is:
ISO-8601 date/time <tab> type <tab> size <tab> path

ISO-8601 dates sort correctly.
%y File's type (like in ls -l), U = unknown type (shouldn't happen)
	e.g., d = dir, f = file, l = symlink
%s File's size in bytes
%p File's name (the path, including the starting point given to find)

man find, search for "-printf". VERY handy, except that the "%Tk" format specifier makes you pull your hair out until you realize you need to repeat the "%T" prefix for every single field. You can't do "%TY-m-d", it's "%TY-%Tm-%Td". Which makes sense once you think about it, but until you do...hair removal. See also "%Ak" for last access time.
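
For the "bonus points" part of the question (just regular files and symlinks, oldest first), here is a rough sketch. The function name list_by_mtime is made up for illustration, and it assumes GNU find, since -printf is a GNU extension. Plain 'sort' gives oldest first, 'sort -r' gives newest first.

```shell
# Hypothetical helper (name invented for this example).
# Lists regular files and symlinks under a directory, oldest first.
# Requires GNU find for -printf.
list_by_mtime() {
    dir=${1:-.}
    # The escaped parentheses group the two -type tests so that
    # -printf applies to both of them, not just to "-type l".
    find "$dir" \( -type f -o -type l \) \
        -printf '%TY-%Tm-%Td_%TH:%TM:%TS\t%y\t%s\t%p\n' | sort
}
```

Call it as "list_by_mtime content/code". Note that without -L, a symlink is listed with its own modification time, not that of its target.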

Sample:
$ find /etc -printf '%TY-%Tm-%Td_%TH:%TM:%TS\t%y\t%s\t%p\n' | sort -r | head
2012-04-08_06:48:50.0000000000  f   33616   /etc/package.list
2012-04-07_01:18:57.0000000000  f   2378    /etc/bind/db.jpsdomain.org
2012-04-07_01:18:57.0000000000  d   4096    /etc/bind
2012-04-07_01:06:17.0000000000  f   44268   /etc/bind/db.jpsdomain.o...
2012-04-04_05:05:52.0000000000  f   590596  /etc/.bzr/checkout/dirstate
2012-04-04_05:05:52.0000000000  f   270695  /etc/.etckeeper
2012-04-04_05:05:52.0000000000  d   4096    /etc/.bzr/repository/upload
2012-04-04_05:05:52.0000000000  d   4096    /etc/.bzr/checkout/lock
2012-04-04_05:05:52.0000000000  d   4096    /etc/.bzr/branch/lock
2012-04-04_05:05:42.0000000000  f   28844   /etc/ld.so.cache

Omit ":%TS" (the colon too) to drop the seconds, whose fractional part is admittedly ugly.
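
A minute-resolution variant might look like the sketch below. The function name is hypothetical and GNU find is assumed. One caveat: files modified within the same minute now tie on the timestamp, so 'sort' falls back to the remaining fields to order them.

```shell
# Hypothetical helper (name invented for this example).
# Same listing as before, but the timestamp stops at minutes,
# so the fractional seconds never appear. Requires GNU find.
list_by_minute() {
    dir=${1:-.}
    find "$dir" -printf '%TY-%Tm-%Td_%TH:%TM\t%y\t%s\t%p\n' | sort -r
}
```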

The "-print0 | xargs -0" trick is a good one to remember, since it uses NUL separators and can thus deal with spaces in file names, but it's overkill here. Using more tools in the pipeline (e.g., 'ls', 'xargs', 'stat') than you need is likewise inefficient.
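
If you do have to cope with newlines in file names, the same NUL idea can be applied without leaving 'find' and 'sort'. This is only a sketch: the function name is made up, and it assumes GNU find (for -printf and its \0 escape) and GNU sort (for -z).

```shell
# Hypothetical helper (name invented for this example).
# Emits one NUL-terminated record per file, newest first, so a
# newline inside a file name cannot split a record in two.
# Requires GNU find (-printf) and GNU sort (-z).
list_by_mtime_nul() {
    dir=${1:-.}
    find "$dir" -printf '%TY-%Tm-%Td_%TH:%TM:%TS\t%p\0' | sort -z -r
}
```

To eyeball the result, turn the NULs back into newlines at the very end: "list_by_mtime_nul /etc | tr '\0' '\n' | head".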

The <tab> delimiter, date format and such are all trivially tweakable. CSV output would be easy, or anything else you want. Dumping CSV into a modern spreadsheet with "auto-filter" capability (go find it or Google and play with it, very cool) would be very handy.
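
As a sketch of the CSV idea (function name and column headings are my own invention, GNU find assumed):

```shell
# Hypothetical helper (name invented for this example).
# Prints a header row, then one CSV row per file, newest first.
# The path is wrapped in double quotes so embedded commas survive.
# A path that itself contains a double quote or a newline would
# still break the CSV; this sketch does not handle that.
find_csv() {
    dir=${1:-.}
    echo 'mtime,type,size,path'
    find "$dir" -printf '%TY-%Tm-%Td %TH:%TM:%TS,%y,%s,"%p"\n' | sort -r
}
```

Something like "find_csv /etc > etc-files.csv" then gives you a file to open in the spreadsheet.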

Later,
JP
----------------------------|:::======|-------------------------------
JP Vossen, CISSP            |:::======|      http://bashcookbook.com/
My Account, My Opinions     |=========|      http://www.jpsdomain.org/
----------------------------|=========|-------------------------------
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.