
[Perl] Finding Files, The Fun And Elegant Way


Posted by Artem Russakovskii on April 8th, 2009 in Awesomeness, Linux, Perl, Programming, Tutorials

Updated: October 6th, 2009

No matter what programming language you use, there comes a time when you need to search for a file somewhere on the file system. Here, I want to talk about accomplishing this task in Perl. There are many ways of doing so, most of them boring, but I want to discuss the fun and elegant way – using File::Find::Rule.

Let me briefly discuss some of the other methods first.

Limited

Using glob() (or the equivalent angle-bracket form, such as <tmp*>), you can match files with shell-style wildcards, but it does not descend into subdirectories. For example:

my @files = glob("tmp*");

I prefer glob() to <> because glob() takes any expression as its argument (the return value of a function, for example), while <> treats everything inside it as pattern text and only interpolates variables.
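To make that difference concrete, here is a small sketch. The scratch directory, the file names, and the pattern_for() helper are all made up for illustration; the only point is that glob() happily takes the result of a function call, which you cannot write inside <>.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

# Scratch directory with a few sample files (names are hypothetical).
# Assumption: the temp path contains no whitespace, because glob()
# splits its argument on whitespace into separate patterns.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(tmp1.txt tmp2.log other.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# Hypothetical helper that builds the wildcard pattern at run time.
sub pattern_for {
    my ($prefix) = @_;
    return "$dir/$prefix*";
}

# glob() evaluates its argument first, then expands the wildcards.
my @files = sort map { basename($_) } glob( pattern_for('tmp') );
print "$_\n" for @files;    # prints tmp1.txt and tmp2.log
```

With <>, the closest you can get is interpolating a variable into the pattern, as in <$dir/tmp*>. Note that a lone variable such as <$pattern> is read as a filehandle instead, which is one more reason to spell out glob().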

Boring

File::Find, which ships with Perl, is the de facto standard for searching a directory tree.

This method finds files that end in .pl in "." and "../SomeDir", following symlinks:

#!/usr/bin/perl
use strict;
use warnings;

use File::Find;
use Data::Dumper;
use File::Basename;

my @directories_to_search = ( ".", "../SomeDir" );
my @file_list;

find(
  {
    # Called once for every file and directory found.
    wanted => sub {
      # Keep names ending in .pl (case-insensitive).
      if ( basename($File::Find::name) =~ /\.pl$/i ) {
        push @file_list, $File::Find::name;
      }
    },
    follow => 1,    # follow symlinks
  },
  @directories_to_search
);
print Dumper( \@file_list );

It works fine, except it's horribly ugly and boring. Let's have a look at something more fun.

The Fun And Elegant Way

File::Find::Rule. Just have a look at this beauty.

Just like above, find all .pl files in "." and "../SomeDir", following symlinks:

use File::Find::Rule;
use Data::Dumper;
print Dumper( File::Find::Rule->name("*.pl")->file->extras( { follow => 1 } )
    ->in( ".", "../SomeDir" ) );

Same as above, but skip .svn directories entirely (this shaves off a ton of time in trees with a lot of directories):

print Dumper( File::Find::Rule
    ->not( File::Find::Rule->directory->name('.svn')->prune->discard )
    ->name("*.pl")->file->extras( { follow => 1 } )->in( ".", "../SomeDir" ) );

Find all .log files in "." that were last modified more than 24 hours ago:

my $epoch_time_1_day_ago = time() - 60*60*24;
print Dumper (File::Find::Rule->file->name("*.log")->
mtime("<$epoch_time_1_day_ago")->in('.'));
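If you are curious what that mtime rule is checking, here is a rough equivalent written with core modules only. This is a sketch of the idea, not how File::Find::Rule is implemented internally; the scratch directory and the old.log and new.log names are made up, and the file ages are faked with utime so the result is predictable.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

my $dir = tempdir( CLEANUP => 1 );
my $now = time();

# Two sample logs (hypothetical names): one two days old, one fresh.
my %age_in_seconds = ( 'old.log' => 2 * 24 * 60 * 60, 'new.log' => 0 );
for my $name ( keys %age_in_seconds ) {
    my $path = "$dir/$name";
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
    my $stamp = $now - $age_in_seconds{$name};
    utime $stamp, $stamp, $path or die "Cannot set times on $path: $!";
}

# "Older than 24 hours" means: modification time before this cutoff.
my $cutoff = $now - 24 * 60 * 60;

my @old_logs;
find(
    sub {
        # Only plain files whose name ends in .log.
        return unless -f $_ && /\.log\z/;
        my $mtime = ( stat $_ )[9];    # element 9 of stat() is mtime
        push @old_logs, $File::Find::name if $mtime < $cutoff;
    },
    $dir
);
print basename($_), "\n" for @old_logs;    # prints old.log
```

That is a fair amount of ceremony compared with the three-line File::Find::Rule version, which is rather the point.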

Be sure to read the File::Find::Rule perldoc for more options and remember: have fun with your code!

Thanks to Perlbuzz and Andy Lester for pointing me to this library a few months ago.


  • http://petdance.com/ Andy Lester

    Glad you dig the File::Find::Rule, but I've always found it to be a pain to use. Rather than having to use plugin filters, I'd rather just write my own if necessary. So I wrote File::Next.

    use File::Next;

    my $iter = File::Next::files(
        {
            file_filter    => sub { /\.pl$/ },
            descend_filter => sub { $_ ne '.svn' },
        },
        '.', '../SomeDir'
    );

    while ( my $file = $iter->() ) {
        print "$file\n";
    }

    Like your previous example, that finds all .pl files under . and ../SomeDir, without looking into the .svn directories.

  • http://www.shlomifish.org/ Shlomi Fish

    I should note that I recently began work on File-Find-Object-Rule, which is a port of File-Find-Rule to File-Find-Object. This should eventually allow F-F-O-R to overcome some of File::Find::Rule's inherent limitations, which stem from its reliance on File::Find.

    The F-F-Object-Rule interface remains largely backwards compatible with that of F-F-Rule, but some parts (like "->start()" and "->match()") have become much saner.