Suppose we wish to list the names and sizes of all the files in a directory (specified on the command line) which are larger than 50,000 bytes. We could use opendir and readdir as follows:

    $thedir = $ARGV[0];
    opendir(DIR,$thedir) || die "Couldn't open $thedir";
    while(defined($filename = readdir(DIR))){
        if(-s "$thedir/$filename" > 50000){
            printf("%s: %d\n",$filename,-s _);
        }
    }
    closedir(DIR);

Note that readdir returns only the base name of each entry, not its complete path.
Thus, when using the -s operator, it was necessary to precede the filename with the
directory name which was passed to opendir. The special filehandle _ reuses the
result of the most recent file test, so the file is not examined a second time.
The printf function was used since the expression -s _ would not be interpolated
within the double quotes.
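As a sketch of the same idea packaged for reuse, the loop can be wrapped in a subroutine. The name big_files and its two arguments are hypothetical, introduced only for illustration. This version also uses a lexical directory handle and skips anything that is not a plain file, which is an added assumption about what should count as a file.

```perl
use strict;
use warnings;

# big_files is a hypothetical helper (not a built-in): it returns a list of
# [name, size] pairs for the plain files in $dir that exceed $min bytes.
sub big_files {
    my ($dir, $min) = @_;
    opendir(my $dh, $dir) or die "Couldn't open $dir: $!";
    my @found;
    while (defined(my $name = readdir($dh))) {
        my $path = "$dir/$name";    # readdir returns only the base name
        next unless -f $path;       # skip ., .. and subdirectories
        my $size = (-s _) || 0;     # _ reuses the stat performed by -f
        push @found, [$name, $size] if $size > $min;
    }
    closedir($dh);
    return @found;
}

# Run only when a directory is supplied on the command line.
if (@ARGV) {
    printf("%s: %d\n", @$_) for big_files($ARGV[0], 50000);
}
```

Returning the results as a list, rather than printing inside the loop, leaves the caller free to sort or filter them further.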
While it would be possible to write a recursive program which descends into
all the subdirectories encountered in the while loop of the previous
example, if you need to recurse through an entire directory tree, you can use the
File::Find module to simplify the task. This module provides a function
called find, which is passed a function reference (explained below) and
a list of directories. The supplied function is then called for each file
and subdirectory found in those directories. By default, when a subdirectory is
encountered, the find function changes its working directory to that directory,
and, when your function is called, it sets $_ to the basename of the current file,
and stores the fully qualified file name in the variable $File::Find::name. This is
the notation used in perl whenever you need to refer to a variable or function
within a module, when the name of the variable or function has not been exported by
that module. The name of the current directory is similarly stored in the variable
$File::Find::dir.
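To make the roles of these three variables concrete, here is a minimal sketch that records them for every entry find visits. The helper name list_entries is hypothetical. Instead of a named function it passes an anonymous sub { ... } block, which is itself a function reference.

```perl
use strict;
use warnings;
use File::Find;

# list_entries is a hypothetical helper for this sketch: it returns one
# [dir, base, full] triple for every entry that find visits under $start,
# including the starting directory itself and any subdirectories.
sub list_entries {
    my ($start) = @_;
    my @seen;
    find(sub {
        push @seen, [ $File::Find::dir,     # directory find has changed into
                      $_,                   # base name of the current entry
                      $File::Find::name ];  # full pathname of the current entry
    }, $start);
    return @seen;
}

# Run only when a starting directory is supplied on the command line.
if (@ARGV) {
    printf("dir=%s  base=%s  full=%s\n", @$_) for list_entries($ARGV[0]);
}
```

Running this on a small directory tree shows that the function is called for directories as well as files, so a search that cares only about plain files has to test for them itself.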
For example, suppose we wish to find the largest file in a given directory or any
of its subdirectories. First, we need to write a function which will determine if a
given file is bigger than the biggest one encountered. The function passed to find
does not accept a filename as an argument; instead it relies on the variables
described in the previous paragraph, namely $_ and $File::Find::name. A function
for the current problem might look like this:

    sub bigfile{
        if(-s $_ > $biggest){
            $biggest = -s _;
            $bigfile = $File::Find::name;
        }
    }

The global variables $biggest and $bigfile are used to hold the results of the
search. To create a subroutine reference to pass to the find function, precede the
name of the function with a backslash and an ampersand (\&). The ampersand marks
the name as a subroutine, so that perl will not confuse the reference to the
function with a reference to a scalar or array. As in previous examples, we'll
assume that the starting directory is passed through the command line. Since find
changes the working directory as it recurses through the directory tree, all file
tests can be performed on the variable $_, which contains just the basename of the
file. The fully qualified pathname is stored in $File::Find::name, and is used here
to record the full name of the largest file encountered.

    use File::Find;
    find(\&bigfile,$ARGV[0]);
    print "file=$bigfile, size=$biggest\n";
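
The global variables can be avoided by giving find an anonymous subroutine that closes over lexical variables. What follows is only a sketch of that alternative, not a replacement for the program above. The name largest_file is hypothetical, and the -f test, which restricts the search to plain files, is an added assumption about what "largest file" should mean.

```perl
use strict;
use warnings;
use File::Find;

# largest_file is a hypothetical helper: it returns the full name and size of
# the largest plain file under $start, or an empty list if there is none.
sub largest_file {
    my ($start) = @_;
    my ($bigfile, $biggest) = (undef, -1);
    find(sub {
        return unless -f $_;        # ignore directories and special files
        my $size = (-s _) || 0;     # -s gives a false value for an empty file
        if ($size > $biggest) {
            ($bigfile, $biggest) = ($File::Find::name, $size);
        }
    }, $start);
    return defined $bigfile ? ($bigfile, $biggest) : ();
}

# Run only when a starting directory is supplied on the command line.
if (@ARGV) {
    my ($name, $size) = largest_file($ARGV[0]);
    print defined $name ? "file=$name, size=$size\n" : "no plain files found\n";
}
```

Because $bigfile and $biggest are declared with my inside largest_file, each call starts with fresh values, so the search can be repeated on several directories without resetting anything by hand.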