Finding Uses of a Specific Open File
Often you're interested in knowing who is using a specific file. You know the path to it and you want lsof to tell you the processes that have open references to it.
Simple -- execute lsof and give it the path name of the file of interest -- e.g.,
$ lsof /etc/passwd
Caveat: this only works if lsof has permission to get the status (via stat(2)) of the file at the named path. Unless the lsof process has enough authority -- e.g., it is being run with a real User ID (UID) of root.
Further caveat: this use of lsof will fail if the stat(2) kernel syscall returns different file parameters -- particularly device and inode numbers -- than lsof finds in kernel node structures. This condition is rare and is usually documented in the FAQ.
$ lsof /etc/security/passwd lsof: status error on /etc/security/passwd: Permission denied
Finding Open Files Filling a File System
Oh! Oh! /tmp is filling and ls doesn't show that any large files are being created. Can lsof help?
Maybe. If there's a process that is writing to a file that has been unlinked, lsof may be able to discover the process for you. You ask it to list all open files on the file system where /tmp is located.
Sometimes /tmp is a file system by itself. In that case,
$ lsof /tmp
is the appropriate command. If, however, /tmp is part of another file system, typically /, then you may have to ask lsof to list all files open on the containing file system and locate the offending file and its process by inspection -- e.g.,
$ lsof / | more # or $ lsof / | grep ...
Caveat: there must be a file open to a for the lsof search to succeed. Sometimes the kernel may cause a file reference to persist, even where there's no file open to a process. (Can you say kernel bug? Maybe.) In any event, lsof won't be able to help in this case.
Finding an Unlinked Open File
A pesky variant of a file that is filling a file system is an unlinked file to which some process is still writing. When a process opens a file and then unlinks it, the file's resources remain in use by the process, but the file's directory entries are removed. Hence, even when you know the directory where the file once resided, you can't detect it with ls.
This can be an administrative problem when the unlinked file is large, and the process that holds it open continues to write to it. Only when the process closes the file will its resources, particularly disk space, be released.
Lsof can help you find unlinked files on local disks. It has an option, +L, that will list the link counts of open files. That helps because an unlinked file on a local disk has a zero link count. Note: this is NOT true for NFS files, accessed from a remote server.
You could use the option to list all files and look for a zero link count in the NLINK column -- e.g.,
$ lsof +L COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME ... less 25366 abe txt VREG 6,0 40960 1 76319 /usr/... ... > less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a)
Better yet, you can specify an upper bound to the +L option, and lsof will select only files that have a link count less than the upper bound. For example:
$ lsof +L1 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a)
You can use lsof's -a (AND) option to narrow the link count search to a particular file system. For example, to look for zero link counts on the /home file system, use:
$ lsof -a +L1 /home
CAUTION: lsof can't always report link counts for all file types -- e.g., it may not report them for FIFOs, pipes, or sockets. Remember also that link counts for NFS files on an NFS client host don't behave as do link counts for files on local disks.
Finding Processes Blocking Umount
When you need to unmount a file system with the umount command, you may find the operation blocked by a process that has a file open on the file systems. Lsof may be able to help you find the process. In response to:
$ lsof <file_system_name>
Lsof will display all open files on the named file system. It will also set its exit code zero when it finds some open files and non-zero when it doesn't, making this type of lsof call useful in shell scripts. (See section 16.)
Consult the output of the df command for file system names.
See the caveat in the preceding section about file references that persist in the kernel without open file traces. That situation may hamper lsof's ability to help with umount, too.
Finding Listening Sockets
Sooner or later you may wonder if someone has installed a network server that you don't know about. Lsof can list for you all the network socket files open on your machine with:
$ lsof -i
The -i option without further qualification lists all open Internet socket files. You can add network names or addresses, protocol names, and service names or port numbers to the -i option to refine the search. (See the next section.)
Finding a Particular Network Connection
When you know the source or destination of a network connection whose open files and process you'd like to identify, the -i option may help.
If, for example, you want to know what process has a connection open to or from the Internet host named aaa.bbb.ccc, you can ask lsof to search for it with:
$ lsof -firstname.lastname@example.org
If you're interested in a particular protocol -- TCP or UDP -- and a specific port number or service name, you can add those discriminators to the -i information:
$ lsof -iTCP@aaa.bbb.ccc:ftp-data
If you're interested in a particular IP version -- IPv4 or IPv6 -- and your UNIX dialect supports both (It does if "IPv" appears in the lsof -h output.), you can add the '4' or '6' selector immediately after -i:
$ lsof -i4 $ lsof -i6
Identifying a Netstat Connection
How do I identify the process that has a network connection described in netstat output? For example, if netstat says:
Proto Recv-Q Send-Q Local Address Foreign Address (state) tcp 0 0 vic.1023 ipscgate.login ESTABLISHED
What process is connected to service name
login on ipscgate?
Use lsof's -i option:
$ lsof -iTCP@ipscgate:login COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login ...
There's another way. Notice the 0x10144168 in the DEVICE column of the lsof output? That's the protocol control block (PCB) address. Many netstat applications will display it when given the -A option:
$ netstat -A PCB Proto Recv-Q Send-Q Local Address Foreign Address (state) 10144168 tcp 0 0 vic.1023 ipscgate.login ESTABLISHED ...
Using the PCB address, lsof, and grep, you can find the process this way, too:
$ lsof -i | grep 10144168 rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login ...
If the file is a UNIX socket and netstat reveals and address for it, like this Solaris 11 example:
$ netstat -a -f unix Active UNIX domain sockets Address Type Vnode Conn Local Addr Remote Addr ffffff0084253b68 stream-ord 0000000 0000000
Using lsof's -U option and its output piped to a grep on the address yields:
$ lsof -U | grep ffffff0084253b68 squid 1638 nobody 12u unix 18,98 0t10 9437188 /devices/pseudo/tl@0:ticots->0xffffff0084253b68 stream-ord
Finding Files Open to a Named Command
When you want to look at the files open to a particular command, you can look up the PID of the process running the command and use lsof's -p option to specify it.
$ lsof -p <PID>
However, there's a quicker way, using lsof's -c option, provided you don't mind seeing output for every process running the named command.
$ lsof -c <first_characters_of_command_name_that_interest_you>
The lsof -c option is useful when you want to see how many instances of a given command are executing and what their open files are. One useful example is for the sendmail command.
$ lsof -c sendmail
Deciphering the Remote Login Trail
If the network connection you're interested in tracing has been initiated externally and is connected to an rlogind, sshd, or telnetd process, asking lsof to identify that process might not give a wholly satisfying answer. The report may be that the connection exists, but to a process owned by root.
How do you get from there to the login name really using the connection? You have to know a little about how real and pseudo ttys are paired in your system, and then use several lsof probes to identify the login.
This example comes from a Solaris 2.4 system, named klaatu.cc. I've logged on to it via rlogin from lsof.itap. The first lsof probe,
$ lsof -email@example.com
yields (among other things):
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023 ...
This confirms that a connection exists. A second lsof probe shows:
$ lsof -p7362 COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME ... in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023 ... in.rlogin 7362 root 3u VCHR 23, 0 0t66 52928 /devices/pseudo/clone@0:ptmx->pckt->ptm
7362 is the Process ID (PID) of the in.rlogin process, discovered in the first lsof probe. (I've abbreviated the output to simplify the example.) Now comes a need to understand Solaris pseudo-ttys. The key indicator is in the DEVICE column for FD 3, the major/minor device number of 23,0. This translates to /dev/pts/0, so a third lsof probe,
$ lsof /dev/pts/0 COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME ksh 7364 abe 0u VCHR 24, 0 0t2410 53410 /dev/pts/../../devices/pseudo/pts@0:0
shows in part that login abe has a ksh process on /dev/pts/0. (The NAME that lsof shows is not /dev/pts/0 but the full expansion of the symbolic link that lsof finds at /dev/pts/0.)
Here's a second example, done on an HP-UX 9.01 host named ghg.ecn. Again, I've logged on to it from lsof.itap, so I start with:
$ lsof -firstname.lastname@example.org COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME rlogind 10214 root 0u inet 0x041d5f00 0t1536 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023 ...
$ lsof -p10214 COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME ... rlogind 10214 root 0u inet 0x041d5f00 0t2005 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023 ... rlogind 10214 root 3u VCHR 16,0x000030 0t2037 24642 /dev/ptym/ptys0
Here the key is the NAME /dev/ptym/ptys0. In HP-UX 9.01 tty and pseudo tty devices are paired with the names like /dev/ptym/ptys0 and /dev/pty/ttys0, so the following lsof probe is the final step.
$ lsof /dev/pty/ttys0 COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME ksh 10215 abe 0u VCHR 17,0x000030 0t3399 22607 /dev/pty/ttys0 ...
Here's a third example for an AIX 4.1.4 system. I've used telnet to connect to it from lsof.itap.purdue.edu. I start with:
$ lsof -email@example.com COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME ... telnetd 15616 root 0u inet 0x05a93400 0t5156 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369
Then I look at the telnetd process:
$ lsof -p15616 COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME ... telnetd 15616 root 0u inet 0x05a93400 0t5641 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369 ... telnetd 15616 root 3u VCHR 25, 0 0t5493 103 /dev/ptc/0
Here the key is /dev/ptc/0. In AIX it's paired with /dev/pts/0. The last probe for that shows:
$ lsof /dev/pts/0 COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME ... ksh 16642 abe 0u VCHR 26, 0 0t6461 360 /dev/pts/0
The idrlogin.perl Scripts
There's another, perhaps easier way, to go about the job of tracing a network connection. The lsof distribution contains two Perl scripts, idrlogin.perl (Perl 4) and idrlogin.perl5 (Perl 5), that use lsof field output to display values for shells that are parented by rlogind, sshd, or telnetd, or connected directly to TCP sockets. The lsof test suite contains a C library that can be adapted for use with C programs that need to call lsof and process its field output.
The two Perl scripts use the lsof -R option; it causes the paRent process ID (PPID) to be listed in the lsof output. The scripts identify all shell processes -- e.g., ones whose command names end in ``sh'' -- and determine if: 1) the ultimate ancestor process before a PID greater than 2 (e.g., init's PID is 1) is rlogind, sshd, or telnetd; or 2) the shell process has open TCP socket files.
Here's an example of output from idlogin.perl on a Solaris 2.4 system:
centurion: 1 = cd src/lsof4/scripts centurion: 2 = ./idrlogin.perl Login Shell PID Via PID TTY From oboyle ksh 12640 in.telnetd 12638 pts/5 opal.cc.purdue.edu icdtest ksh 15158 in.rlogind 15155 pts/6 localhost sh csh 18207 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu root csh 18242 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu trouble ksh 19208 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu abe ksh 21334 in.rlogind 21332 pts/2 lsof.itap.purdue.edu
The scripts assume that its parent directory contains an executable lsof. If you decide to use one of the scripts, you may want to customize it for your local lsof and perl paths.
Note that processes executing as remote shells are also identified.
Here's another example from a UnixWare 7.1.0 system.
tweeker: 1 = cd src/lsof4/scripts tweeker: 9 = ./idrlogin.perl Login Shell PID Via PID TTY From abe ksh 9438 in.telnetd 9436 pts/3 lsof.itap.purdue.edu
Watching an Ftp or Rcp Transfer
The nature of the Internet being one of unpredictable performance at times, occasionally you want to know if a file transfer, being done by ftp or rcp, is making any progress.
To use lsof for watching a file transfer, you need to know the PID of the file transfer process. You can use ps to find that. Then use lsof,
$ lsof -p<PID>
to examine the files open to the transfer process. Usually the ftp files or interest are at file descriptors 9 and 10 or 10 and 11; for rcp, 3 and 4. They describe the network socket file and the local data file.
If you want to watch only those file descriptors as the file transfer progresses, try these lsof forms (for ftp in the example):
$ lsof -p<PID> -ad9,10 -r # or $ lsof -p<PID> -ad10,11 -r
Some options need explaining:
specifies that lsof is to restrict its attention to the process whose ID is . You can specify a set of PIDs by separating them with commas.
$ lsof -p 1234,5678,9012
-a specifies that lsof is to AND its tests together. The two tests that are specified are tests on the PID and tests on file descriptions (
d9,10 specifies that lsof is to test only file descriptors 9 and 10. Note that the
-is absent, since
-ais a unary option and can be followed immediately by another lsof option.
-r tells lsof to list the requested open file information, sleep for a default 15 seconds, then list the open file information again. You can specify a different time (in seconds) after -r and override the default. Lsof issues a short line of equal signs between each set of output to distinguish it.
For an rcp transfer, the above example becomes:
$ lsof -p<PID> -ad3,4 -r
Listing Open NFS Files
Lsof will list all files open on remote file systems, supported by an NFS server. Just use:
$ lsof -N
Note, however, that when run on an NFS server, lsof will not list files open to the server from one of its clients. That's because lsof can only examine the processes running on the machine where it is called -- i.e., on the NFS server.
If you run lsof on the NFS client, using the -N option, it will list files open by processes on the client that are on remote NFS file systems.
Listing Files Open by a Specific Login
If you're interested in knowing what files the processes owned by a particular login name have open, lsof can help.
$ lsof -u<login> # or $ lsof -u<User ID number>
You can specify either the login name or the UID associated with it. You can specify multiple login names and UID numbers, mixed together, by separating them with commas.
$ lsof -u548,abe
On the subject of login names and UIDs, it's worth noting that lsof can be told to report either. By default it reports login names; the -l option switches reporting to UIDs. You might want to use -l if login name lookup is slow for some reason.
Ignoring a Specific Login
The -u option can also be used to direct lsof to ignore a
specific login name or UID, or a list of them. Simply prefix
the login names or UIDs with a
^ character, as you might do
in a regular expression. The
^ prefix is useful, for example,
when you want to have lsof ignore the files open to system
processes, owned by the root (UID 0) login. Try:
$ lsof -u ^root # or $ lsof -u ^0
Listing Files Open to a Specific Process Group
There's a Unix collection of processes called a process group. The name indicates that the processes of the group have a common association and are grouped so that a signal sent to one (e.g., a keyboard kill stroke) is delivered to all.
This causes Unix to create a two element process group:
$ lsof | less
You can use lsof to look at the open files of all members of a process group, if you know the process group ID number. Assuming that it is 12717 for the above example, this lsof command:
$ lsof -g12717 -adcwd
would produce on a Solaris 8 system:
$ lsof -g12717 -adcwd COMMAND PID PGID USER FD TYPE DEVICE SIZE/OFF NODE NAME sshd 11369 12717 root cwd VDIR 0,2 189 1449175 /tmp (swap) sshd 12717 12717 root cwd VDIR 136,0 1024 2 /
-g12717' option specifies the process group ID of interest;
-adcwd option specifies that options are to be ANDed and
that lsof should limit file output to information about current
working directory (
Output for Other Programs
The -F option allows you to specify that lsof should describe open files with a special form of output, called field output, that can be parsed easily by a subsequent program. The lsof distribution comes with sample AWK, Perl 4, and Perl 5 scripts that post-process field output. The lsof test suite has a C library that could be adapted for use by C programs that want to process lsof field output from an in-bound pipe.
The lsof manual page describes field output in detail in its OUTPUT FOR OTHER PROGRAMS section. A quick look at a sample script in the scripts/ subdirectory of the lsof distribution will also give you an idea how field output works.
The most important thing about field output is that it is relatively homogeneous across Unix dialects. Thus, if you write a script to post-process field output for AIX, it probably will work for HP-UX, Solaris, and Ultrix as well.
Support for other formats e.g. JSON is planned.
The Lsof Exit Code and Shell Scripts
When lsof executes successfully, it returns an exit code based on the result of its search for specified files. (If no files were specified, then the successful exit code is 0 (zero).)
If lsof was asked to search for specific files, including any files on specified file systems, it returns an exit code of 0 (zero) if it found all the specified files and at least one file on each specified file system. Otherwise it returns a 1 (one) if any part of the search failed.
This behavior can be modified by calling lsof with -Q, which will tell it to provide a successful exit code of 0 (zero) even if any part of the file or filesystem search failed.
If lsof detects a generic (non-search) error during its execution, it returns an exit code of 1 (one). The -Q option will not affect this behavior.
You can use the exit code in a shell script to search for files on a file system and take action based on the result -- e.g.,
#!/bin/sh lsof <file_system_name> > /dev/null 2>&1 if test $? -eq 0 then echo "<file_system_name> has some users." else echo "<file_system_name> may have no users." fi
The -Q option can help in certain circumstances. For example, if you want to log filesystem users without caring if there are no users:
#!/bin/sh lsof -Q <file_system_name> > fs_users.log if test $? -ne 0 then echo "Error: Something actually went wrong!" 1>&2 exit 1 fi
Strange messages in the NAME column
When lsof encounters problems analyzing a particular file, it may put a message in the file's NAME column. Many of those messages are explained in the 00FAQ file of the lsof distribution.
So consult 00FAQ first if you encounter a NAME column message you don't understand. (00FAQ is a possible source of information about other unfamiliar things in lsof output, too.)
If you can't find help in 00FAQ, you can use grep to look in the lsof source files for the message -- e.g.,
$ cd .../lsof_4.76_src $ grep "can't identify protocol" *.[ch]
The code associated with the message will usually make clear the reason for the message.
If you have an lsof source tree that has been processed by the lsof Configure script, you need grep only there. If, however, your source tree hasn't been processed by Configure, you may have to look in the top-level lsof source directory and in the dialects sub-directory for the UNIX dialect you are using - e.g.,
$ cd .../lsof_4.76_src $ grep "can't identify protocol" *.[ch] $ cd dialects/Linux $ grep "can't identify protocol" *.[ch]
In rare cases you may have to look in the lsof library, too -- e.g.,
$ cd .../lsof_4.76_src $ grep "can't identify protocol" *.[ch] $ cd dialects/Linux $ grep "can't identify protocol" *.[ch] $ cd ../../lib $ grep "can't identify protocol" *.[ch]