Updated: 2017-01-20 00:48 EST

1 Disk Usage

The du command counts the number of disk blocks used by files and directories, and it descends recursively into directories. The -s option to du suppresses the per-directory listing, so that du shows you only the sum total of disk blocks in that directory, including disk usage in all subdirectories underneath it.

The sizes of the directories themselves (in disk blocks) are included in the directory totals:

$ rm -rf new
$ mkdir new
$ du new
4 new

$ date >new/foo
$ du new
8 new

$ date >new/bar
$ du new
12 new

$ mkdir new/dir
$ du new
4 new/dir
16 new

$ du -s new
16 new

$ date >new/dir/foo
$ du new
8 new/dir
20 new

$ du -s new
20 new

$ rm -r new/*
$ du new
4 new

Note that even an empty directory takes up some disk space, and some file systems allocate a minimum number of disk blocks for any non-empty file or directory (e.g. 4 blocks, above).

In this document, we will assume a minimum of 4 blocks per non-empty file or directory. An empty file, such as one created by touch, takes no disk space, since it doesn’t need any disk blocks:

$ rm -rf new; mkdir new ; du new
4 new
$ touch new/foo ; du new
4 new
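
The difference between a file's length in bytes and the space du reports for it can be checked with a short script. This is only a sketch: the scratch directory from mktemp and the variable names are illustrative assumptions, du -k is used so the unit is 1024 bytes, and the numbers you see depend on your file system.

```shell
# Scratch directory so none of your own files are touched (assumes mktemp exists).
scratch=$(mktemp -d)
touch "$scratch/empty"            # zero-length file
date >"$scratch/small"            # a few dozen bytes of text

# Length in bytes; $(( )) strips any padding that wc may print.
empty_bytes=$(( $(wc -c <"$scratch/empty") ))
small_bytes=$(( $(wc -c <"$scratch/small") ))

# du -k reports disk usage in 1024-byte units; results vary by file system.
empty_kb=$(du -k "$scratch/empty" | cut -f1)
small_kb=$(du -k "$scratch/small" | cut -f1)

echo "empty: $empty_bytes bytes, $empty_kb KiB on disk"
echo "small: $small_bytes bytes, $small_kb KiB on disk"
rm -rf "$scratch"
```

On a file system with a 4-block minimum, the small file would typically show 4 KiB on disk even though it holds far fewer bytes.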

Given these supposedly successful commands and output:

$ rm -rf new
$ mkdir new
$ cp file1 file2 file3 new
$ du -s new
132 new

QUESTION: If I removed all the files under new, how much disk space would I free up? (This is the same as asking: How many disk blocks are used by all the files under new? You must not include the disk space used by the directory itself. Only count the file space.)

ANSWER: The total disk space used in new (including the directory itself) is 132 blocks. We know 4 blocks are used for the directory itself, leaving 132-4=128 blocks for all the files inside new. So 128 blocks would be freed up by removing all the files under new.
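
The same subtraction can be sketched as a small shell function. The function name files_only_kb and the scratch directory are hypothetical, and the sketch assumes the directory contains only files (no subdirectories), as in the question above. It uses du -sk for the total and ls -dsk for the blocks used by the directory itself, both in 1024-byte units.

```shell
# The arithmetic from the ANSWER above: total minus the directory itself.
freed=$((132 - 4))
echo "$freed blocks"

# files_only_kb DIR: usage of everything under DIR, minus DIR's own blocks.
# Assumption: DIR holds only files, no subdirectories.
files_only_kb() {
    total=$(du -sk "$1" | cut -f1)              # directory plus its contents
    self=$(ls -dsk "$1" | awk '{print $1}')     # the directory itself
    echo $((total - self))
}

# Try it on a scratch directory (assumes mktemp exists).
demo=$(mktemp -d)
date >"$demo/file1"; date >"$demo/file2"; date >"$demo/file3"
with_files=$(files_only_kb "$demo")
rm -f "$demo"/*
after_rm=$(files_only_kb "$demo")
echo "file space: $with_files KiB; after removing the files: $after_rm KiB"
rm -rf "$demo"
```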

2 Disk Usage (du), Quotas, and Linked Files

The quota command shows your disk quota, in disk blocks, if enabled. It also shows you the number of inodes you are using. A disk quota is a limit on how much disk space, and how many inodes, you can use on the system. (Quotas are enabled on the CLS.)

Linked files (files with multiple names) don’t take up extra disk blocks or inodes, and so they don’t affect the output of du or quota:

$ rm -rf new ; mkdir new ; date >new/foo
$ du new  ;  quota
8 new
Disk quotas for user cst8207a (uid 1002): 
Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
home           780  204800  512000             193       0       0        

$ ln new/foo new/bar
$ ln new/bar new/abc
$ ls -i new
3138961 abc  3138961 bar  3138961 foo

$ du new  ;  quota               # does not show any new blocks or inodes
8 new
Disk quotas for user cst8207a (uid 1002): 
Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
home           780  204800  512000             193       0       0        

$ ls -dils new/*
3138961 4 -rw-r--r-- 3 idallen idallen 29 Oct 21 13:28 new/abc
3138961 4 -rw-r--r-- 3 idallen idallen 29 Oct 21 13:28 new/bar
3138961 4 -rw-r--r-- 3 idallen idallen 29 Oct 21 13:28 new/foo

Since foo, bar, and abc are all names for the same disk blocks and the same inode, du and quota count the disk space and inode only once.

To actually free up disk space, all three names for inode 3138961 must be removed. Only then will du and quota show a reduction in space:

$ rm new/abc ; du new ; quota               # does not release disk space
8 new
Disk quotas for user cst8207a (uid 1002): 
Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
home           780  204800  512000             193       0       0        

$ rm new/bar ; du new ; quota               # does not release disk space
8 new
Disk quotas for user cst8207a (uid 1002): 
Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
home           780  204800  512000             193       0       0        

$ rm new/foo ; du new ; quota               # this releases disk space
4 new
Disk quotas for user cst8207a (uid 1002): 
Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
home           776  204800  512000             192       0       0        

To release disk space, all the names for an inode must be removed.
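
If quotas are not enabled on your system, the link count is another way to watch this happen. The sketch below is an illustration under assumptions: it works in a scratch directory from mktemp, the variable names are made up for the example, and it reads the inode number from ls -i and the link count from the second field of ls -l.

```shell
work=$(mktemp -d)
date >"$work/foo"
ln "$work/foo" "$work/bar"        # second name for the same inode
ln "$work/bar" "$work/abc"        # third name for the same inode

# ls -i prints the inode number first; ls -l prints the link count second.
ino_foo=$(ls -i "$work/foo" | awk '{print $1}')
ino_abc=$(ls -i "$work/abc" | awk '{print $1}')
links_before=$(ls -l "$work/foo" | awk '{print $2}')

rm "$work/abc" "$work/bar"        # two names gone; the data is still there
links_after=$(ls -l "$work/foo" | awk '{print $2}')
still_there=$(( $(wc -c <"$work/foo") ))

echo "inodes: $ino_foo $ino_abc"
echo "link count: $links_before -> $links_after, $still_there bytes remain"
rm -rf "$work"
```

Only when the link count would drop to zero, by removing the last name, are the disk blocks and the inode released.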

3 Finding Linked Files

For a file with multiple names (a link count greater than 1), the multiple names each lead to the same inode number, but those multiple names may appear in any directory anywhere in the disk partition:

$ rm -rf new ; mkdir -p new/dir
$ date >new/a ; date >new/b ; date >new/c
$ ln new/a new/dir/x
$ ln new/b new/dir/y
$ ln new/c new/dir/z

$ find new -type f -ls
3138990 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/c
3138989 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/b
3138988 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/a
3138990 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/dir/z
3138989 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/dir/y
3138988 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/dir/x

(Recall that the -ls expression to find, instead of -print, shows the same detailed output as you would see from ls -dils. We also use the -type f expression to limit output to only files.)

Above, the file names with the same inode number do not appear together in the output, because they are in different directories. It is hard to notice that the two names new/c and new/dir/z have the same inode number and so must be the same file, though we can see that they both have a link count of 2. The solution is to use find and sort to make inode numbers sort together on your screen, to make finding duplicates by eye easier:

$ find new -type f -ls | sort
3138988 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/a
3138988 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/dir/x
3138989 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/b
3138989 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/dir/y
3138990 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/c
3138990 4 -rw-r--r--   2 idallen  idallen        29 Oct 21 13:47 new/dir/z

Now, the same inode numbers sort together on your screen and it is easier to see the duplicate inode numbers and know which files are hard links to each other. We can easily see that new/c and new/dir/z have the same inode number and a link count of 2, so they must be the only two names for this file inode.
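
A plain sort compares lines as text, which groups identical inode numbers together but does not order numbers of different lengths numerically; sort -n does. The following sketch is one possible variation, not the only way to do it: it uses ls -i instead of the -ls expression to print just the inode number and the name, and it counts names and distinct inode numbers. The scratch directory and variable names are assumptions made for the example.

```shell
tree=$(mktemp -d)
mkdir "$tree/dir"
date >"$tree/a"; date >"$tree/b"
ln "$tree/a" "$tree/dir/x"
ln "$tree/b" "$tree/dir/y"

# One "inode name" line per file name, ordered numerically by inode number.
listing=$(find "$tree" -type f -exec ls -i {} + | sort -n)
echo "$listing"

# Count the names, and count the distinct inode numbers among them.
names=$(( $(echo "$listing" | wc -l) ))
inodes=$(( $(echo "$listing" | awk '{print $1}' | uniq | wc -l) ))
echo "$names names lead to $inodes inodes"
rm -rf "$tree"
```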

Since every file under new/dir/ has a link count of 2 and a second name in another directory, removing all the names in new/dir, as in rm new/dir/*, would reclaim no disk space. All that would happen is that the link count of each file would go down from 2 to 1. To reclaim the disk space, you have to remove all the names of the files, including the names that are in the other directory new/[abc]:

$ du -s new
20      new                       # see how much space is in use
$ rm new/dir/*                    # remove all those file names
$ du -s new
20      new                       # no change in space used !
$ rm new/[abc]                    # remove the other names for the files
$ du -s new
8       new                       # only now is the space reclaimed

The find command also has expressions that select files by inode number or by link count (RTFM). Newer versions of find even have a -samefile expression to find files with the same inode number as a given file. These expressions are very useful to find all the names of files that have more than one link.
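
As a sketch of what those expressions look like in practice: -links +1 selects files with a link count greater than 1, and -inum selects names that lead to a given inode number. The -samefile expression is assumed to be available, which is the case in GNU and BSD versions of find but perhaps not in yours. The scratch directory and variable names are made up for the example.

```shell
top=$(mktemp -d)
mkdir "$top/dir"
date >"$top/a"; date >"$top/b"; date >"$top/solo"
ln "$top/a" "$top/dir/x"
ln "$top/b" "$top/dir/y"

# -links +1: every name whose file has more than one link (solo is skipped).
multi=$(find "$top" -type f -links +1 | sort)
multi_count=$(( $(echo "$multi" | wc -l) ))

# -inum N: every name that leads to inode number N.
ino=$(ls -i "$top/a" | awk '{print $1}')
by_inum=$(find "$top" -inum "$ino" | sort)

# -samefile NAME: the same search without looking up the number first
# (assumption: your find supports -samefile).
by_same=$(find "$top" -samefile "$top/a" | sort)

echo "$multi_count names have more than one link:"
echo "$multi"
echo "names for inode $ino:"
echo "$by_inum"
rm -rf "$top"
```

Both of the last two searches should list the same two names, the original and its hard link in the subdirectory.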

Names for files can appear in any directory, so if you don’t really know where all the names for a file are, you may have to search every directory in the entire file system! This can take a long time. Usually you have some idea where the other name(s) might be, so you don’t have to search everything.

Author: 
| Ian! D. Allen, BA, MMath  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/
