Hard links and Unix/Linux file system index nodes (inodes)

Ian! D. Allen – www.idallen.com

Fall 2016 - September to December 2016 - Updated 2017-11-27 01:36 EST

1 Overview

A feature of most Unix/Linux file systems is the ability to have two or more names for the same file data. These names are called hard links.

1.2 File data is stored by number in inodes

To make multiple names possible, in Unix/Linux the file data is stored in a different place than the file name. The file data is given a unique number, called an index node or inode number, and the file data is stored on disk using only that number (not the name).

The names of files are stored on disk in a separate directory. Each file name is paired with just the inode number that matches the file data that goes with that name. The data is not stored with the name; only the inode number of the data is stored with the name. This makes it easy to have multiple names refer to the same inode number.

The ls command with the -i option shows, at the start of each output line, the inode number paired with each file system name. The -d option makes ls list a directory itself (with its own permissions), not the directory contents:

$ ls -ild /bin /bin/bz*
711 drwxr-xr-x 2 root adm   4096 Jan 20 02:49 /bin/
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzcat
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzip2
444 -rwxr-xr-x 1 root root  2140 Dec 15  2011 /bin/bzdiff
[...]

We see above that directory /bin has its own inode number 711. The file names /bin/bzcat and /bin/bzip2 are both paired with inode 601 and the file name /bin/bzdiff is paired with inode 444.

1.3 Two names – same inode

The matching leading 601 inode numbers above show that the pathnames /bin/bzcat and /bin/bzip2 are not two different files. These are two names for the same 601 inode – two names for the same 601 file data. The file data is stored on disk just once under inode 601 and you can access this 601 inode data using either the pathname /bin/bzcat or the pathname /bin/bzip2. If you edit /bin/bzcat, you are editing the data stored in inode 601, and the changes will also be there when you access the same inode 601 using the alternate pathname /bin/bzip2. It’s the same as one person being called both “Bob” and “Robert” – both names lead to the same person; they are two names for the same thing.
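
The same effect can be reproduced safely in a scratch directory. The following is a minimal sketch, not part of the original notes; the file names "original" and "alias" are made up for the demonstration:

```shell
#!/bin/sh
# Work in a throw-away directory that is removed when the script exits.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

echo "first line" > "$tmpdir/original"
ln "$tmpdir/original" "$tmpdir/alias"      # a second name for the same inode

# ls -i prints the inode number first; keep only that field.
inode_original=$(ls -i "$tmpdir/original" | awk '{print $1}')
inode_alias=$(ls -i "$tmpdir/alias" | awk '{print $1}')
echo "original: $inode_original  alias: $inode_alias"

# Append through one name, then count the lines through the other name.
echo "second line" >> "$tmpdir/alias"
line_count=$(wc -l < "$tmpdir/original" | tr -d ' ')
echo "lines seen through original: $line_count"
```

Both names report the same inode number, and the line appended through one name is visible through the other.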

2 Files and directories are inodes

To understand how Unix hard links and link counts work, it helps to understand how files and directories are stored in Unix file systems.

On Unix, the collection of data that makes up the contents of a directory or a file isn’t stored directly under a name; the data is stored as part of a numbered data structure called an “index node” or “inode”.

In the inode, Unix stores information about which disk blocks are allocated to store the contents of the directory or file, as well as attribute information about who owns the directory or file and what access permissions it has. Every directory or file on Unix has its storage and attributes managed by its own numbered inode.

Unix doesn’t store the name of the directory or file in the inode. Every inode in the file system has a unique number, and the file system locates the contents of a directory or file strictly by its inode number.

3 Directories give names to inodes

Directories give names to inode numbers. When you access a file by name, the system looks in a directory to see what inode number corresponds to that name, then the system goes to that inode to read the file data for that file name. There is no file data stored with the name in the directory, only an inode number.

Unix directories are very simple – they are just tables that pair names with inode numbers. Given a name, the directory will tell the system the inode number that corresponds to that name, and that is all. To read the actual data that corresponds to that name, the system still has to look in the file system to find the data for that inode (by number), and then read the data from that inode.

The inodes that contain directories and the inodes that contain files are similar. Directories are simply special kinds of inodes whose data blocks are tables of name-to-inode pairings. On some older versions of Unix (e.g. IBM AIX) you can even open a directory as if it were a regular file and read the binary name-inode tables directly!
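
You can view a directory’s name-to-inode table with ls -i. This sketch builds a small scratch directory (the names "one", "two", and "three" are invented for illustration) and counts how many distinct inodes stand behind its names:

```shell
#!/bin/sh
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

echo data > "$tmpdir/one"
ln "$tmpdir/one" "$tmpdir/two"     # second name paired with the same inode
echo other > "$tmpdir/three"       # a separate file with its own inode

# One line per directory entry: inode number, then name.
ls -i1 "$tmpdir"

# Compare the number of names with the number of distinct inode numbers.
names=$(ls -1 "$tmpdir" | wc -l | tr -d ' ')
distinct=$(ls -i1 "$tmpdir" | awk '{print $1}' | sort -u | wc -l | tr -d ' ')
echo "$names names, $distinct distinct inodes"
```

The directory holds three names but only two inodes, because two of the names are paired with the same number.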

4 The name is separate from the data

Because names are stored separately from data, the name of a file or directory is stored in a directory inode that is different from the inode that actually holds the contents of that file or directory. The name-inode pairing information is stored in a directory inode; the actual data content of the directory or file is stored in its own separate inode. Recall this listing from above (note the use of the -d option to ls to show the actual directory permissions, not the directory contents):

$ ls -ild /bin /bin/bz*
711 drwxr-xr-x 2 root adm   4096 Jan 20 02:49 /bin/
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzcat
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzip2
444 -rwxr-xr-x 1 root root  2140 Dec 15  2011 /bin/bzdiff
[...]

The /bin directory inode 711 is where the names bzcat, bzip2, and bzdiff are actually stored – names are stored in directories. The file inodes 601 and 444 are where the content of the files is stored. The names are kept in a different place from the content.

The name-inode pairing for a file isn’t necessarily unique. A directory may contain several file names that are all paired with the same inode number. Using any of these names leads to the same inode number and thus to the same file in the file system. These multiple names are called “hard links”.

The names bzcat and bzip2 in the /bin directory above are paired with the same 601 inode number, so these two names are really two names (hard links) for the same data inode 601.

Different directories may also contain name-inode pairings for the same inodes. For example, /etc/foo – in a different directory than /bin – could be a third name for inode 601:

$ ls -ild /etc/foo
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /etc/foo

Thus, on Unix, a file (the data in a file) can have many names, even across different directories. You may use any of the several names of a file to find the unique data inode for the file. Unix calls these multiple names for the same inode “hard links” to the file.

5 File attributes are kept with the data inode

File attributes such as owner, modify time, and permissions are stored as part of the inode that contains the data for the file. This means it doesn’t matter what name you use to access the file data, you always arrive at the same inode and the same owner, modify time, and permissions. All that information is kept with the file data inode, not with any of the file names.

This means that even if you have a name for one of my files in one of your directories, it doesn’t give you any permissions to modify my file since the permissions to access and change my file are kept with the file inode that belongs to me. All you have is a name; nothing more.

Having a name for a file does not necessarily give you any permissions to read or change the file. To change data in a file, you must have permissions to modify the actual inode containing the data for the file. The location of the name for the file doesn’t matter.
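
A short experiment can confirm that attributes follow the inode and not the name. In this sketch the names "report" and "other-name" are hypothetical; a permission change made through either name shows up through both:

```shell
#!/bin/sh
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

echo data > "$tmpdir/report"
ln "$tmpdir/report" "$tmpdir/other-name"

# The first ten characters of an ls -l line are the file type and permissions.
mode_of() { ls -l "$1" | cut -c1-10; }

chmod 640 "$tmpdir/report"                    # change via the first name
mode_report=$(mode_of "$tmpdir/report")
mode_other=$(mode_of "$tmpdir/other-name")
echo "report: $mode_report  other-name: $mode_other"

chmod 600 "$tmpdir/other-name"                # change via the second name
mode_after=$(mode_of "$tmpdir/report")
echo "report after chmod on other-name: $mode_after"
```

There is only one set of permissions because there is only one inode; the two names simply lead to it.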

5.1 Directories holding file names have different permissions than file data

The names of files are kept in directories that are separate from the file data. Directories have their own storage inodes, and so the directory containing a name of a file may have a different set of attributes than the inode that contains the file data itself. Revisiting our example above (note the use of the -d option to ls to show the actual directory permissions, not the directory contents):

$ ls -ild /bin /bin/bz*
711 drwxr-xr-x 2 root adm   4096 Jan 20 02:49 /bin/
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzcat
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzip2
444 -rwxr-xr-x 1 root root  2140 Dec 15  2011 /bin/bzdiff
[...]

The directory /bin (inode 711) that holds the names bzcat, bzip2, and bzdiff has different attributes than the inodes of the files whose names it holds. The attributes that apply to the directory holding the file names are not the same as the attributes that apply to the file content for those names.

Changing the attributes of the directory that contains a name-inode pairing for the file doesn’t affect the owner and permissions of the file content itself. The directory containing the name and the file content are in different inodes.

The permissions you need to change the name of a file are contained in the inode corresponding to the directory that contains the name. The permissions you need to change the data in the file itself are contained in the inode corresponding to the file data. These are two different inodes, and each inode may be owned and controlled by a different userid. An example:

555 -rw-r--r-- 3 chris chris 12 Dec  5  2013 /home/pat/foo
555 -rw-r--r-- 3 chris chris 12 Dec  5  2013 /home/leslie/foo
555 -rw-r--r-- 3 chris chris 12 Dec  5  2013 /home/chris/foo

Users Pat and Leslie can each own directories (their HOME directories) containing a name-inode pairing (a name) for a file whose actual data inode is owned by Chris. Chris retains full control of access to the file data, because the inode containing the data is owned by him. Pat and Leslie can’t write onto the file’s data inode; they don’t have permission.

Pat and Leslie can only manipulate their local name-inode pairings in their own directories that they own. They can change their own names for Chris’ file in their own directories. They can create new names (hard links) for the file, change their existing names for the file, or remove their names for the file, but they can’t affect the file data itself, unless file owner Chris changes the permissions to permit it.

Even if both Pat and Leslie remove their names for this file, Chris still has a name for it and the file continues to exist.

A name is stored separately from content, and you need one set of permissions to manipulate the directory inode that contains the name and a different set of permissions to manipulate the inode that is the content.
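
The Pat and Chris situation can be approximated in a scratch directory. This is only a sketch: one userid stands in for both users, the directories "chris" and "pat" are invented, and the file is made read-only to mimic data that the name holder may not change:

```shell
#!/bin/sh
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

mkdir "$tmpdir/chris" "$tmpdir/pat"
echo "chris data" > "$tmpdir/chris/foo"
chmod 444 "$tmpdir/chris/foo"              # no write permission on the data inode
ln "$tmpdir/chris/foo" "$tmpdir/pat/foo"   # a second name in another directory

# Renaming and removing need write permission on the directory only.
mv "$tmpdir/pat/foo" "$tmpdir/pat/bar"
rm -f "$tmpdir/pat/bar"

# The data is untouched and still reachable through the remaining name.
survivor=$(cat "$tmpdir/chris/foo")
echo "still readable via chris/foo: $survivor"
```

The rename and the removal succeed even though the data is read-only, because both operations change only the directory that holds the name.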

7 Linking, Moving, and Removing are directory operations: ln, mv, rm

Anyone can normally create a hard link (ln) to any file to which they have access (except see the earlier note on added kernel security options that may prevent this). They don’t need to be able to read or write the file data to make the link; they only need to be able to access the inode of the file and they need write permission on the directory in which the name-inode pairing (the name, or “link”) is being created. (Also, unrelated to permissions, the inode of the thing to which the link is being made must be on the same file system as the directory that will contain the link. See “File Systems”, below.)

Anyone can rename (mv) a file name if they can write on the directory holding the file name. Moving a file name doesn’t change the data; it only moves the name for the file data. The inode number stays the same after the move (unless you are moving the file between file systems). The data itself does not move and is not touched; only the name moves.

Anyone can remove (rm) a file name if they can write on the directory holding the file name. Removing a name doesn’t change the data; it only decreases the link count of the data inode by one. (Only when the link count goes to zero does the file data get released.)

Linking, moving, and removing are all purely directory operations. They only affect file names; they don’t touch or affect the data. You don’t need any permissions on the data to link, move, or remove a file name.
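
To watch the inode number and link count during these operations, you can try something like the following sketch. The helper inode_and_links and the names "a", "b", and "c" are made up for the example; the helper reads the first and third fields of ls -lid output:

```shell
#!/bin/sh
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

# Print "inode linkcount" for one pathname.
inode_and_links() { ls -lid "$1" | awk '{print $1, $3}'; }

echo data > "$tmpdir/a"
ln "$tmpdir/a" "$tmpdir/b"                  # link: count goes from 1 to 2
before=$(inode_and_links "$tmpdir/a")

mv "$tmpdir/a" "$tmpdir/c"                  # move: only the name changes
after_mv=$(inode_and_links "$tmpdir/c")

rm "$tmpdir/b"                              # remove: count drops by one
after_rm=$(inode_and_links "$tmpdir/c")

echo "after ln: $before"
echo "after mv: $after_mv"
echo "after rm: $after_rm"
```

The inode number is the same on all three lines; only the link count changes, and only when a name is added or removed.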

10 Example – Files, Directories, and Inodes

[Figure: Example – Files, Directories, and Inodes]

11 Many Names, Same Inode

Here is our earlier example again:

$ ls -ild /bin /bin/bz*
711 drwxr-xr-x 2 root adm   4096 Jan 20 02:49 /bin/
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzcat
601 -rwxr-xr-x 3 root root 31112 Dec 15  2011 /bin/bzip2
444 -rwxr-xr-x 1 root root  2140 Dec 15  2011 /bin/bzdiff
[...]

The inode numbers show which names are links to the same content. In the example above, two file names in the /bin directory are paired with the same inode number 601, and the link count on this inode is three. This is one file (one inode) with three names, two of which are visible in the listing above. There is a third name somewhere.

The only way you can tell which directory names point to the same files is by looking at the inode numbers using the -i option to ls.

If you know one name for an inode and want to find the other names, you have to walk the entire file system looking for the inode number. Use the super-user account for this, since an ordinary user cannot search directories that they have no permission to read (don’t do this on a shared machine; it’s very disk intensive):

# find / -xdev -inum 601 -print     # misses unreadable directories unless super-user
/bin/bzcat
/bin/bzip2
/bin/bunzip2

The above find command walks the entire file system, looking for names that are paired with inode 601. Doing this is both slow and disk-intensive. Some newer versions of find have a -samefile option that does the same thing without your having to look up the inode number first. It’s still a very slow and disk-intensive operation. Don’t do it unless you have to.
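
Both search styles can be tried cheaply by limiting the search to a small scratch directory instead of the whole file system. This sketch assumes a find that supports -samefile (GNU find does); the directory and file names are invented:

```shell
#!/bin/sh
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

mkdir "$tmpdir/x" "$tmpdir/y"
echo data > "$tmpdir/x/first"
ln "$tmpdir/x/first" "$tmpdir/y/second"     # second name in another directory
echo data > "$tmpdir/y/unrelated"           # same content, different inode

# Style 1: look up the inode number, then search for it.
inum=$(ls -i "$tmpdir/x/first" | awk '{print $1}')
by_inum=$(find "$tmpdir" -xdev -inum "$inum" -print | sort)

# Style 2: let find work out the inode from a known name.
by_samefile=$(find "$tmpdir" -xdev -samefile "$tmpdir/x/first" -print | sort)

echo "$by_samefile"
found=$(printf '%s\n' "$by_samefile" | wc -l | tr -d ' ')
```

Both searches list the same two pathnames. The file named "unrelated" is not listed even though its content is identical, because it is a different inode.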

12 File System Damage – Orphans and Lost+Found

When a Unix file system suffers damage, one or more inodes may become unreadable. This usually has one of two consequences:

At boot time, the Unix file-system checking program, fsck, notices the existence of files and sub-directories that have a positive link count but no longer have any names in any undamaged directories in the file system. (On an undamaged file system, an inode with a positive link count will always have that many names in directories somewhere in the file system.) These are “orphan” inodes – they have data but no name(s).

The fsck program gives the orphan inodes false names (most Unix systems create new names that are just the inode numbers) and links them into a special directory named /lost+found when the system reboots itself. The system administrator must go into the directory and figure out what the files are, what their names are, and where they belong.

Note that only the names are lost when a directory is damaged. The owner, permissions, and access times are all stored with the data, and are not lost. Only the names are lost.

You can read about a file system autopsy after a small I/O error on a disk.

13 Many File Systems – one ROOT

A Unix file system is equivalent to a single disk partition (e.g. a “drive letter” in Windows). Each Unix file system (disk partition) has its own set of inode numbers, and you can’t make hard links between different file systems. Since the overall hierarchical tree on a Unix system may transparently include pieces from several file systems (from several partitions) all connected together, some items in the tree will have the same inode numbers, but will actually be different files residing in different file systems:

$ df / /home
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1       29528020 5458708  22569364  20% /
/dev/sdc1      516060600 3008500 486837752   1% /home

$ ls -lid /lost+found /home/lost+found
11 drwx------ 2 root root 16384 Sep  7  2012 /lost+found
11 drwx------ 2 root root 16384 May 19  2013 /home/lost+found

Above, we see that the ROOT file system is on disk partition /dev/sda1 and the /home directory is actually another file system (another disk partition) /dev/sdc1. The directories /lost+found and /home/lost+found have the same inode numbers, but they are in different file systems.

A directory’s name-to-inode-number pairing applies only within a single Unix file system. It isn’t possible for a directory to pair a name with (hard link) an inode number in a different file system (i.e. in a different disk partition). Inode numbers are unique only within one file system.

The df command can show on which disk partition (file system) a file or directory resides:

$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1              1511856   1121408    313648  79% /
/dev/sdb1               198337      5663    182434   4% /mnt/ext3
/dev/sdb2                99150      5646     88384   7% /mnt/ext4

$ df /etc/passwd
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1              1511856   1121408    313648  79% /

$ df /mnt/ext3/foo
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb1               198337      5663    182434   4% /mnt/ext3

Using the above listing, we can see that the system has three file systems mounted from three different disk partitions. Pathnames under /, /mnt/ext3, and /mnt/ext4 will have some of the same inode numbers; but, being on different file systems, they are really different files. You can’t make hard links between any of these three file systems; you can only make hard links inside the file systems.
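
A script can use the same df output to check whether two pathnames share a file system before trying a hard link. The helpers mount_point_of and same_file_system below are hypothetical names, and the sketch assumes mount points and device names contain no spaces, since it picks the sixth column of df -P:

```shell
#!/bin/sh
# Print the "Mounted on" column that df reports for a pathname.
mount_point_of() { df -P "$1" | awk 'NR==2 {print $6}'; }

# Succeed when both pathnames sit under the same mount point.
same_file_system() { [ "$(mount_point_of "$1")" = "$(mount_point_of "$2")" ]; }

tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
echo data > "$tmpdir/foo"

# Compare the existing file with the directory that would hold the new name.
if same_file_system "$tmpdir/foo" "$tmpdir"; then
    verdict="hard link possible"
    ln "$tmpdir/foo" "$tmpdir/bar"
else
    verdict="different file systems: use ln -s"
fi
echo "$verdict"
```

Sharing a mount point is a necessary condition for a hard link, not a guarantee; the ln can still fail for other reasons such as directory permissions.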

The -i option to the ls command shows which inode number is associated with a file or directory:

$ ls -lid / /mnt/ext?
2 dr-xr-xr-x  22 root root 4096 Nov 22 05:03 /
2 drwxr-xr-x   3 root root 1024 Nov 24 12:21 /mnt/ext3
2 drwxr-xr-x   3 root root 1024 Nov 24 12:34 /mnt/ext4

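We see that the ROOT of each of these file systems is inode 2 (the usual root inode number on ext2, ext3, and ext4 file systems; other file system types may use a different number), and each inode 2 above is a different inode 2 since each is on a different file system residing in a different partition.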

If you want to create a link from one file system to another, you can’t use hard links but you can use symbolic links:

$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1              1511856   1121408    313648  79% /
/dev/sdb1               198337      5663    182434   4% /mnt/ext3
/dev/sdb2                99150      5646     88384   7% /mnt/ext4

$ ln /mnt/ext3/foo /mnt/ext4/bar
ln: failed to create hard link `/mnt/ext4/bar' => `/mnt/ext3/foo': Invalid cross-device link
$ ln -s /mnt/ext3/foo /mnt/ext4/bar
$ ls -l /mnt/ext4/bar
lrwxrwxrwx 1 alleni alleni 13 Mar  5 07:35 /mnt/ext4/bar -> /mnt/ext3/foo
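
The manual retry above can be wrapped in a small function. This is a rough sketch: link_or_symlink is a made-up helper, and it falls back to a symbolic link after any ln failure, not only a cross-device one, so a careful script would examine the error first:

```shell
#!/bin/sh
# link_or_symlink TARGET NAME: try a hard link first, then a symbolic link.
# Prints which kind of link was created.
link_or_symlink() {
    if ln "$1" "$2" 2>/dev/null; then
        echo "hard"
    elif ln -s "$1" "$2"; then
        echo "symbolic"
    else
        return 1
    fi
}

tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
echo data > "$tmpdir/foo"

# Both names are in one directory here, so the hard link succeeds.
kind=$(link_or_symlink "$tmpdir/foo" "$tmpdir/bar")
echo "created a $kind link"
```

Pass an absolute pathname as the target, as in the example above, so that a symbolic link created by the fallback still resolves from the directory that holds it.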

License: Creative Commons BY-NC-SA 3.0. Author: Ian! D. Allen