File Nodes
Updated: 2001-05-17 10:45

Unix File System Nodes (inodes)

To understand how Unix hard links and link counts work, it helps to understand how files and directories are stored in Unix file systems.

On Unix, the collection of data that makes up the contents of a directory or a file isn't stored under a name; the data is stored as part of a data structure called an "inode".

In the inode, Unix stores information about which disk blocks belong to the contents of the directory or file, as well as information about who owns the directory or file and what access permissions it has. Every directory or file on Unix has its own inode.

Unix doesn't store the name of the directory or file in the inode. Every inode in the file system has a unique number, and the file system locates the contents of a directory or file strictly by its inode number.
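To see what an inode holds, here is a minimal Python sketch that asks the system for the inode information of a file. The function name describe_inode and the scratch file are assumptions made for illustration; os.stat and stat.filemode are standard library calls. Note that no field in the result is a name.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return some of the inode fields that os.stat() exposes for path."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                         # inode number
        "owner_uid": info.st_uid,                     # numeric owner
        "permissions": stat.filemode(info.st_mode),   # e.g. '-rw-------'
        "links": info.st_nlink,                       # number of names
        "size": info.st_size,                         # bytes of content
    }


if __name__ == "__main__":
    # Hypothetical scratch file, removed automatically afterwards.
    with tempfile.NamedTemporaryFile() as tmp:
        print(describe_inode(tmp.name))
```

The exact numbers printed will differ from system to system.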

Unix directories are very simple - they are just tables that map directory and file names to inode numbers. Given a name, the data in the directory will tell you the inode number that corresponds to that name, and that is all. To read the data that corresponds to that name, Unix still has to look in the file system to find the inode (by number), and then use the disk block numbers contained in the inode to locate the data.
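The name-to-number table of a directory can be read from a program. The following is a small sketch, assuming Python and a hypothetical helper named directory_table; os.scandir is the standard library call that reads directory entries.

```python
import os
import tempfile


def directory_table(path):
    """Return {name: inode number} for the entries of one directory."""
    with os.scandir(path) as entries:
        # scandir leaves out "." and "..", so only ordinary names appear.
        return {entry.name: entry.inode() for entry in entries}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:   # hypothetical scratch directory
        open(os.path.join(top, "a.txt"), "w").close()
        os.mkdir(os.path.join(top, "sub"))
        print(directory_table(top))
```

The table holds names and numbers and nothing else; owner, permissions, and contents have to be fetched from the inode in a second step.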

The inodes that make up directories and files look quite similar; directories are simply special kinds of files whose data blocks are tables of name-to-inode mappings. On some versions of Unix (e.g. ACADAIX) you can even open a directory as if it were a regular file and read the name-inode tables directly!

All this means that the name of a directory or file is stored separately from the inode that describes its contents. The name-inode map is stored in the data blocks of the directory that contains the name; the owner, permissions, and disk block list of the directory or file itself are stored in its own inode.

The name-inode map for a file isn't necessarily unique. A directory may contain several file names that all map to the same inode number; using any of the several names will map to the same inode number and thus to the same file in the file system. Two different directories may contain identical name-inode mappings; using either pathname will lead to the same inode number and thus to the same file.

Thus, on Unix, a file can have many names, even across different directories. You may use any of the several names of a file to find the inode for the file. Unix calls these names "pointers" or "links" to the file.

Because the file's owner and permissions are stored in the inode for the file, it doesn't matter what name you use to find the file; you always arrive at the same owner and permissions. The owner and permissions of the directory that contains a name-inode map for the file don't affect the owner and permissions of the file content itself, because the name-inode map is stored in a different place than the file content.

The permissions you need to change the name of a file are contained in the inode corresponding to the directory that contains the name. The permissions you need to change the data in the file itself are contained in the inode corresponding to the file. These are two different inodes, and each inode may be owned by a different userid.

Users Pat and Leslie can each own directories containing a name-inode map (a name) for a file whose inode is owned by Chris. Chris can control access to the file data. Pat and Leslie can only manipulate the name-inode maps in their directories. They can change their names for the file in their directories, but they can't affect the file data itself.
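Changing a name only edits a directory; the inode is untouched. The sketch below, assuming Python and the hypothetical names rename_keeps_inode, report.txt, and final.txt, renames a file and compares the inode number and mode before and after.

```python
import os
import tempfile


def rename_keeps_inode(directory):
    """Rename a file; return (inode before, inode after, mode unchanged?)."""
    old = os.path.join(directory, "report.txt")   # hypothetical names
    new = os.path.join(directory, "final.txt")
    with open(old, "w") as f:
        f.write("contents\n")
    before = os.stat(old)
    os.rename(old, new)    # changes the directory's table, not the inode
    after = os.stat(new)
    return before.st_ino, after.st_ino, before.st_mode == after.st_mode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        print(rename_keeps_inode(top))
```

The two inode numbers should be equal, since only the name-inode map in the directory was rewritten.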

The "ln" command will create a new name-inode map in a directory. Given an existing file name, this command will create another name that maps to the same inode number. This is called "making a hard link" to a file.

For each inode, no matter whether the inode is a directory or a file inode, a link count in the inode keeps track of how many name-inode maps, in any directory, point to that inode. If an inode has only one name-inode map (only one name), its link count is one. If the inode has two name-inode maps (two names), its link count is two, whether the two names are in the same directory or in different directories.
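Making a hard link and watching the link count can be sketched in a few lines of Python. The function name link_demo and the file names first and second are assumptions for illustration; os.link is the standard library call that creates a hard link.

```python
import os
import tempfile


def link_demo(directory):
    """Give one file a second name; return (same inode?, link count)."""
    first = os.path.join(directory, "first")      # hypothetical names
    second = os.path.join(directory, "second")
    with open(first, "w") as f:
        f.write("hello\n")
    os.link(first, second)     # like the command: ln first second
    a, b = os.stat(first), os.stat(second)
    return a.st_ino == b.st_ino, b.st_nlink


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        print(link_demo(top))   # (True, 2)
```

Both names lead to one inode, and that inode now records two links.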

The "rm" command will delete a name-inode map from a directory. When a link count goes to zero in an inode, that means that no directory points to the inode and Unix is free to release and reclaim the disk space used by the inode and its associated disk blocks (once no running process still has the file open).

Note that the "rm" command does not delete a file - it only deletes a name-inode map for a file. Only when all the name-inode maps are gone does the actual file data space get reclaimed.
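The following sketch, assuming Python and the hypothetical names unlink_demo, first, and second, removes one of two names for a file and shows that the data is still reachable through the other name. os.unlink is the standard library call that removes a name.

```python
import os
import tempfile


def unlink_demo(directory):
    """Remove one of two names; return (old name exists?, links, data)."""
    first = os.path.join(directory, "first")      # hypothetical names
    second = os.path.join(directory, "second")
    with open(first, "w") as f:
        f.write("still here\n")
    os.link(first, second)
    os.unlink(first)           # like the command: rm first
    count = os.stat(second).st_nlink
    with open(second) as f:
        data = f.read()
    return os.path.exists(first), count, data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        print(unlink_demo(top))   # (False, 1, 'still here\n')
```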

Anyone can create a link to any file to which they have access. They don't need to be able to read or write the file itself to make the link; they only need to be able to access the inode of the file, and they need write permission on the directory in which the name-inode map (the name, or "link") is being created. (Also, unrelated to permissions, the inode of the thing to which the link is being made must be on the same file system as the directory that will contain the link. See "Many File Systems", below.)

Unix permits you to give files many names ("links"), but not directories. You are not allowed to create a hard link to a directory. Each directory inode may appear under exactly one name in exactly one parent directory. This restriction means that every sub-directory has only one parent directory, and that means the special name ".." (dot dot) in a sub-directory always refers unambiguously to its unique parent directory.

(This directory linking restriction prevents loops and cycles in the file system tree, preventing cases where a sub-sub-directory might contain a link back up to a parent directory. Many things are simpler if the file system tree has no loops or cycles.)

Every directory contains the special name "." (dot), a shorthand name that is a map to the inode of the directory itself. The smallest link count of any Unix directory is therefore two: count one link for the unique name-inode map in the parent directory that gives the directory its Unix "name", and count another link for the "." (dot) map in the directory itself. Every directory must have these two names.
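The special names can be checked by comparing inode numbers. This is a small sketch, assuming Python and the hypothetical names dot_entries and child.

```python
import os
import tempfile


def dot_entries(parent):
    """Return ("." is the directory itself?, ".." is the parent?)."""
    child = os.path.join(parent, "child")         # hypothetical name
    os.mkdir(child)

    def ino(path):
        return os.stat(path).st_ino

    dot_is_self = ino(os.path.join(child, ".")) == ino(child)
    dotdot_is_parent = ino(os.path.join(child, "..")) == ino(parent)
    return dot_is_self, dotdot_is_parent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        print(dot_entries(top))   # (True, True)
```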

A directory may have sub-directories. Since the special name ".." (dot dot) in every one of those sub-directories is a link to the inode number of the parent directory, the link count of the parent directory is increased by one for every sub-directory the parent contains. (Remember - the link count counts how many name-inode maps point to this inode, and that includes the special "." (dot) and ".." (dot dot) name-inode maps!) A directory with five sub-directories will show a link count of 2+5=7.

Note that creating files in a directory does not affect the link count of the directory, since the new name-inode maps point to the inodes of the files, not to the inode of the directory. The files themselves will have their link counts increased by one by virtue of having new name-inode maps created in this directory, but the parent directory link count will not be affected.
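The directory link count arithmetic can be observed with a short Python sketch. The names directory_link_counts, top, sub0 and file0 are assumptions for illustration. One caution: the counts described here are the traditional Unix behaviour, and some newer file systems do not maintain them (some always report 1 for a directory), so the numbers may differ on your system.

```python
import os
import tempfile


def directory_link_counts(parent, subdirs=5, files=3):
    """Return a directory's link count when empty, after adding
    sub-directories, and after adding ordinary files."""
    top = os.path.join(parent, "top")             # hypothetical name
    os.mkdir(top)
    empty = os.stat(top).st_nlink
    for i in range(subdirs):
        os.mkdir(os.path.join(top, f"sub{i}"))    # each adds a ".." link
    with_subdirs = os.stat(top).st_nlink
    for i in range(files):
        open(os.path.join(top, f"file{i}"), "w").close()
    with_files = os.stat(top).st_nlink
    return empty, with_subdirs, with_files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(directory_link_counts(scratch))
```

On a traditional Unix file system this should print (2, 7, 7): two links for the empty directory, five more for the five sub-directories, and no change for the three files.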

Example - Files, Directories, and Inodes

Suppose the root directory has inode number #2. Here is a small part of a Unix file system tree, showing hypothetical inode numbers of some directories and files:

inode #2
`.` (dot)	2
`..` (dot dot)	2
home	123
bin	555
usr	654

inode #555
`.` (dot)	555
`..` (dot dot)	2
rm	546
ls	984
cp	333
ln	333
mv	333

inode #123
`.` (dot)	123
`..` (dot dot)	2
ian	111
stud0002	755
stud0001	883
stud0003	221

Note how one directory named bin (#555) has three name-to-number maps for the same inode (#333). All three names (cp, ln, mv) refer to the same inode number, in this case a file containing an executable program. (This program is one that looks at its name and behaves differently depending on which name you use to call it.)

inode #111
`.` (dot)	111
`..` (dot dot)	123
.profile	334
.login	335
.logout	433

inode #333
Disk blocks for the cp / ln / mv file (link count: 3)

inode #335
Disk blocks for the .login file (link count: 1)
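The tables in this example can be modelled as a toy data structure. The sketch below is an illustration in Python, not how a real file system is implemented; the names DIRECTORIES, ROOT, resolve, and link_count are assumptions. Each directory is a table from names to inode numbers, and a pathname is resolved one name at a time, starting at the root. Link counts are computed only from the four tables shown above, so they are meaningful only for inodes whose every link appears in those tables.

```python
# Directory tables copied from the example; keys are directory inode numbers.
DIRECTORIES = {
    2: {".": 2, "..": 2, "home": 123, "bin": 555, "usr": 654},
    555: {".": 555, "..": 2, "rm": 546, "ls": 984,
          "cp": 333, "ln": 333, "mv": 333},
    123: {".": 123, "..": 2, "ian": 111,
          "stud0002": 755, "stud0001": 883, "stud0003": 221},
    111: {".": 111, "..": 123, ".profile": 334,
          ".login": 335, ".logout": 433},
}
ROOT = 2


def resolve(path):
    """Follow an absolute pathname one name at a time; return the inode number."""
    inode = ROOT
    for name in path.strip("/").split("/"):
        if name:                          # skip the empty piece left by "/"
            inode = DIRECTORIES[inode][name]
    return inode


def link_count(target):
    """Count the name-inode maps in the known tables that point at target."""
    return sum(1 for table in DIRECTORIES.values()
               for inode in table.values() if inode == target)


if __name__ == "__main__":
    print(resolve("/bin/cp"), resolve("/bin/mv"))   # 333 333
    print(resolve("/home/ian/.login"))              # 335
    print(link_count(333), link_count(335))         # 3 1
    print(link_count(111))                          # 2
```

Directory #111 (ian) has no sub-directories, so its count is the minimum of two: the name "ian" in #123 and its own "." (dot).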

Example - Many Names, Same Inode

Here are two shell programs that are each linked into different directories under different names. The only way you can tell which names point to the same program files is by looking at the inode numbers using the "-i" option to ls:

# ls -i /sbin/sh /usr/bin/sh
    136724 /sbin/sh         279208 /usr/bin/sh
# ncheck -i 279208,136724
/dev/dsk/c0t3d0s0:
279208  /usr/lib/rsh
136724  /sbin/jsh
136724  /sbin/sh
279208  /usr/bin/jsh
279208  /usr/bin/sh

The inode numbers show which names are links to the same content. The ncheck command is usable only by the Super User. It walks the entire disk partition, looks at all the directories and finds all the names that link to a particular inode. (It's both slow and expensive.)
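An ordinary user can do a slower, partial version of this search by walking a directory tree and comparing inode numbers. This is only a sketch of the idea, assuming Python; it is not how ncheck works internally, and it sees only the directories the user can read. The names names_for, sh, lib, and rsh are assumptions for illustration.

```python
import os
import tempfile


def names_for(top, target):
    """Return every path under top that is a link to the same inode as target."""
    wanted = os.lstat(target)
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            # Compare the device as well: inode numbers repeat across
            # different file systems.
            if (info.st_dev, info.st_ino) == (wanted.st_dev, wanted.st_ino):
                matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        sh = os.path.join(top, "sh")                  # hypothetical names
        open(sh, "w").close()
        os.mkdir(os.path.join(top, "lib"))
        os.link(sh, os.path.join(top, "lib", "rsh"))
        print(names_for(top, sh))
```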

Damage - Orphans and Lost+Found

When a Unix file system suffers damage, one or more nodes may become unreadable. If the damaged nodes are file nodes, the file data blocks pointed to by those nodes will be missing or incomplete. If any of the nodes are directory nodes, containing the names of files and sub-directories, the files and sub-directories that were once pointed to by those nodes will lose their names and become "orphans" - nodes with nothing pointing to them.

At boot time, the Unix file-system checking program, fsck, notices the existence of files and sub-directories that no longer have names. (It finds file nodes with a positive link count but can't find any directories that link to the nodes.) fsck gives the orphans false names (most Unix systems create new names that are just the node numbers) and links them into a special directory named "lost+found" when the system reboots itself. The system admin must go into the directory and figure out what the files are, what their names are, and where they belong.

Note that only the names are lost when a directory is damaged. The owner, permissions, and access times are all stored in each file's own node, not in the directory, and are not lost.

Many File Systems

A Unix file system is equivalent to a single disk partition. Each Unix file system has its own set of node numbers. Since the overall hierarchical tree on a Unix system may transparently include pieces from several file systems (from several partitions), some items in the hierarchical tree will appear to have the same node numbers, but will actually be different files residing on different file systems.

A directory's name-to-number mapping applies only within a single Unix file system. It isn't possible for a directory to map to a node number in a different file system (i.e. in a different disk partition). A special "mount" command is used to splice together different file systems into one hierarchical tree.

The "-i" option to the "ls" command shows which node number is associated with a file or directory; the "df" command can show on which disk partition (file system) the file or directory resides:

$ df / /usr
Filesystem    512-blocks      Free %Used    Mounted on
/dev/hd4           49152     10280   80%    /
/dev/hd2         2236416     92928   96%    /usr

Some files under / and under /usr may appear to have the same node numbers, but being on different file systems, they are really different files.
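Since node numbers repeat across file systems, a program that wants to know whether two names refer to the same file has to compare the file system as well as the node number. Here is a minimal Python sketch; the names file_identity and same_file are assumptions for illustration, and st_dev is the field in which os.stat reports the device (file system) holding the file. The standard library function os.path.samefile makes the same comparison.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode number); together they identify one file."""
    info = os.stat(path)
    return info.st_dev, info.st_ino


def same_file(a, b):
    """True if the two names lead to the same inode on the same file system."""
    return file_identity(a) == file_identity(b)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        one = os.path.join(top, "one")                # hypothetical names
        two = os.path.join(top, "two")
        alias = os.path.join(top, "alias")
        open(one, "w").close()
        open(two, "w").close()
        os.link(one, alias)
        print(same_file(one, alias), same_file(one, two))   # True False
```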

Web Author: Ian! D. Allen idallen@idallen.ca

This work is licensed under a Creative Commons License.