Updated: 2017-01-20 00:48 EST

1 Introduction: Files and Inodes

In Unix/Linux, a file is a sequence of bytes without structure. Any necessary structure (e.g. for a database) is added by the programs that manipulate the data in the file. Linux itself doesn’t know about the internal structure of a database file – all it does is return bytes.

1.1 Even hardware devices have file names

Unix/Linux tries its best to treat every device attached to it as if it were a list of bytes. Therefore everything, including network cards, hard drives, partitions, keyboards, printers, and plain files, is treated as a file-like object, and each has a name in the file system.

$ ls -li /dev/mem /dev/sda /dev/tty1
5792 crw-r----- 1 root kmem 1, 1 Oct 13 02:30 /dev/mem
 888 brw-rw---- 1 root disk 8, 0 Oct 13 02:30 /dev/sda
5808 crw-rw---- 1 root tty  4, 1 Oct 13 02:31 /dev/tty1

Most input and output devices, and directories, are treated as files in Linux. If you have sufficient permissions, you can directly read all these devices using their file system names. Directories are the exception: recent versions of Unix/Linux have evolved directories into non-readable (non-file) objects whose contents can only be fetched through directory-specific system calls.
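The type letter at the start of each ls -l line (c, b, d, -) comes from the mode stored in the inode. The sketch below is a minimal illustration, not part of the original notes: it reads that mode with Python's os.lstat and classifies it. The helper name describe and the file name plain.txt are made up for the example, and /dev/null is only inspected if it exists.

```python
import os
import stat
import tempfile

def describe(path):
    """Return (inode number, ls-style mode string, kind) for one path."""
    info = os.lstat(path)          # lstat: do not follow symbolic links
    mode = info.st_mode
    if stat.S_ISCHR(mode):
        kind = "character device"
    elif stat.S_ISBLK(mode):
        kind = "block device"
    elif stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISREG(mode):
        kind = "plain file"
    else:
        kind = "other"
    return info.st_ino, stat.filemode(mode), kind

with tempfile.TemporaryDirectory() as scratch:
    plain = os.path.join(scratch, "plain.txt")   # hypothetical file name
    with open(plain, "w") as handle:
        handle.write("just bytes\n")
    dir_kind = describe(scratch)[2]
    file_kind = describe(plain)[2]
    print(scratch, describe(scratch))
    print(plain, describe(plain))

# Device files look the same to stat as any other named object.
if os.path.exists("/dev/null"):
    print("/dev/null", describe("/dev/null"))
```

The inode numbers printed will differ from system to system; only the kinds are predictable.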

1.2 Index Nodes = Inodes

As with most things computer-related, things in the file system are not stored by name; they are stored by number. Linux stores the data and information about each disk object (e.g. a file or a directory) in a numbered data structure called an “index node” or inode.

Each inode is identified by a unique inode number that can be shown using the -i option to the ls command:

$ ls -l -i /usr/bin/perl*
266327 -rwxr-xr-x 2 root root 10376 Mar 18  2013 /usr/bin/perl
266327 -rwxr-xr-x 2 root root 10376 Mar 18  2013 /usr/bin/perl5.14.2
266331 -rwxr-xr-x 2 root root 45183 Mar 18  2013 /usr/bin/perlbug
266328 -rwxr-xr-x 1 root root   224 Mar 18  2013 /usr/bin/perldoc
266329 -rwxr-xr-x 1 root root   125 Mar 18  2013 /usr/bin/perldoc.stub
266330 -rwxr-xr-x 1 root root 12318 Mar 18  2013 /usr/bin/perlivp
266331 -rwxr-xr-x 2 root root 45183 Mar 18  2013 /usr/bin/perlthanks

The program /usr/bin/perl, above, is not stored on disk with its name perl; it is stored somewhere else, under inode number 266327. Unix/Linux directories are what map file system names (e.g. perl) to inode numbers (e.g. 266327). In the example above, you can see that file /usr/bin/perl is really inode number 266327 (and that another name perl5.14.2 leads to the same inode!). When you access the perl program, the system finds the perl name in a directory, paired with the inode number 266327 that holds the actual data, and then the system has to go elsewhere on disk to that inode number to access the data for the perl program. File data is stored under inode numbers, not under names.

Every file has its name entered in a directory and is assigned a unique inode number. Each file name can be mapped to only one inode number, but one inode number may have many names (as is the case with perl, above).
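The perl and perl5.14.2 pairing above can be reproduced in a scratch directory. This is a small sketch, assuming a file system that supports hard links; the file names are borrowed from the listing but the files are throwaway copies made only for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    first = os.path.join(scratch, "perl")          # hypothetical names
    second = os.path.join(scratch, "perl5.14.2")
    with open(first, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.link(first, second)        # add a second name for the same inode

    first_info = os.stat(first)
    second_info = os.stat(second)
    same_inode = first_info.st_ino == second_info.st_ino
    link_count = first_info.st_nlink
    # Both names report the same inode number and a link count of two.
    print(first_info.st_ino, second_info.st_ino, link_count)
```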

Inode numbers are specific to a file system inside a disk partition. Every file on a file system (in that partition) has a unique inode number. Numbering is done separately for each file system, so different disk partitions may have file system objects with the same inode numbers.

Every Linux file system is created new with a large set of available inodes. You can list the free inodes using df -i. Older types of file systems can never make more inodes, even if there is lots of disk space available; when all the inodes are used up, the file system can create no more files until some files are deleted to free some inodes.
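The numbers that df -i reports are also available to programs. The following sketch uses Python's os.statvfs, which wraps the statvfs system call; the helper name inode_usage is an invention for this example. Some file systems allocate inodes on demand and may report zero for both totals.

```python
import os

def inode_usage(path="/"):
    """Return (total, free, used) inode counts for the file system holding path."""
    vfs = os.statvfs(path)
    total = vfs.f_files     # inodes the file system can hold
    free = vfs.f_ffree      # inodes not yet assigned to any file
    return total, free, total - free

total, free, used = inode_usage("/")
print("inodes total:", total, "free:", free, "used:", used)
```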

2 File System Diagrams

Most diagrams showing file systems and links in Unix texts are wrong and range from confusing to seriously misleading. Here’s the truth, complete with an ASCII-art file system diagram below.

Names for inodes (names for files, directories, devices, etc.) are stored on disk in directories. Only the names and the associated inode numbers are stored in the directory; the actual disk space for whatever data is being named is managed by the inode, not stored in the directory. The names and numbers are kept in the directory; the names are not kept with the data.

In the directory, beside each name, is the index number (inode number) indicating where to find the disk space used to actually store the thing being named. You can see this name-inode pairing using ls -i:

$ ls -i /usr/bin/perl*
266327 /usr/bin/perl        266329 /usr/bin/perldoc.stub
266327 /usr/bin/perl5.14.2  266330 /usr/bin/perlivp
266331 /usr/bin/perlbug     266331 /usr/bin/perlthanks
266328 /usr/bin/perldoc

The crucial thing to know is that the names and the actual storage for the things being named are in separate places. Most texts make the error of writing Unix file system diagrams that put the names right on the things that are being named. That is misleading and the cause of many misunderstandings about Unix/Linux files and directories. Names exist one level above (separate from) the items that they name:

WRONG - names on things      RIGHT - names above things
=======================      ==========================
                                                      
    R O O T            --->         [etc,bin,home]   <-- ROOT directory
   /   |   \                         /    |      \
etc   bin   home       --->  [passwd]  [ls,rm]  [abcd0001]
 |   /   \    \                 |      /    \       |
 |  ls   rm  abcd0001  --->     |  <data>  <data>  [.bashrc]
 |               |              |                   |
passwd       .bashrc   --->  <data>                <data>

Directories are lists of names and numbers, as shown by the square-bracketed lists in the diagram on the right, above. (The actual inode numbers are omitted from this small diagram.) The name of each thing (file, directory, special file, etc.) is kept in a directory, separate from the storage space for the thing it names. This allows inodes to have multiple names and names in multiple directories; all the names can refer to the same storage space by using the same inode number.

In the correct diagram on the right, the directories give names to the objects below them in the tree. The top directory on the right is the ROOT directory inode, containing the list of names etc, bin, and home (and others). Because there is no name level above the ROOT directory to give it a name, the ROOT directory has no name!

The line leading downwards from the name bin in the ROOT directory indicates that the name bin is paired with an inode number that is another directory inode containing the list of names in the bin directory, including names ls and rm (and others). The line leading down from ls in the bin directory inode leads to the data inode for the file /bin/ls. There is no name kept with the data inode – the name is up in the directory above it.

The ROOT inode has no name because there is no directory above it to give it one! Every other directory has a name because there is a directory inode above it that contains its name.
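The right-hand diagram can be written down as a small data structure. The sketch below is a toy model, not a real file system: the inode numbers are invented (the diagram omits them), and the . and .. entries are left out just as they are in the diagram. Searching every directory for a given inode number shows where its names live, and shows that nothing names ROOT.

```python
# Each directory inode maps names to inode numbers (all numbers invented).
directories = {
    2: {"etc": 10, "bin": 11, "home": 12},   # ROOT directory
    10: {"passwd": 20},                       # the directory named "etc"
    11: {"ls": 21, "rm": 22},                 # the directory named "bin"
    12: {"abcd0001": 13},                     # the directory named "home"
    13: {".bashrc": 23},                      # the directory named "abcd0001"
}

def names_of(inode):
    """Return every (directory inode, name) pair that leads to inode."""
    return [(parent, name)
            for parent, entries in directories.items()
            for name, number in entries.items()
            if number == inode]

ls_names = names_of(21)     # the name "ls" is found up in directory 11
root_names = names_of(2)    # no directory holds a name for ROOT
print(ls_names, root_names)
```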

3 Inodes manage disk blocks

The actual data for each Unix file or directory stored on disk is managed by numbered on-disk data structures called “inodes” (index nodes). One inode is allocated for each file and each directory. Unix inodes have unique numbers, not names, and it is these numbers that are kept in directories alongside the names. The -i option to ls shows these inode numbers.

A Unix inode manages the disk storage space for a file or a directory. The inode contains a list of pointers to the disk blocks that belong to that file or directory. The larger the file or directory, the more disk block pointers it needs in the inode. Also stored in the inode are the attributes of the file or directory (permissions, owner, group, size, access/modify times, etc.), but not the name of the file or directory. Inodes have only numbers, attributes, and disk blocks – not names. The names are kept separately, in directories.

Everything in a Unix file system has a unique inode number that manages the storage for that thing: every file, directory, special file, etc. Files and directories are both managed with inodes.
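One way to see what an inode records is to look at what the stat system call returns, since it reports inode attributes. The sketch below, using Python's os.stat on a temporary file, lists the available fields; none of them is a name. The exact set of st_ fields varies between platforms.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile() as handle:
    handle.write(b"hello inode\n")
    handle.flush()                      # make sure the size is up to date
    info = os.stat(handle.name)

# Collect the attribute names the stat result exposes.
fields = sorted(name for name in dir(info) if name.startswith("st_"))
print(fields)
print("inode", info.st_ino, "size", info.st_size, "links", info.st_nlink)
```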

4 Directory inodes hold all the names

File system names are stored in directory inodes. The names are not kept in the same inodes with the things that they name. The name of a file or directory is not kept in the inode with the file attributes or pointers to disk blocks; the name is kept in a directory somewhere else.

Directories are what give names to inodes on Unix. Directories can be thought of as “files containing lists of names and inode numbers”. Files have disk blocks containing file data; directories also have disk blocks, but those blocks contain lists of names and inode numbers.

Like most other inodes, directory inodes contain attribute information about the inode (permissions, owner, etc.) and one or more disk block pointers in which to store data. However, what is stored in the disk blocks of a directory is directory data (names and inode numbers), not file data.

A Unix directory is simply a list of pairs of names and associated inode numbers. That is all – the disk blocks of Unix directories contain only names and inode numbers. The rest of the attribute information about an item named in a directory (the type, permissions, owner, etc.) is kept with the inode associated with the name. You must use the inode number from the directory to find the inode on disk to read its attribute information; reading the directory only tells you the name and inode number. (Some modern Unix/Linux file systems also cache a second copy of the inode type in the directory to speed up common file system browsing operations.)

Reading a Unix directory tells you only some names and inode numbers; you know nothing about the types, sizes, owners, or modify times of those inodes unless you actually go out to the separate inode on disk and access them to read the attributes. Without actually accessing the inode, you can’t know most of the attributes of the file system object; you can’t even know if the inode is a file inode or a directory inode.

To find out attribute information of some file system object, which is stored with the inode, not in the directory, you must first use the inode number associated with the object to find the inode of the item and look at the item’s attributes. This is why ls or ls -i are much faster than ls -l:

  1. ls or ls -i only need to read the names and inode numbers from the directory – no additional inode access is needed because no other attributes are being queried. Reading the one directory inode is sufficient.
  2. ls -l has to display attribute information, so it has to do a separate inode lookup to find out the inode attribute information for every inode in the directory. A directory with 100 names in it requires 100 separate inode lookups to fetch the attributes.

No attribute information about the things named in the directory is kept in the directory (except on those modern file systems where caching is enabled). The directory only contains pairs of names and inode numbers.
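The difference between the two kinds of listing can be sketched in Python. This is an illustration under assumptions, not how ls is implemented: short_listing and detailed_listing are made-up helper names. os.scandir returns the name and inode number taken from the directory itself (the Python documentation states that DirEntry.inode() needs no extra system call on Unix), while the detailed version must call os.lstat once per name.

```python
import os
import tempfile

def short_listing(directory):
    """Like ls -i: names and inode numbers read from the directory alone."""
    with os.scandir(directory) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)

def detailed_listing(directory):
    """Like ls -l: one extra lstat() per name to fetch the inode attributes."""
    rows = []
    for name in sorted(os.listdir(directory)):
        info = os.lstat(os.path.join(directory, name))
        rows.append((name, info.st_mode, info.st_nlink, info.st_size))
    return rows

with tempfile.TemporaryDirectory() as scratch:
    for name, text in (("a.txt", "one"), ("b.txt", "three")):
        with open(os.path.join(scratch, name), "w") as handle:
            handle.write(text)
    short = short_listing(scratch)
    detailed = detailed_listing(scratch)
    print(short)
    print(detailed)
```

A directory with 100 names costs the detailed version 100 lstat calls, which matches the reasoning above.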

To find a thing by name, the system goes to a directory inode, looks up the name in the disk space allocated to that directory, finds the inode number associated with the name, then goes out to the disk a second time and finds that inode on the disk. If that inode is another directory, the process repeats from left-to-right along the pathname until the inode of the last pathname component (on the far right in the pathname) is found. Then the disk block pointers of that last inode can be used to find the data contents of the last pathname component.

(The storage for each directory is itself managed by an inode, so the inode for the directory itself contains attribute information about the directory, not about the things named in the directory. Use ls -ld to see the attributes of the directory inode itself.)

4.1 Damaged directories create orphans

The name and inode number pairing in a Unix directory is the only connection between a name and the thing it names on disk. The name is kept separate from the data belonging to the thing it names (the actual inode on disk). If a disk error damages a directory inode or the directory disk blocks, file data is not usually lost, since the actual data for the things named in the directory is stored in inodes separate from the directory itself. If a directory is damaged, only the names of the things are lost and the inodes become “orphan” inodes without names. The storage used for the things themselves is elsewhere on disk and may be undamaged. You can run a file system recovery program such as fsck to recover the data (but not the names).

The name of an item (file, directory, etc.) and its inode number are kept in a directory. The directory storage for that name and number is managed by its own inode that is separate from the inode of each thing in the directory. The name and number are stored in the directory inode; the data for the item named is stored in its own inode somewhere else.
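The idea of an orphan can be shown with a toy model. The sketch below is not how fsck works internally; it only illustrates the bookkeeping: an inode is an orphan when it exists but no surviving directory entry refers to it. All inode numbers and the helper name orphans are invented for the example.

```python
# Made-up inode table: the kind is an attribute kept in the inode itself.
inodes = {2: "dir", 5: "dir", 31: "dir", 12: "file", 15: "file"}
directories = {
    2: {"home": 5},
    5: {"alex": 31},
    31: {"foobar": 12, "temp": 15},
}

def orphans():
    """Inodes that exist but that no surviving directory entry names."""
    named = {number for entries in directories.values()
             for number in entries.values()}
    named.add(2)   # ROOT is nameless by design, so it is not an orphan
    return sorted(set(inodes) - named)

before = orphans()
directories[31] = {}      # simulate damage: the alex directory loses its entries
after = orphans()
# The data inodes 12 and 15 still exist; only their names are gone.
print(before, after)
```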

6 Tracing Inodes in Pathnames

When you look at a Unix pathname, remember that the slashes separate names of pathname components. All the components to the left of the rightmost slash must be directories, including the “empty” ROOT directory name to the left of the leftmost slash. For example:

/home/alex/foobar

In the above example, there are three slashes and therefore four pathname components. The “empty” name in front of the first slash is the name of the ROOT directory. The ROOT directory doesn’t have a name. (Some books get around this by calling the ROOT directory “slash” or /. That is wrong. ROOT doesn’t have a name – slashes separate names.)

  1. Inside the ROOT directory is the name of the home directory.
  2. Inside the home directory is the name of the alex directory.
  3. Inside the alex directory is the name of the foobar file.

The last (rightmost) component of a pathname can be a file or a directory (or other); for this example, let’s assume foobar is a file name.

Below is a file system diagram written correctly, with the names for things shown one level above the things to which the names actually refer. Each box represents an inode; the inode number for each box is given beside the box, on the left. Inside the directory inodes you can see the pairing of names and inode numbers. (These inode numbers are made up – see your actual Unix system for the real inode numbers.) One of the inodes, #12, is not a directory; it is an inode for a file and contains the file data. The downward arrows trace two paths (hard links) to the same #12 file data, /home/alex/foobar and /home/alex/literature/barfoo.

We will trace the inodes for two pathnames in the diagram below:

  1. /home/alex/foobar
  2. /home/alex/literature/barfoo

Follow the downward-pointing arrows:

    +----+-----+-----------------------------------------+
#2  |. 2 |.. 2 | home 5 | usr 9 | tmp 11 | etc 23 | ...  |
    +----+-----+-----------------------------------------+
                  |  The inode #2 above is the ROOT directory. It has the
                  |  name "home" in it. The *directory* "home" is not
                  |  here; only the *name* is here. The ROOT directory
                  |  itself does not have a name!
                  V
    +----+-----+---------------------------------------------------+
#5  |. 5 |.. 2 | alex 31 | leslie 36 | pat 39 | abcd0001 21 | ...  |
    +----+-----+---------------------------------------------------+
                  |  The inode #5 above is the "home" directory. The name
                  |  "home" isn't here; it's up in the ROOT directory,
                  |  above. This directory has the name "alex" in it.
                  V
    +----+-----+---------------------------------------------------+
#31 |. 31|.. 5 | foobar 12 | temp 15 | literature 7 | demo 6 | ... |
    +----+-----+---------------------------------------------------+
                  |  The inode #31 above is     |
                  |  the "alex" directory. The  |
                  |  name "alex" isn't here;    |
                  |  it's up in the "home"      |
                  |  directory, above.  This    |
                  |  directory has the names    |
                  |  "foobar" and "literature"  |
                  |  in it.                     |
                  |                             V
    +----+-----+--|-------------------------------------------+
#7  |. 7 |.. 31|  |  barfoo 12 | morestuf 123 | junk 99 | ... |
    +----+-----+--|-------------------------------------------+
                  |       |  The inode #7 above is the "literature" directory.
                  |       |  The name "literature" isn't here; it's up
                  |       |  in the "alex" directory.  This directory has
                  |       |  the name "barfoo" in it.
                  |       |
                  V       V
                 *-----------*  This inode #12 on the left is a file inode.
                 | file data |  It contains the data blocks for the file.
             #12 | file data |  This file happens to have two names, "foobar"
                 | file data |  and "barfoo", but those names are not here.
                 *-----------*  The names of this file are up in the two
                                directories that point to this file, above.

The pathname /home/alex/foobar starts at the nameless ROOT directory, inode #2. It travels through two more directory inodes and stops at file inode #12. Using all four inode numbers, /home/alex/foobar could be written as #2->#5->#31->#12.

The pathname /home/alex/literature/barfoo starts at the ROOT inode and travels through three more directory inodes. It stops at the same #12 file inode as /home/alex/foobar. Using all five inode numbers, /home/alex/literature/barfoo could be written as #2->#5->#31->#7->#12.

Thus, /home/alex/foobar and /home/alex/literature/barfoo are two pathnames leading to the same inode #12 file data. The names foobar and barfoo are two names for the same file and are called “hard links”.
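The two traces can be reproduced mechanically from the directory contents in the diagram. The sketch below is a toy resolver over a Python dictionary, using the same made-up inode numbers; the "..." entries are omitted, and directories that the diagram does not show (such as usr, inode #9) are simply absent from the model. The function names trace and arrows are inventions for this example.

```python
# Directory contents copied from the diagram above.
DIRECTORIES = {
    2: {".": 2, "..": 2, "home": 5, "usr": 9, "tmp": 11, "etc": 23},
    5: {".": 5, "..": 2, "alex": 31, "leslie": 36, "pat": 39, "abcd0001": 21},
    31: {".": 31, "..": 5, "foobar": 12, "temp": 15, "literature": 7, "demo": 6},
    7: {".": 7, "..": 31, "barfoo": 12, "morestuf": 123, "junk": 99},
}
ROOT = 2

def trace(pathname):
    """Return the inode numbers visited while resolving an absolute pathname."""
    visited = [ROOT]
    for component in pathname.split("/"):
        if not component:          # the empty name in front of the first slash
            continue
        entries = DIRECTORIES.get(visited[-1])
        if entries is None:        # not a directory known to this model
            raise NotADirectoryError(pathname)
        if component not in entries:
            raise FileNotFoundError(pathname)
        visited.append(entries[component])
    return visited

def arrows(pathname):
    """Format a trace in the #2->#5->... notation used in the text."""
    return "->".join(f"#{number}" for number in trace(pathname))

print(arrows("/home/alex/foobar"))
print(arrows("/home/alex/literature/barfoo"))
```

Both calls end at inode #12, which is what makes foobar and barfoo hard links to one file.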

7 Tracing Pathname 1: /home/alex/foobar

Let’s examine each of the above inodes.

The box below represents the layout of names and inode numbers inside the actual disk space given to the nameless ROOT directory, inode #2:

    +----+-----+-----------------------------------------+
#2  |. 2 |.. 2 | home 5 | usr 9 | tmp 11 | etc 23 | ...  |
    +----+-----+-----------------------------------------+

The above ROOT directory has the name home in it, paired with inode #5. The actual disk space of the directory home is not here; only the name home is here, alongside its inode number #5. To read the actual contents of the home directory, you have to find the disk space managed by inode #5 somewhere else on disk and look there.

The above ROOT directory pairing of home with inode #5 is what gives the home directory its name. The name home is separate from the disk space for home. The ROOT directory itself does not have a name because it has no parent directory to give it a name!

The ROOT directory is the only directory that is its own parent. If you look at the ROOT directory above, you will see that both the name . and the name .. in this ROOT directory are paired with inode #2, the inode number of the ROOT directory. Following either name . or .. will lead to inode #2 and right back to this same ROOT inode.
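This property can be checked on a live system. The short sketch below compares the inode reached through /, /. and /.. using Python's os.stat. The actual ROOT inode number depends on the file system and need not be 2.

```python
import os

root = os.stat("/")
dot = os.stat("/.")
dotdot = os.stat("/..")

# Compare device and inode together, since inode numbers are per file system.
root_is_own_parent = (root.st_dev, root.st_ino) == (dotdot.st_dev, dotdot.st_ino)
print(root.st_ino, dot.st_ino, dotdot.st_ino, root_is_own_parent)
```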

Let us move to the storage space for the home directory at inode #5.

The box below represents the layout of names and inode numbers inside the actual disk space given to the home directory, inode #5:

    +----+-----+---------------------------------------------------+
#5  |. 5 |.. 2 | alex 31 | leslie 36 | pat 39 | abcd0001 21 | ...  |
    +----+-----+---------------------------------------------------+

The name home for this inode isn’t in this inode; the name home is up in the ROOT directory. This home directory has the name alex in it, paired with inode #31. The directory alex is not here; only the name alex is here. To read the alex directory, you have to find inode #31 on disk and look there. (In fact, until you look up inode #31 and find out that it is a directory, you have no way of even knowing that the name alex is a name of a directory!)

Let us move to the storage space for the alex directory at inode #31.

The box below represents the layout of names and inode numbers inside the actual disk space given to the alex directory, inode #31:

    +----+-----+---------------------------------------------------+
#31 |. 31|.. 5 | foobar 12 | temp 15 | literature 7 | demo 6 | ... |
    +----+-----+---------------------------------------------------+

The name alex for this inode isn’t in this inode; the name alex is up in the home directory. This alex directory has the name foobar in it, paired with inode #12. The file foobar is not here; only the name foobar is here. To read the data from file foobar, you have to find inode #12 on disk and look there. (In fact, until you look up inode #12 and find out that it is a plain file, you have no way of even knowing that the name foobar is a name of a plain file!)

Let us move to the storage space for the foobar file at inode #12.

The box below represents the actual disk space given to the foobar file, inode #12:

    *-----------*
#12 | file data |
    *-----------*

The name foobar for this inode isn’t in this inode; the name foobar is up in the alex directory. This foobar inode is a file inode, not a directory inode, and the attributes of this inode will indicate that.

The inode for a file contains pointers to disk blocks that contain file data, not directory data. There are no special directory names . and .. in files. There are no names here at all; the disk block pointers in this inode point to just file data (whatever is in the file).

This completes the inode trace for /home/alex/foobar: #2->#5->#31->#12

8 Tracing Pathname 2: /home/alex/literature/barfoo

Let’s now trace the inode path for the name /home/alex/literature/barfoo. This pathname is a “hard link” to /home/alex/foobar; both the foobar and barfoo names point to the same inode number. Let’s see how:

The trace from ROOT through /home/alex is the same as before. Things change in our second trace because of /home/alex/literature. If we look at the alex directory inode #31 we see that the name literature is paired with inode #7:

    +----+-----+---------------------------------------------------+
#31 |. 31|.. 5 | foobar 12 | temp 15 | literature 7 | demo 6 | ... |
    +----+-----+---------------------------------------------------+

The alex directory inode #31 above says that, to follow the trail for the name literature, we must go to inode #7. (We won’t know whether the #7 inode for literature is a file or a directory until we get there!)

The box below represents the layout of names and inode numbers inside the actual disk space given to the literature directory, inode #7, which turns out to be a directory:

    +----+-----+---------------------------------------------+
#7  |. 7 |.. 31|    barfoo 12 | morestuf 123 | junk 99 | ... |
    +----+-----+---------------------------------------------+

The name literature for this inode isn’t in this inode; the name literature is up in the alex directory inode #31. This literature directory inode #7 has the name barfoo in it, paired with inode #12. The actual data for the thing that is barfoo is not here; only the name barfoo is here. You will recall that we have seen inode #12 in the previous trace.

Above, in the alex directory (inode #31), inode #12 was also paired with the name foobar. In the literature directory (inode #7), inode #12 is paired with the name barfoo. Inode #12 has two different names; names foobar and barfoo are both hard links to the same inode #12:

$ ls -i /home/alex/foobar /home/alex/literature/barfoo
12 /home/alex/foobar   12 /home/alex/literature/barfoo

Having two names means the “link count” of inode #12 is set to “two”. Both names lead to the same #12 inode and thus to the same data and same attributes. This is one single file with two names. A change to the file data using the name foobar changes the data in inode #12. That changes file data for the name barfoo too, because foobar and barfoo are two names for the same #12 inode storage – they are two names that point to the same storage inode.

Everything about data inode #12 except its name is kept with the inode. The only thing different in a long listing of foobar and barfoo will be the names; everything else (file type, permissions, owner, group, link count, size, modification times, etc.) is part of inode #12 and must therefore be identical for the two names. Neither name is more “original” than the other; both names have equal status. To release the #12 inode storage, you have to delete both names (so the link count drops to zero).
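The behaviour described above can be observed in a scratch directory. This sketch, assuming a file system with hard-link support, recreates the foobar and literature/barfoo arrangement with throwaway files, changes the data through one name, reads it through the other, and then removes one name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    literature = os.path.join(scratch, "literature")
    os.mkdir(literature)
    foobar = os.path.join(scratch, "foobar")
    barfoo = os.path.join(literature, "barfoo")

    with open(foobar, "w") as handle:
        handle.write("first draft\n")
    os.link(foobar, barfoo)                 # second name, same inode

    with open(foobar, "a") as handle:       # change the data via one name
        handle.write("second line\n")
    with open(barfoo) as handle:            # read it back via the other name
        seen_via_barfoo = handle.read()

    links_before = os.stat(barfoo).st_nlink
    os.unlink(foobar)                       # remove one name only
    links_after = os.stat(barfoo).st_nlink
    with open(barfoo) as handle:            # the data survives under the other name
        still_there = handle.read()
    print(links_before, links_after, repr(still_there))
```

The storage is released only when the remaining name is removed as well and the link count reaches zero.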

9 Path Traversal

Let’s use the above inode data to follow a valid path such as:

/home/alex/literature/barfoo

Start on the left and walk the tree to the right. To be a valid Unix path, everything to the left of the rightmost slash must be a directory. (Thus, ROOT, home, alex, and literature must be directories, if this is a valid pathname.)

Start with the nameless ROOT directory in front of the first slash (ROOT doesn’t have a name, since it does not appear in any parent directory) and look for the first pathname component (home) inside that directory (inside inode #2).

Let’s trace the pathname:

Look in the ROOT directory (located in inode #2) for the name of the first pathname component: home. We find the name home inside the ROOT directory, paired with inode #5. Go back out to the disk to find inode #5 that is the actual home directory.

Note how the names are separate from the things they name. The actual directory inode #5 of the home directory is not the same as the inode #2 of the ROOT directory that contains the directory name home. The name is stored in a different place (#2) than the thing it names (#5).

In inode #5, the directory that has the name home, look for the name alex. We find alex paired with inode #31. Go back out to the disk to find inode #31 that is the actual alex directory. Again, the name alex is contained in directory inode #5 (home) and that name is stored separately from inode #31 that is the actual alex directory.

In inode #31, the directory that has the name alex, look for the name literature. We find literature paired with inode #7. Go back out to the disk to find inode #7 that is the actual literature directory. Again, the name literature is contained in directory inode #31 (alex) and that name is stored separately from the inode #7 that is the actual literature directory.

In inode #7, the directory that has the name literature, look for the name barfoo. We find it paired with inode #12. Go back out to the disk to find inode #12 that is the actual data of the file barfoo. Again, the name barfoo is contained in directory inode #7 (literature) and that name is stored separately from the inode #12 that is the actual data of the file. The name of a file is not part of the inode that makes up the actual file data.
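The same left-to-right walk can be performed one component at a time on a real file system. The sketch below is a Unix-only illustration, relying on the dir_fd parameter of Python's os.open to open each name relative to the directory found in the previous step. The helper name walk_inodes is invented, the tree is built in a temporary directory, and the inode numbers printed will not match the made-up ones in the diagram.

```python
import os
import tempfile

def walk_inodes(start_dir, components):
    """Open each component relative to the previous one; collect inode numbers."""
    fd = os.open(start_dir, os.O_RDONLY)
    inodes = [os.fstat(fd).st_ino]
    try:
        for name in components:
            # Look up this one name inside the directory opened in the last step.
            next_fd = os.open(name, os.O_RDONLY, dir_fd=fd)
            os.close(fd)
            fd = next_fd
            inodes.append(os.fstat(fd).st_ino)
    finally:
        os.close(fd)
    return inodes

with tempfile.TemporaryDirectory() as scratch:
    os.makedirs(os.path.join(scratch, "home", "alex", "literature"))
    target = os.path.join(scratch, "home", "alex", "literature", "barfoo")
    with open(target, "w") as handle:
        handle.write("file data\n")

    visited = walk_inodes(scratch, ["home", "alex", "literature", "barfoo"])
    expected_last = os.stat(target).st_ino
    print("->".join(f"#{number}" for number in visited))
```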

9.1 Permissions on data vs. permissions on directories

You now have found the disk node (inode) that is your file data: inode #12. The name of this file, barfoo, is stored up in inode #7 that is the literature directory. The name is separate from the data it names.

If file data inode #12 has appropriate permission attributes, you can read or write the data in the file. It is the permission attributes on the inode containing the file data that govern what you can do with the data. The permissions on the inode of the directory containing the name of the file (directory inode #7) don’t control what you can do with the data of the file.

If any of the inodes of the directories leading down to the file inode #12 don’t give you search permission, you won’t be able to reach the file’s data inode that way and won’t be able to access the file’s data using those directories. However, some other directories may lead you to the same inode #12, if the file has another name.

To access and read the data in a file path such as:

/home/alex/literature/barfoo

you need appropriate search permissions on the ROOT directory inode, the home directory inode, the alex directory inode, the literature directory inode, and finally read permissions on the barfoo file data inode #12.

It is the barfoo file data inode #12 permissions that determine whether or not you can read or change the data of the file. Reading or changing the data in the file requires permissions on the inode #12 that contains the data blocks of the file itself.

It is the literature directory inode permissions (inode #7) that determine what you can do with the name of the file, because the literature directory (inode #7) is where the name barfoo is kept. Changing, linking to, or removing the name of a file operates on the inode of the directory in which the file name appears; altering the name has nothing to do with reading or changing the inode that contains the data blocks of the file itself.

You can have no permissions on the inode that contains the data blocks of the file itself (it may even be owned by some other user) and still you may be able to rename or remove the name of the file from a directory on whose inode you do have permissions. The names of a file are stored in directory inodes that are separate from the inode managing the data blocks of the file.

Names are separate from the things that they name. The permissions of the names are also separate from the permissions of the data.

Changing a name only requires write/execute permissions on a directory. No permissions are needed on the inode of the thing being renamed. Changing the content of a file only requires write permissions on the data inode of the file itself, not on the directory that holds the name of the file.
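The separation of name permissions from data permissions can be demonstrated in a scratch directory that you own. In this sketch the file is given mode 000 (no permissions at all), yet renaming and removing its name still work, because those operations only modify the directory. The file names are hypothetical.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    old_name = os.path.join(scratch, "barfoo")
    new_name = os.path.join(scratch, "renamed")
    with open(old_name, "w") as handle:
        handle.write("locked data\n")
    os.chmod(old_name, 0)             # no permissions at all on the data inode

    inode_before = os.stat(old_name).st_ino
    os.rename(old_name, new_name)     # only the directory entry changes
    info = os.stat(new_name)
    same_inode = info.st_ino == inode_before
    mode_bits = stat.S_IMODE(info.st_mode)

    os.unlink(new_name)               # removing the name needs no file permissions either
    gone = not os.path.exists(new_name)
    print(same_inode, oct(mode_bits), gone)
```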

Author: 
| Ian! D. Allen, BA, MMath  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/
