Monday, 20 February 2006

A concise explanation of I-Nodes

At any given time, a Linux machine holds tens of thousands of files, counting both system and user files. File systems like ext2 and ext3 support file names of up to 255 characters and individual files far larger than 2 GB (the limit ranges from 16 GiB to 2 TiB, depending on the block size). Managing these files and keeping track of which blocks hold which file's data could be a nightmare for any OS. To overcome this logistical nightmare, Linux uses what are called i-nodes to organise block allocations to files.

Each file in Linux, irrespective of its type, is identified by the i-node number associated with it. Within a single file system, no two different files can have the same i-node number.

So what are i-nodes?
I-nodes are data structures that hold the information about a file which is unique to that file and which helps the operating system differentiate it from other files.

For example, I have a file named test.txt in my home directory. To find the i-node number of the file, I run the following command:

$ ls -il test.txt
148619 -rw-r--r-- 1 ravi ravi 125 2006-02-14 08:39 test.txt
As you can see, the inode number of the file is the first number in the output, 148619. Within that file system, no other file will have this number, and the operating system identifies the file test.txt not by its name but by its inode number.
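
The same number can also be read without picking it out of a long listing. The following is a minimal sketch, assuming GNU coreutils (the -c option of stat is GNU-specific); it uses a throwaway file with a hypothetical name in a temporary directory rather than the test.txt above:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/sample.txt"   # sample.txt is a hypothetical name

# Inode number as reported by ls -i (first column of the output).
ino_ls=$(ls -i "$workdir/sample.txt" | awk '{print $1}')

# Inode number as reported by GNU stat.
ino_stat=$(stat -c '%i' "$workdir/sample.txt")

echo "ls -i reports: $ino_ls"
echo "stat reports : $ino_stat"

rm -r "$workdir"
```

Both commands read the same field, so the two numbers should always agree.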

Suppose I create a hard link of the file test.txt as follows:

$ ln test.txt test_hardlink.txt
Will the two files test.txt and test_hardlink.txt have the same i-node number? Let's find out.

$ ls -i test.txt test_hardlink.txt
148619 test_hardlink.txt
148619 test.txt
As you can see, both names have the same i-node number, and as far as the OS is concerned they are one and the same file. If I change the contents through one name, the change shows up through the other too. And the interesting thing is that if I move test_hardlink.txt to another location on the same file system, it still points to the same file.
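
These claims are easy to check in a scratch directory. The sketch below assumes GNU stat and uses a temporary directory, with a hypothetical subdirectory called elsewhere standing in for "another location":

```shell
workdir=$(mktemp -d)
mkdir "$workdir/elsewhere"

printf 'first line\n' > "$workdir/test.txt"
ln "$workdir/test.txt" "$workdir/test_hardlink.txt"

# Append through one name, then count lines through the other.
printf 'second line\n' >> "$workdir/test_hardlink.txt"
lines_seen=$(wc -l < "$workdir/test.txt")

# Move the hard link within the same file system and compare inode numbers.
mv "$workdir/test_hardlink.txt" "$workdir/elsewhere/"
ino_orig=$(stat -c '%i' "$workdir/test.txt")
ino_moved=$(stat -c '%i' "$workdir/elsewhere/test_hardlink.txt")

# %h is the hard link count stored in the inode.
links=$(stat -c '%h' "$workdir/test.txt")

echo "lines visible through test.txt: $lines_seen"
echo "inode of test.txt: $ino_orig, inode of moved link: $ino_moved"
echo "link count: $links"

rm -r "$workdir"
```

Note that the append uses >>, which writes to the existing file in place. Some editors save by writing a new file and renaming it over the old one, which would leave the other name pointing at the old contents.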

This raises a security issue. Suppose you have a file containing data that you do not want another person to keep reading without your consent. A person who has access to the file can create a hard link to it in another location on the same file system and will keep access to it under that name, including any changes you make later. The link grants nothing beyond what the file's permissions already allow, but it does outlive your own name for the file: even if you delete the file from your directory, the data is not actually deleted, because another directory entry still points to the file's i-node. The file's blocks are freed only when its link count drops to zero.
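
The effect of deleting one name while another remains can be sketched as follows, again assuming GNU stat and hypothetical file names in a temporary directory:

```shell
workdir=$(mktemp -d)

printf 'secret\n' > "$workdir/private.txt"
ln "$workdir/private.txt" "$workdir/second_name.txt"
links_before=$(stat -c '%h' "$workdir/private.txt")

# "Delete" the file by removing its original name.
rm "$workdir/private.txt"

# The data is still reachable through the remaining name.
links_after=$(stat -c '%h' "$workdir/second_name.txt")
surviving=$(cat "$workdir/second_name.txt")

echo "link count before rm: $links_before"
echo "link count after rm : $links_after"
echo "contents still readable: $surviving"

rm -r "$workdir"
```

In other words, rm only removes a directory entry and decrements the link count.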

So system administrators sometimes search out every name that refers to an i-node and delete them all, to ensure that the file is indeed deleted. You can do this with a combination of the find and rm commands as follows:

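# find / -samefile test.txt -exec rm -- {} \;
Or if you know the inode number of the file, then

# find / -xdev -inum 148619 -exec rm -- {} \;
... which does the same job, provided the search is confined to the file system that holds the file. I-node numbers are unique only within one file system, so -xdev stops find from descending into other mounted file systems, where an unrelated file could carry the same number. If the file lives on a file system other than the root one, start the search at that file system's mount point instead of /.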
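
Since these commands delete files as root, it is worth previewing what find matches before adding -exec rm. The following is a minimal sketch under stated assumptions: GNU find (for -samefile) and GNU stat, run against a scratch directory with hypothetical file names instead of /:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/a" "$workdir/b"

printf 'data\n' > "$workdir/a/test.txt"
ln "$workdir/a/test.txt" "$workdir/b/other_name.txt"
printf 'unrelated\n' > "$workdir/b/keep.txt"

# Preview: list every name that refers to the same inode, deleting nothing.
matches=$(find "$workdir" -xdev -samefile "$workdir/a/test.txt" | sort)
echo "$matches"
match_count=$(printf '%s\n' "$matches" | wc -l)

# Delete by inode number, staying on one file system.
ino=$(stat -c '%i' "$workdir/a/test.txt")
find "$workdir" -xdev -inum "$ino" -exec rm -- {} \;

# Only the unrelated file should be left.
remaining=$(find "$workdir" -type f | wc -l)
echo "names matched: $match_count, regular files left: $remaining"

rm -r "$workdir"
```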

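Note: Every file on the system is allocated an i-node, so there can never be more files than i-nodes. On ext2 and ext3 the number of i-nodes is fixed when the file system is created, which means a disk can run out of i-nodes while free blocks remain.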
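
To see how many i-nodes a file system has and how many are still free, something like the following can be used. It is a sketch assuming GNU coreutils; the -f mode of stat and its %c and %d format codes are GNU-specific, and some file systems report zero for both counts:

```shell
# Human-readable inode totals for the file system holding the current directory.
df -i .

# The same counts in script-friendly form:
# %c = total inodes in the file system, %d = free inodes.
total_inodes=$(stat -f -c '%c' .)
free_inodes=$(stat -f -c '%d' .)

echo "total inodes: $total_inodes"
echo "free inodes : $free_inodes"
```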

Typically, an i-node will contain the following data about the file:
  • The user ID (UID) and group ID (GID) of the user who is the owner of the file.
  • The type of file - is it a regular file, a directory, or a special file such as a device or a symbolic link?
  • User access permissions - which users can do what with the file.
  • The number of hard links to the file - as explained above.
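  • The size of the file.
  • The times the file was last accessed and last modified.
  • The time the i-node was last changed - if the permissions or other information about the file change, this time is updated.
  • The addresses of 12 direct data blocks (in ext2 and ext3) - data blocks are where the contents of the file are placed.
  • One single, one double and one triple indirect pointer - these lead to blocks of further addresses and are used when a file needs more blocks than the direct pointers can cover.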
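
The pointer layout puts an upper bound on how large a file can be. The arithmetic below is a sketch that assumes the ext2/ext3 layout of 12 direct pointers plus one single, one double and one triple indirect pointer, 4-byte block addresses, and a block size of 1024 bytes (the block size is an assumption; 2048 and 4096 are also common):

```shell
block_size=1024                       # bytes per block (assumed)
ptrs_per_block=$((block_size / 4))    # 4-byte block addresses per indirect block

direct=12
single=$ptrs_per_block
double=$((ptrs_per_block * ptrs_per_block))
triple=$((ptrs_per_block * ptrs_per_block * ptrs_per_block))

total_blocks=$((direct + single + double + triple))
max_bytes=$((total_blocks * block_size))

echo "addressable blocks: $total_blocks"
echo "upper bound on file size: $max_bytes bytes"
```

With 1024-byte blocks this comes to a little over 16 GiB. For larger block sizes the pointer arithmetic gives a bigger figure than the file system actually allows, because other fields in the i-node impose their own limits.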
A point to note here is that the name of the file is not stored in the i-node. File names are stored in directories, which are themselves files: each directory entry maps a name to an i-node number.
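
One way to convince yourself of this is to rename a file and check that its i-node number does not change. The following is a minimal sketch assuming GNU stat and hypothetical file names in a temporary directory:

```shell
workdir=$(mktemp -d)

printf 'x\n' > "$workdir/old_name.txt"
ino_before=$(stat -c '%i' "$workdir/old_name.txt")

# Renaming only rewrites the directory entry; the inode is untouched.
mv "$workdir/old_name.txt" "$workdir/new_name.txt"
ino_after=$(stat -c '%i' "$workdir/new_name.txt")

# The directory listing shows the name-to-inode mapping.
ls -i "$workdir"
echo "inode before rename: $ino_before, after rename: $ino_after"

rm -r "$workdir"
```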
