[Leaplist] directory "size", does it matter?

John Simpson jms1 at jms1.net
Sat Jan 24 17:17:47 EST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2009-01-21, at 0942, Ingo Claro wrote:
>
> I've got a basic question:
> I know that a directory is just another file and it stores  
> information regarding its contents, so how does this directory size  
> affect performance?

when you traverse a directory, the kernel has to read through its  
entries in order to find the child node you're looking for.

for example, to access "/usr/bin/true", it does the following:

open root inode
read entry
read entry
read entry ... until it finds "usr"
open "/usr" inode
read entry
read entry
read entry ... until it finds "bin"
open "/usr/bin" inode
read entry
read entry
read entry ... until it finds "true"
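
the steps above can be sketched in python. this is only a user-space illustration of the idea, not what the kernel actually runs: the function name "lookup" and its "root" parameter are made up for the example, and os.scandir() hides the "." and ".." entries.

```python
import os

def lookup(relative_path, root=os.sep):
    """Resolve relative_path under root one component at a time.

    Each directory is scanned entry by entry until the wanted name
    turns up. Returns the resolved path and how many entries were
    examined along the way.
    """
    current = root
    examined = 0
    for name in [part for part in relative_path.split(os.sep) if part]:
        found = False
        with os.scandir(current) as entries:
            for entry in entries:          # "read entry", one at a time
                examined += 1
                if entry.name == name:     # stop as soon as it is found
                    found = True
                    break
        if not found:
            raise FileNotFoundError(os.path.join(current, name))
        current = os.path.join(current, name)
    return current, examined
```

calling lookup("usr/bin/true") walks the same three directories as the list above; the count it returns depends on where each name happens to sit in its directory.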
	
the entries within a directory are NOT sorted on the disk, so the  
only way to find a particular entry within a directory is to read the  
entries sequentially and stop when you find the one you're looking  
for. this usually isn't much of an issue: most directories have fewer  
than a few hundred entries, so a sequential search doesn't take very  
long. however, if you have thousands of entries, and the file you're  
looking for within the directory is late in the list, then it can  
take a while.


> for example if I have a directory with many file movements (create/ 
> delete) over time I've got this:
> drwxr-x---    2 miclaro   miclaro     9678848 Jan 20 20:53 tmp

this tells me that at one time, this directory had a LOT of entries in  
it- at least 36,801 and probably a lot more than that.

i can already hear people asking where that number came from... i'm  
assuming that this is an ext3 filesystem, which means that each  
directory entry is eight bytes plus the name (so within the "/usr"  
directory, the entry for "bin" takes eleven bytes.) since the longest  
possible filename is 255 bytes, the longest possible directory entry  
is 263 bytes, and 9678848/263 is 36801 and change. however, most  
filenames aren't anywhere near that long, so it may be that this  
directory had hundreds of thousands of files in it at one point.
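
the arithmetic can be written out as a short python sketch. it uses the same simplified model as above (eight bytes plus the name for every entry); the 32- and 12-byte name lengths are just illustrative guesses at "more typical" filenames, not measurements.

```python
HEADER_BYTES = 8        # fixed per-entry overhead assumed in this model
MAX_NAME_BYTES = 255    # longest possible filename

def min_entries(dir_size, name_len=MAX_NAME_BYTES):
    """How many entries of the given name length fit in dir_size bytes."""
    return dir_size // (HEADER_BYTES + name_len)

dir_size = 9678848      # size shown by ls for the "tmp" directory

# Worst case first (255-byte names), then two hypothetical shorter lengths.
for name_len in (MAX_NAME_BYTES, 32, 12):
    print("name length %3d: about %d entries"
          % (name_len, min_entries(dir_size, name_len)))
```

the shorter the names, the more entries the same number of bytes can hold, which is why 36,801 is only a lower bound.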

in fact, i just wrote a test program to see what the kernel actually  
does. it creates empty files with four-digit names, keeps track of  
what i think the byte-count should be, and actually queries the size  
of the directory after each file is created, in order to show any  
discrepancies.
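
for anybody who wants to repeat the experiment, here is a rough python equivalent. it is a sketch and not the original program: the function names are made up, the estimate assumes 19 bytes for "." and ".." plus eight bytes and the name for each file, and the sizes it reports will depend on the filesystem holding the temporary directory.

```python
import os
import tempfile

EMPTY_DIR_BYTES = 19    # "." (8+1) plus ".." (8+2) in this model
HEADER_BYTES = 8        # assumed fixed overhead per entry

def estimate(file_count, name_len=4):
    """Estimated bytes used by a directory holding file_count entries."""
    return EMPTY_DIR_BYTES + file_count * (HEADER_BYTES + name_len)

def run(file_count):
    """Create empty files with four-digit names in a scratch directory.

    Returns one (files, estimated_size, stat_size) tuple per file created.
    """
    rows = []
    with tempfile.TemporaryDirectory() as scratch:
        for i in range(1, file_count + 1):
            open(os.path.join(scratch, "%04d" % i), "w").close()
            rows.append((i, estimate(i), os.stat(scratch).st_size))
    return rows

if __name__ == "__main__":
    for files, e, s in run(400):
        print("%6d files, e=%d s=%d" % (files, e, s))
```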

what i found is that the kernel starts new directories at 4096 bytes,  
and allocates space in units of 4096 bytes (which i expected, since  
the filesystem has a 4K block size.) i also found that the first "new"  
addition to the directory occurred exactly when i thought it would-  
right when my estimated usage number (filename length plus eight  
bytes) went from "below 4096" to "above 4096".

what surprised me, however, is that the new size wasn't 8192, it was  
12288. instead of adding one block to the end of the directory, the  
kernel actually added two.

in this output, the file count is how many files my program created,  
"e=" is my estimated size, and "s=" is the size returned by stat() on  
the directory (actually perl's "-s" operator, but it's the same  
thing.) also remember that the "." and ".." entries in each directory  
ARE real entries, so a brand new "empty" directory actually uses 19  
bytes (9 for "." plus 10 for "..").

     1 files, e=31 s=4096
     2 files, e=43 s=4096
     3 files, e=55 s=4096
     4 files, e=67 s=4096
....
   338 files, e=4075 s=4096
   339 files, e=4087 s=4096
   340 files, e=4099 s=12288
   341 files, e=4111 s=12288

even more surprising is that it added more blocks by itself when the  
directory wasn't really full yet.

   658 files, e=7915 s=12288
   659 files, e=7927 s=12288
   660 files, e=7939 s=16384
   661 files, e=7951 s=16384
....
   707 files, e=8503 s=16384
   708 files, e=8515 s=16384
   709 files, e=8527 s=20480
   710 files, e=8539 s=20480

so it looks like if the kernel sees a directory growing, it tries to  
avoid having to suddenly find more blocks to hold the directory by pre- 
allocating more than it needs.
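
to put numbers on "wasn't really full yet", this python sketch compares my estimate with the space that was already allocated at each point where the size changed. the three data points are copied from the output above; the estimate is the same simplified model (19 bytes plus 12 per four-digit filename), so the percentages are only as good as that model.

```python
EMPTY_DIR_BYTES = 19    # "." and ".." in the simplified model
ENTRY_BYTES = 8 + 4     # eight bytes of overhead plus a four-digit name

def estimate(file_count):
    return EMPTY_DIR_BYTES + file_count * ENTRY_BYTES

# (file count where the size changed, size before, size after)
growth_steps = [
    (340, 4096, 12288),
    (660, 12288, 16384),
    (709, 16384, 20480),
]

def fill_at_growth(steps):
    """For each step: (files, estimate, size before, estimate / size before)."""
    return [(n, estimate(n), before, estimate(n) / before)
            for n, before, _after in steps]

for n, e, before, ratio in fill_at_growth(growth_steps):
    print("%4d files: e=%d of %d bytes allocated (%.0f%%)"
          % (n, e, before, 100 * ratio))
```

only the first step happens when the estimate actually crosses the allocated size; the other two happen with plenty of estimated room left.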

what does this have to do with the original question? probably not  
much, but it may be interesting to some people. (boy, when i go off on  
a tangent, i really go off on a tangent...)


> and there are no files inside.
> So it would be better to re-create this directory to reset the size?  
> or it doesn't matter?

i would say yes, for two reasons.

the first is just to reclaim the disk space. this directory is 9678848  
bytes. create a new empty directory and look at what its size is-  
probably 4096. by deleting and re-creating this directory you will be  
saving a little over 9MB of disk space which is otherwise being wasted.

the second reason is speed. when files are deleted from a directory,  
their directory entries are marked as "not used", however the  
directory itself never shrinks.

i also suspect any existing entries after the deleted entry are not  
"shifted up" toward the beginning of the directory... which means if a  
directory contains 50,000 files, and you delete the first 49,000 of  
them, every time you access one of the remaining files it still has to  
read and skip those first 49,000 empty entries in order to get to the  
real files.
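
here is a toy model of that suspicion in python. it is NOT how ext3 is known to behave, just an illustration of what "mark as unused, never shift up" would mean for lookups; the class and its methods are invented for the example.

```python
class ToyDirectory:
    """A flat list of slots. Deleting marks a slot unused (None) and
    never compacts the list, which is the behaviour being suspected."""

    def __init__(self, names):
        self.slots = list(names)
        # Position of each name, used only so deletes in this demo are fast.
        self.position = {name: i for i, name in enumerate(self.slots)}

    def delete(self, name):
        self.slots[self.position.pop(name)] = None

    def lookup(self, name):
        """Scan from the start; return how many slots were examined."""
        for examined, slot in enumerate(self.slots, start=1):
            if slot == name:
                return examined
        raise FileNotFoundError(name)

d = ToyDirectory("f%05d" % i for i in range(50000))
for i in range(49000):
    d.delete("f%05d" % i)

print(d.lookup("f49000"))   # 49001: the 49,000 unused slots are still scanned
```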

again, this is my suspicion; there is no easy way to prove or  
disprove it, other than examining the disk sectors by hand (ugly) or  
finding the relevant parts of the kernel source (not ugly, but  
time-consuming). the kernel doesn't provide any way to read  
the raw contents of a directory file; the closest thing available is  
to use the kernel's SYS_readdir call and look at the d_off values,  
which may or may not be useful, depending on what kernel version  
you're running. (the values appear to be meaningless, although  
ascending, on my centos 5 machine at home, kernel 2.6.18-92.1.18.el5.)

but either way, deleting and re-creating the directory is probably a  
good idea, and unless there's a process with an open handle to the  
directory or its contents while you do it, there shouldn't be any harm  
in it.
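
if you'd rather script it, something like this python sketch would do. the function name is made up, it refuses to touch a directory that isn't empty, and there is a short window between the rmdir and the mkdir where the directory doesn't exist, so it assumes nothing else is using the directory at the time.

```python
import os
import stat

def recreate_empty_dir(path):
    """Remove and re-create an empty directory, keeping its permissions.

    Returns the size of the new directory as reported by stat().
    """
    old = os.stat(path)
    if os.listdir(path):
        raise OSError("directory is not empty: %s" % path)
    os.rmdir(path)
    os.mkdir(path)
    os.chmod(path, stat.S_IMODE(old.st_mode))    # restore permission bits
    try:
        os.chown(path, old.st_uid, old.st_gid)   # restore owner and group
    except PermissionError:
        pass    # not privileged enough; keep the ownership mkdir gave us
    return os.stat(path).st_size
```

comparing the returned size with the old one shows how much space was reclaimed.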

- ----------------------------------------------------------------
| John M. Simpson    ---   KG4ZOW   ---    Programmer At Large |
| http://www.jms1.net/                         <jms1 at jms1.net> |
- ----------------------------------------------------------------
| http://video.google.com/videoplay?docid=-1656880303867390173 |
- ----------------------------------------------------------------





-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkl7k4sACgkQj42MmpAUrRrWcgCfceTzCWVURmXuZxABAKqM9gG8
Ww8An0BFOAezSEMTqAns4/Nbe9v/cXHg
=RpUr
-----END PGP SIGNATURE-----
