A program to gather statistics on a filesystem
I wanted to know how file sizes are distributed in my home directory, so I wrote a quick OCaml program to analyze it. The program gives the following result on my home directory (it counts links as files):
$ ./filesystem-analysis.ml ~
Total file size: 6120592065 bytes
Minimum non empty file size: 1.00 B
Maximum file size: 163.77 MB
Average file size: 28.81 KB
Total number of files: 207476
of which are empty files: 544
Total number of directories: 22932
File size distribution:
[ 1.00 B - 1.00 B ] 592 files - total average of 592.00 B
[ 2.00 B - 3.00 B ] 574 files - total average of 1.68 KB
[ 4.00 B - 7.00 B ] 772 files - total average of 4.52 KB
[ 8.00 B - 15.00 B ] 2205 files - total average of 25.84 KB
[ 16.00 B - 31.00 B ] 4077 files - total average of 95.55 KB
[ 32.00 B - 63.00 B ] 15371 files - total average of 720.52 KB
[ 64.00 B - 127.00 B ] 7598 files - total average of 712.31 KB
[ 128.00 B - 255.00 B ] 6258 files - total average of 1.15 MB
[ 256.00 B - 511.00 B ] 10922 files - total average of 4.00 MB
[ 512.00 B - 1023.00 B ] 21493 files - total average of 15.74 MB
[ 1.00 KB - 2.00 KB] 24063 files - total average of 35.25 MB
[ 2.00 KB - 4.00 KB] 35981 files - total average of 105.41 MB
[ 4.00 KB - 8.00 KB] 27669 files - total average of 162.12 MB
[ 8.00 KB - 16.00 KB] 16737 files - total average of 196.14 MB
[ 16.00 KB - 32.00 KB] 16654 files - total average of 390.33 MB
[ 32.00 KB - 64.00 KB] 7057 files - total average of 330.80 MB
[ 64.00 KB - 128.00 KB] 4190 files - total average of 392.81 MB
[ 128.00 KB - 256.00 KB] 2905 files - total average of 544.69 MB
[ 256.00 KB - 512.00 KB] 1036 files - total average of 388.50 MB
[ 512.00 KB - 1024.00 KB] 738 files - total average of 553.50 MB
[ 1.00 MB - 2.00 MB] 301 files - total average of 451.50 MB
[ 2.00 MB - 4.00 MB] 115 files - total average of 345.00 MB
[ 4.00 MB - 8.00 MB] 104 files - total average of 624.00 MB
[ 8.00 MB - 16.00 MB] 38 files - total average of 456.00 MB
[ 16.00 MB - 32.00 MB] 14 files - total average of 336.00 MB
[ 32.00 MB - 64.00 MB] 8 files - total average of 384.00 MB
[ 64.00 MB - 128.00 MB] 1 files - total average of 96.00 MB
[ 128.00 MB - 256.00 MB] 3 files - total average of 576.00 MB
The results are about what I expected. My home directory holds roughly
6 GB in about 200,000 files, with an average size of about 30 KB, and
23,000 directories (about 10% of the number of files). The vast majority of
files are smaller than 32 KB, but a small number of big files take up most
of my disk space.
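To put numbers on these shares, the per-bucket counts from the listing can be summed directly. This is a small sketch written for this note, not part of the original program. The name share_below is made up, and the 544 empty files are left out because they do not appear in the distribution.

```ocaml
(* File counts per power-of-two bucket, copied from the listing above:
   element k is the number of files of 2^k to 2^(k+1) - 1 bytes. *)
let counts =
  [ 592; 574; 772; 2205; 4077; 15371; 7598; 6258; 10922; 21493;
    24063; 35981; 27669; 16737; 16654; 7057; 4190; 2905; 1036; 738;
    301; 115; 104; 38; 14; 8; 1; 3 ]

(* Share of the listed files whose size is below 2^k bytes. *)
let share_below k =
  let total = List.fold_left (+) 0 counts in
  let below =
    List.fold_left (+) 0
      (List.mapi (fun i c -> if i < k then c else 0) counts)
  in
  float_of_int below /. float_of_int total

let () =
  (* 2^6 = 64 B and 2^15 = 32 KB *)
  Printf.printf "below 64 B:  %.1f%%\n" (100.0 *. share_below 6);
  Printf.printf "below 32 KB: %.1f%%\n" (100.0 *. share_below 15)
```

With the counts above this comes out at roughly 11% of files below 64 bytes and roughly 92% below 32 KB.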
A more interesting point is that a non-negligible share of files (about 10%)
is smaller than 64 bytes. For the distributed backup system, storing these
files directly in the master index would save both time and space.
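One way to picture this idea is an index entry that either carries the file contents itself or points to data stored elsewhere. The sketch below is purely illustrative: the entry type, the 64-byte inline_threshold and the store callback are all assumptions made for this example, not the backup system's actual design.

```ocaml
(* Hypothetical cut-off: files below this many bytes are kept inline. *)
let inline_threshold = 64

(* Hypothetical index entry: either the bytes themselves, or a
   reference (for example a content hash) to data stored elsewhere. *)
type entry =
  | Inline of string
  | Stored of string

(* Build the index entry for a file from its contents. [store] is an
   assumed callback that saves large contents elsewhere and returns a
   reference to them. *)
let make_entry ~store contents =
  if String.length contents < inline_threshold then Inline contents
  else Stored (store contents)
```

For a quick experiment, store could be something like fun s -> Digest.to_hex (Digest.string s), which returns an MD5 hex digest as a stand-in reference. A tiny file then costs one index entry and no separate lookup or storage block.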
2006-03-21T20:07:50Z
