yet another blog

Wed, 29 Mar 2006

How arguments are passed to a shebang script

While implementing ocamlscript, I realized that argument passing for shebang scripts (Unix scripts starting with #!) is poorly documented. So here is a short description of this rather contrived argument passing convention.

Suppose you have the OCaml program:

(* argv-test.ml: print every entry of argv, one per line *)
let _ =
  for i = 0 to Array.length Sys.argv - 1 do
    Format.printf "argv(%d): \"%s\"@\n" i Sys.argv.(i)
  done

Compiled with: ocamlopt -o argv-test argv-test.ml.

And suppose you have the following script-with-args shebang script:

#!./argv-test intarg1 intarg2 intarg3

and the following script-without-args shebang script:

#!./argv-test

As a reference, if argv-test is called directly, you have:

$ ./argv-test arg1 arg2 arg3
argv(0): "./argv-test"
argv(1): "arg1"
argv(2): "arg2"
argv(3): "arg3"

Now, let's compare the different ways of calling the shebang scripts:

$ ./script-without-args 
argv(0): "./argv-test"
argv(1): "./script-without-args"

$ ./script-without-args arg1 arg2 arg3
argv(0): "./argv-test"
argv(1): "./script-without-args"
argv(2): "arg1"
argv(3): "arg2"
argv(4): "arg3"

$ ./script-with-args 
argv(0): "./argv-test"
argv(1): "intarg1 intarg2 intarg3"
argv(2): "./script-with-args"

$ ./script-with-args arg1 arg2 arg3
argv(0): "./argv-test"
argv(1): "intarg1 intarg2 intarg3"
argv(2): "./script-with-args"
argv(3): "arg1"
argv(4): "arg2"
argv(5): "arg3"

See the pattern? :) If arguments are given on the shebang line, they are all passed as a single string in argv(1), and the name of the shebang script follows in argv(2). Otherwise, argv(1) is the name of the shebang script itself. Any arguments given on the command line to the shebang script follow as normal arguments, after the binary name, the optional internal arguments and the script name.

To determine whether the shebang script has internal arguments: if argv has only two entries, argv(1) can only be the script name. If argv has at least three entries, test whether argv(1) is the name of a real file (the script has no internal arguments) or not (the script has internal arguments).
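
The test described above can be sketched in OCaml as follows. This is only a minimal sketch, not the code used in ocamlscript: the shebang_args record and the parse_shebang_argv function are names made up for illustration, and Sys.file_exists (which also answers true for directories) stands in for "is the name of a real file". The optional is_file parameter is there only so the logic can be tried without touching the filesystem.

```ocaml
(* Hypothetical helper, for illustration only. *)
type shebang_args = {
  internal : string option; (* whole shebang-line argument string, if any *)
  script : string;          (* name of the shebang script *)
  user : string list;       (* arguments given on the command line *)
}

(* Split argv, as seen by the binary named on the shebang line.
   Returns None when argv holds no script name at all. *)
let parse_shebang_argv ?(is_file = Sys.file_exists) argv =
  match Array.to_list argv with
  | [] | [ _ ] -> None
  | [ _; script ] ->
    (* Two entries: argv(1) can only be the script name. *)
    Some { internal = None; script; user = [] }
  | _ :: first :: (second :: rest as tail) ->
    if is_file first then
      (* argv(1) is a real file: no internal arguments. *)
      Some { internal = None; script = first; user = tail }
    else
      (* argv(1) is the internal argument string, argv(2) the script. *)
      Some { internal = Some first; script = second; user = rest }
```

In a real program this would be called as parse_shebang_argv Sys.argv. Note the limit of the heuristic: an internal argument string that happens to be the name of an existing file is mistaken for the script name.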

As a side note, the env utility can be used on the shebang line. The arguments then arrive exactly as for a script without internal arguments:

$ cat ./env-script
#!/usr/bin/env ./argv-test 

$ ./env-script 
argv(0): "./argv-test"
argv(1): "./env-script"

$ ./env-script arg1 arg2 arg3
argv(0): "./argv-test"
argv(1): "./env-script"
argv(2): "arg1"
argv(3): "arg2"
argv(4): "arg3"

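However, env does not support arguments on the shebang line. It receives "./argv-test intarg1 intarg2 intarg3" as a single argument and looks for a program with that exact name: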

$ cat ./env-script-with-args 
#!/usr/bin/env ./argv-test intarg1 intarg2 intarg3

$ ./env-script-with-args 
/usr/bin/env: ./argv-test intarg1 intarg2 intarg3: No such file or directory
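
The error above follows from the single-string convention: whoever receives the shebang-line arguments has to split them itself. A binary that wants several internal arguments could do so along these lines. This is a minimal sketch under stated assumptions: split_internal_args is a made-up name, words are separated by spaces or tabs only, no quoting or escaping is handled, and String.split_on_char requires OCaml 4.04 or later.

```ocaml
(* Hypothetical helper: split the single shebang-line argument string
   into words on spaces and tabs, dropping empty words.
   Quoting and escaping are deliberately not handled. *)
let split_internal_args s =
  s
  |> String.map (fun c -> if c = '\t' then ' ' else c)
  |> String.split_on_char ' '
  |> List.filter (fun w -> w <> "")
```

Applied to the argv(1) of script-with-args, this gives the three words intarg1, intarg2 and intarg3.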

2006-03-29T19:38:24Z

Tue, 28 Mar 2006

Opening of the diseba project on Gna!

The diseba project is now open on Gna! You can browse the source code, check it out with Subversion (aka svn) and subscribe to the mailing lists. Please join us!

2006-03-28T19:37:42Z

Tue, 21 Mar 2006

A program to make some statistics on a filesystem

I wanted to know how file sizes are distributed in my home directory, so I wrote a quick OCaml program to do this analysis. The program gives the following result on my home directory (it counts links as files):

$ ./filesystem-analysis.ml ~
Total file size:  6120592065 bytes
Minimum non empty file size:    1.00 B 
Maximum file size:  163.77 MB
Average file size:   28.81 KB
Total number of files: 207476
 of which are empty files: 544
Total number of directories: 22932
File size distribution:
[   1.00 B  -    1.00 B ]     592 files - total average of  592.00 B 
[   2.00 B  -    3.00 B ]     574 files - total average of    1.68 KB
[   4.00 B  -    7.00 B ]     772 files - total average of    4.52 KB
[   8.00 B  -   15.00 B ]    2205 files - total average of   25.84 KB
[  16.00 B  -   31.00 B ]    4077 files - total average of   95.55 KB
[  32.00 B  -   63.00 B ]   15371 files - total average of  720.52 KB
[  64.00 B  -  127.00 B ]    7598 files - total average of  712.31 KB
[ 128.00 B  -  255.00 B ]    6258 files - total average of    1.15 MB
[ 256.00 B  -  511.00 B ]   10922 files - total average of    4.00 MB
[ 512.00 B  - 1023.00 B ]   21493 files - total average of   15.74 MB
[   1.00 KB -    2.00 KB]   24063 files - total average of   35.25 MB
[   2.00 KB -    4.00 KB]   35981 files - total average of  105.41 MB
[   4.00 KB -    8.00 KB]   27669 files - total average of  162.12 MB
[   8.00 KB -   16.00 KB]   16737 files - total average of  196.14 MB
[  16.00 KB -   32.00 KB]   16654 files - total average of  390.33 MB
[  32.00 KB -   64.00 KB]    7057 files - total average of  330.80 MB
[  64.00 KB -  128.00 KB]    4190 files - total average of  392.81 MB
[ 128.00 KB -  256.00 KB]    2905 files - total average of  544.69 MB
[ 256.00 KB -  512.00 KB]    1036 files - total average of  388.50 MB
[ 512.00 KB - 1024.00 KB]     738 files - total average of  553.50 MB
[   1.00 MB -    2.00 MB]     301 files - total average of  451.50 MB
[   2.00 MB -    4.00 MB]     115 files - total average of  345.00 MB
[   4.00 MB -    8.00 MB]     104 files - total average of  624.00 MB
[   8.00 MB -   16.00 MB]      38 files - total average of  456.00 MB
[  16.00 MB -   32.00 MB]      14 files - total average of  336.00 MB
[  32.00 MB -   64.00 MB]       8 files - total average of  384.00 MB
[  64.00 MB -  128.00 MB]       1 files - total average of   96.00 MB
[ 128.00 MB -  256.00 MB]       3 files - total average of  576.00 MB

The results are roughly what I expected. The 6GB of files in my home directory are spread over about 200,000 files with an average size of 30KB, in 23,000 directories (about 10% of the number of files). The vast majority of files (over 90%) are smaller than 32KB, but a small number of big files take up most of my disk space.

A more interesting point is that a non-negligible share of the files (a bit more than 10%) is smaller than 64 bytes. In the case of the distributed backup system, it would save time and space to store them directly in the master index.
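
The power-of-two size classes used in the listing above can be computed as shown below. This is a minimal sketch and not the original program: bucket_index and histogram are made-up names, sizes are plain ints in bytes, and empty files are skipped because the listing counts them separately.

```ocaml
(* Index k of the size class holding [size] (for size >= 1):
   class k covers sizes from 2^k to 2^(k+1) - 1 bytes. *)
let bucket_index size =
  let rec loop k n = if n <= 1 then k else loop (k + 1) (n lsr 1) in
  loop 0 size

(* Count non-empty files per size class.
   Returns (class index, file count) pairs sorted by class index. *)
let histogram sizes =
  let table = Hashtbl.create 32 in
  List.iter
    (fun size ->
       if size > 0 then begin
         let k = bucket_index size in
         let n = try Hashtbl.find table k with Not_found -> 0 in
         Hashtbl.replace table k (n + 1)
       end)
    sizes;
  List.sort compare (Hashtbl.fold (fun k n acc -> (k, n) :: acc) table [])
```

With such a histogram, the files smaller than 64 bytes are the empty files plus those in classes 0 to 5.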

2006-03-21T20:07:50Z
