Writing short utilities in other languages



C / C++

You can of course write short utilities and CGI scripts in a compiled high-level language (HLL) like C or C++. However, many kinds of short program make much more sense as an interpreted batch file than as a compiled HLL program.




Compiling a C++ program in UNIX

 edit prog.cxx
 CC prog.cxx
This creates a binary executable file called a.out
You might do mv a.out prog or, alternatively, use the CC command-line switch that names the output file (conventionally -o, as in CC -o prog prog.cxx). See: CC -help

You may or may not need to:   chmod +x prog  
Then just run:

 prog (args)
For a more sophisticated development environment, see Sun WorkShop:
 workshop &




Accessing the command-line from C++

In a HLL, calling other programs (and access to the command-line in general) is more awkward than in a command-line-oriented batch file. In C/C++ you use the system() call:

  #include <stdlib.h>

  int main()
  {
   // hand a command-line to the Shell from inside the program
   system ( "rm file7.txt" );
  }
which is obviously more complex than the Shell:
  rm file7.txt
You also have to compile the program, and keep track of 2 files - the source and the binary. In Shell, there is only 1 file. The above is also all right as long as the filenames are static, but consider the case where the filename varies. In Shell:
  for i in 96 97 98 99 00
  do
   rm $i.log $i.txt
  done
In C++ this is much more complex:
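  #include <stdio.h>
  #include <stdlib.h>

  int main()
  {
   char buf [ 30 ];

   // 96 to 99 can simply be counted
   for ( int i=96; i<=99; i++ )
   {
    // snprintf never writes past the end of buf
    snprintf ( buf, sizeof buf, "rm %d.log %d.txt", i, i );
    system ( buf );
   }

   // "00" is not the next integer after 99, so it is handled separately
   system ( "rm 00.log 00.txt" );
  }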
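The separate system() call for 00 in the program above can be folded into the loop if the year is formatted with a leading zero. The following is only a sketch of that idea: the helper name rm_command is my own invention, and it assumes the year is always between 0 and 99.

```cpp
#include <cstdio>
#include <string>

// Build the "rm" command for a 2-digit year, assuming 0 <= yy <= 99.
// "%02d" pads with a leading zero, so 0 is formatted as "00".
std::string rm_command(int yy)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "rm %02d.log %02d.txt", yy, yy);
    return buf;
}
```

You would then loop over a small table of years (96, 97, 98, 99, 0) and pass rm_command(year).c_str() to system(). It is still a good deal more typing than the 4-line Shell loop.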

And if you want to get directory listings you really want to use Shell. How would you code the following Shell script in C++?
  for i in */*doc */*xls
  do
   cp $i $HOME/backups/$i
  done
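One possible answer, as a rough sketch only: on a POSIX system the C library has a glob() call that expands wildcards the way the Shell does. The function names expand_pattern and copy_commands are hypothetical, the home directory is passed in as a parameter, and the commands are only built, not run. Like the unquoted $i in the Shell loop, it makes no attempt to cope with filenames containing spaces.

```cpp
#include <glob.h>     // POSIX glob(); not part of standard C++
#include <string>
#include <vector>

// Expand one Shell-style wildcard pattern into the matching paths.
// Returns an empty list if nothing matches or glob() reports an error.
std::vector<std::string> expand_pattern(const std::string &pattern)
{
    std::vector<std::string> matches;
    glob_t results = {};   // zero-initialised before use

    if (glob(pattern.c_str(), 0, nullptr, &results) == 0)
    {
        for (size_t i = 0; i < results.gl_pathc; i++)
            matches.push_back(results.gl_pathv[i]);
    }

    globfree(&results);    // release whatever glob() allocated
    return matches;
}

// Build (but do not run) one "cp" command per matching file,
// mirroring:  cp $i $HOME/backups/$i
std::vector<std::string> copy_commands(const std::vector<std::string> &patterns,
                                       const std::string &home)
{
    std::vector<std::string> commands;

    for (const std::string &pattern : patterns)
        for (const std::string &path : expand_pattern(pattern))
            commands.push_back("cp " + path + " " + home + "/backups/" + path);

    return commands;
}
```

Calling copy_commands({"*/*doc", "*/*xls"}, home) and passing each result to system() would roughly reproduce the Shell loop. That is some 30 lines of C++ against 4 lines of Shell.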
Access to environment variables is also more awkward in the HLL:
  #include <stdlib.h>

  char *homestring = getenv ( "HOME" );
Constructing command-lines in general, with environment variables, variable and wildcard filenames, and piping and redirection, is much more cumbersome. And another drawback is that the programs have to be recompiled any time you move to a new system.
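To give a feel for the extra work, here is a minimal sketch of building a single command-line with an environment variable and redirection. The names env_or and backup_command are made up for illustration, and the fallback value "." is just an assumption. Note that getenv() returns a null pointer when the variable is not set, which the Shell never makes you think about.

```cpp
#include <stdlib.h>
#include <string>

// Return the value of an environment variable, or a fallback if it is unset.
// getenv() returns a null pointer for an unset variable, so it must be
// checked before being turned into a std::string.
std::string env_or(const char *name, const std::string &fallback)
{
    const char *value = getenv(name);
    return value ? std::string(value) : fallback;
}

// Build (but do not run) the equivalent of the Shell command-line:
//   cp FILE $HOME/backups/FILE > LOG 2>&1
std::string backup_command(const std::string &file, const std::string &log)
{
    std::string home = env_or("HOME", ".");
    return "cp " + file + " " + home + "/backups/" + file
           + " > " + log + " 2>&1";
}
```

The result would be passed to system() as before. In Shell the whole thing is the one line shown in the comment.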



Data structures and arithmetic in C++ v. Shell

On the plus side, of course, sophisticated data structures exist in the HLL and not in Shell. For instance, there are no types in Shell. You cannot do:

  int i=0;

  if ( condition ) i++;
All you have in Shell are simple TAG=VALUE pairs of text strings. So you can only set simple flags:
 flag=0

 if condition
 then
  flag=1
 fi



Although in fact there is a way to do arithmetic comparisons. For instance, test if the argument is less than 50:

 if test $1 -lt 50
And using the "expr" command, you can do arithmetic:

# x = $1 + $2

x=`expr $1 + $2`

echo $x

In fact, for the flag above, we can write the following in Shell:

flag=0
echo $flag

while test $flag -lt 30
do
 flag=`expr $flag + 1`
 echo $flag
done

which is equivalent to the following in C++:
int flag = 0;
cout << flag << "\n";

while ( flag < 30 )
{
 flag++;
 cout << flag << "\n";
}

The C++ program will run a lot faster of course!

Question - Why does the C++ program run a lot faster?


Hint:

$ which expr
/bin/expr

$ ls -l /bin/expr
-r-xr-xr-x   1 bin      bin        20988 May  3  1996 /bin/expr
  flag++;   = say 3 machine instructions (what 3?)
30 times this = 90 machine instructions.

  flag=`expr $flag + 1`   = say 300 machine instructions (why?)
30 times this = 9000 machine instructions.
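One way to see where the extra instructions go is to do the increment from C++ the way the Shell does it. This is only a sketch: the function names are hypothetical, and popen() is a POSIX call that actually starts a Shell first, which then starts expr, so it is if anything a little more expensive than the backquotes.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string>

// Increment a number the way the Shell loop does: start another program
// (expr), pass the number as text, and read the answer back as text.
// Returns false if the command could not be run or printed nothing.
bool increment_via_expr(long n, long &result)
{
    std::string command = "expr " + std::to_string(n) + " + 1";

    FILE *pipe = popen(command.c_str(), "r");   // new process(es) + a pipe
    if (pipe == nullptr)
        return false;

    char buf[32] = "";
    char *line = fgets(buf, sizeof buf, pipe);  // answer arrives as text
    pclose(pipe);                               // wait for the child to exit

    if (line == nullptr)
        return false;

    result = strtol(buf, nullptr, 10);          // convert text back to a number
    return true;
}

// Increment a number the way the C++ loop does: in place, in this process.
long increment_in_process(long n)
{
    return n + 1;
}
```

Both functions give the same answer, but the first has to create a process, load a program from disk, parse text arguments, print text and exit, every time round the loop.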



So despite the fact that Shell environment variables are only text strings, and have no types, we can use other programs (test, expr) that interpret their text string arguments in certain ways, and so we can use them as numeric types after all.

But this is only the start of it of course. In C++, you also have large data types like arrays, recursive function calls, object-oriented classes with inheritance, libraries of useful functions, and all the other equipment of a HLL. In Shell, you need to construct your functionality by piping together lots of tools at the command-line. This can get very slow and cumbersome for large, complex programs.

And obviously for any serious application (for instance, anything with a windowed user interface, or anything with threads) you turn to a HLL.


Conclusion

I tend to write my utilities in Shell if possible. For the more complex utilities, I move up to C++. For the simpler utilities I try to express them as aliases (see below).

Often, I use both. I surround a C++ utility with a small Shell wrapper that prepares the filenames and environment variables, calls the C++ program, and then possibly does some processing of the output.

In my CGI scripts, I surround Shell utilities with C++ input pre-processing wrappers.



Perl

Perl is an interpreted language designed to give much of the functionality of a language like C++ in the interpreted world of Shell - with direct access to the command-line.

Perl is particularly popular for CGI scripts, though it is by no means necessary for them.



Aliases

For very short and simple Shell programs, it is more efficient to replace them with aliases in your .cshrc file. For instance, if you regularly need to log in to some other server, put the following in .cshrc:

  alias t 'telnet -l userid remoteserver.dcs.university.ac.uk'
and then, to connect to it, just type:
  t
Or if you regularly need to jump to your web directory, put the following in .cshrc:
  alias cdp 'cd $home/public_html'
and then, any time you want to jump to that directory, just type:
  cdp
This straight text substitution at the command-line is more efficient than starting up a Shell script that needs parameters set up for it (environment variables, command-line arguments) and then needs to be interpreted. It is like the difference between a macro text substitution (a #define) and a run-time procedure call.

In fact, in the case of "cdp" above, a Shell program won't work since a Shell program can't change the directory of the parent process that called it.
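The same rule can be demonstrated from C++ on a POSIX system. In this sketch (the function names are my own) a child process changes its directory and exits, and the parent's directory is unaffected, which is exactly what happens when a Shell script run as a separate process does a cd.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Current working directory of this process, or "" on error.
std::string current_dir()
{
    char buf[4096];
    return getcwd(buf, sizeof buf) ? std::string(buf) : std::string();
}

// Fork a child that changes its own directory to target, then exits.
// Returns true if the child's chdir() succeeded.
bool chdir_in_child(const char *target)
{
    pid_t pid = fork();
    if (pid < 0)
        return false;                       // fork failed

    if (pid == 0)
    {
        // Child: this changes only the child's working directory.
        _exit(chdir(target) == 0 ? 0 : 1);
    }

    // Parent: wait for the child and report how it got on.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

If you record current_dir(), call chdir_in_child("/"), and check current_dir() again, the parent is still where it started. An alias avoids the problem because the cd is run by the login Shell itself.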




Summary - What language should I use?

I want to customise my system, and automate many tasks.
Like any good programmer, I am always starting to write programs. How should I approach writing small custom utilities?

  1. Very simple customisation - Check out program preferences or command-line arguments.
  2. 1-liner utilities - aliases
  3. Command-line utilities with some logic - Shell
  4. Complex command-line utilities - Perl
  5. Small applications doing lots of calculations - C++ (or HLL of your choice)
  6. Complex applications - Before investing a load of time in writing it yourself, maybe look online for freeware, shareware, or even a full commercial product.




.BAT files in DOS

The DOS command-line on Microsoft Windows also has a scripting language. You put your commands in a "batch file" with a name like PROG.BAT, and then to run it type PROG. The scripting language is a lot more primitive than Shell. For instance, in the classic DOS batch language there is an IF statement, but no ELSE (the later cmd.exe on Windows NT-family systems does have one), so you must repeat the condition:

  @echo off
  if     exist "C:\Program Files\Netscape" echo Netscape is installed.
  if not exist "C:\Program Files\Netscape" echo Netscape is not installed.
Also, in the classic DOS batch language the body of the IF statement can only have 1 instruction (the later cmd.exe allows a parenthesised block). This can be got around (as can the ELSE problem) by using a GOTO to jump to a label further down, or a CALL of another batch file, but this is all a lot more awkward than in Shell.

You also have string compare:

  if '%1'=='' echo No arg.
FOR loops:
  for %%i in (*.html) do call secondprog %%i
Environment variables:
  echo path is %path%

  copy %1 %homepath%\backup

Explore the site below for more examples.

Similar to UNIX, on DOS/Windows you may or may not make use of this programmable command-line (you can survive without ever going near it). If you use it you might adopt a similar policy - write your utilities if possible as batch files, and only turn to a HLL for the more complex utilities.

For example:

  1. Earlier in my career, I spent most of my time on DOS/Windows systems, and wrote all my utilities in a combination of DOS BAT files and Pascal EXEs.
  2. For the last number of years, I have spent most of my time on UNIX systems, and have a similar system, with all my utilities in a combination of Shell scripts and C++ binaries.
You may prefer to use a combination of Perl scripts and Java programs (respectively). The principle is the same.