Using the paralleljob command to submit jobs
On AVIDD, a script named paralleljob provides a convenient method for submitting some parallel (multiple-processor) programs to the PBS batching and queuing system. Suitable programs must consist of just one executable file (in contrast to some master/worker programs in which the master and workers are different executable files). The script is designed so that programs can be submitted as jobs by prefixing the command line with the word paralleljob and by specifying the number of processors to use. The general form of the command is:
paralleljob program-name [ program-options ] [ -CPUS np ] [ -wallhours n ] [ -- options-to-pbs ]
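where program-name is the name of the program that you wish to submit as a job, program-options are the command-line options that you would normally pass to the program, np is the number of processors to use, n is the number of wall-clock hours to request, and options-to-pbs are options that paralleljob passes on to PBS.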
For example, suppose you've written a program called speedster that takes options that specify speed and the name of the file to be processed. To run the program with 16 processors, you would enter the command
paralleljob speedster -speed super mydata.dat -CPUS 16
If the program that you wish to run is not on your default path, use the fully qualified path name of the program.
When your job runs, the current working directory of your program is the directory from which you ran the paralleljob command.
The wallhours option is a convenient method of requesting more than the default amount of wall-clock time for a job. On AVIDD, jobs are allowed to run for two hours by default. If you need more time, specify the amount of time as an integer number of hours. For example, to run speedster and allow it to run for 26 hours, the command would be
paralleljob speedster -speed super mydata.dat -CPUS 16 -wallhours 26
Alternatively, you could request the same amount of time by passing the PBS walltime resource option after the double dash:
paralleljob speedster -speed super mydata.dat -CPUS 16 -- -l walltime=26:0:0
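The two commands above request the same limit in different notations: a whole number of hours for -wallhours, and an hours:minutes:seconds string for PBS. As a minimal sketch of that correspondence, the function below turns one into the other. The name wallhours_to_walltime is hypothetical and is not part of paralleljob; the zero-padded form it prints (26:00:00) should be read by PBS the same way as the 26:0:0 shown above.

```shell
# Hypothetical helper, not part of paralleljob: convert a whole number of
# hours into a PBS-style walltime string (hours:minutes:seconds).
wallhours_to_walltime() {
    hours=$1
    case "$hours" in
        # Reject an empty value or anything containing a non-digit.
        ''|*[!0-9]*)
            echo "wallhours must be a whole number of hours" >&2
            return 1
            ;;
    esac
    printf '%s:00:00\n' "$hours"
}

wallhours_to_walltime 26
```

Because -wallhours takes an integer, a limit that is not a whole number of hours would have to be written directly in the PBS notation after the double dash.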
Paralleljob should be on your path by default, and its manual page should be on your MANPATH by default. The best source of information about paralleljob is its Unix manual page.
Limits
Paralleljob works only for parallel applications that are "single program multiple data" (i.e., a single binary). It does not work for programs that are "multiple programs multiple data" (i.e., programs that consist of more than one binary).
Paralleljob will not work for programs that make use of a -- (double-dash) argument because paralleljob uses that argument to separate options to your program from options to PBS.
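To see why a program's own -- argument causes trouble, consider what any wrapper that divides its command line at the first double dash will do. The sketch below is only an illustration of that idea, not paralleljob's actual implementation; the function name split_at_double_dash and the program name mytool are hypothetical.

```shell
# Illustration only, not paralleljob's code: label each argument according
# to which side of the first "--" it falls on.
split_at_double_dash() {
    side="before"
    for arg in "$@"; do
        if [ "$side" = "before" ] && [ "$arg" = "--" ]; then
            # Everything after the first "--" is treated as a PBS option.
            side="pbs"
            continue
        fi
        printf '%s: %s\n' "$side" "$arg"
    done
}

# Intended use: the double dash separates PBS options from the rest.
split_at_double_dash speedster -speed super mydata.dat -CPUS 16 -- -l walltime=26:0:0

# A program that expects its own "--": input.txt ends up on the PBS side.
split_at_double_dash mytool -v -- input.txt
```

In the second call, input.txt is labeled as a PBS option even though it was meant for the program, which is the conflict described above.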
If you need to quote arguments, paralleljob handles only double-quotes. It cannot provide the protection that is usually afforded by single-quotes because the Bourne shell provides no mechanism for escaping characters within strings in single-quotes. Single-quotes are treated as double-quotes by paralleljob.
I/O redirection is not possible with paralleljob. Briefly, paralleljob uses the mpiexec command to launch processes, and mpiexec does not support I/O redirection. If you need I/O redirection, you will have to prepare your own batch script and use the mpirun command that is associated with the compiler that was used to compile your program.
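A batch script of the kind described above might look like the following sketch. Everything specific in it is an assumption made for illustration: the node layout (8 nodes with 2 processors each, to reach 16 processes), the output file names, and above all the mpirun options, which differ between MPI implementations. Check the documentation for the mpirun that belongs to your compiler before using it.

```shell
#!/bin/sh
# Sketch of a hand-written PBS batch script; adjust every value to your job.
# Assumed layout: 8 nodes x 2 processors per node = 16 processes.
#PBS -l nodes=8:ppn=2
#PBS -l walltime=26:00:00

# PBS starts the script in your home directory; change to the directory
# from which the job was submitted.
cd "$PBS_O_WORKDIR" || exit 1

# The -np and -machinefile options are assumptions; your mpirun may use
# different names. PBS_NODEFILE lists the nodes assigned to the job.
# Standard output and standard error are redirected to files.
mpirun -np 16 -machinefile "$PBS_NODEFILE" \
    ./speedster -speed super mydata.dat > speedster.out 2> speedster.err
```

If the script were saved as speedster.pbs (a hypothetical file name), it would be submitted with the PBS qsub command, for example: qsub speedster.pbs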




