Simple Globus Job Runs on the TeraGrid

This page is a screen dump of simple Globus trial jobs run on the TeraGrid. It assumes you have already set up SSH key and GSI authentication (Globus certificates). If you have not, see this page first: Setting up your Teragrid account for easier access


Here is a useful tutorial on Globus jobs (it may be dated): http://npacigrid.npaci.edu/tutorial.html#globusjobs


  • Login: Log in from peart to NCSA:

    ag@peart agopu % ssh tg-login1.ncsa.teragrid.org
    ... M.o.t.d.
    ...
    less /etc/motd to see this message again
    Directory: /home/ncsa/agopu
    Thu Dec 16 15:20:38 CST 2004
    
    ncsa/agopu> cd tmp/
          
  • Proxy: Create a proxy certificate with grid-proxy-init:

    agopu/tmp> grid-proxy-init
    Your identity: /C=US/O=National Center for Supercomputing Applications/CN=Arvind Gopu
    Enter GRID pass phrase for this identity:
    Creating proxy ................................................. Done
    Your proxy is valid until: Fri Dec 17 03:32:25 2004
    
    /C=US/O=National Center for Supercomputing Applications/CN=Arvind Gopu/CN=proxy
    
  • Proxy check: Verify that the proxy works by authenticating to SDSC from NCSA:

    agopu/tmp> globusrun -a -r tg-login.sdsc.teragrid.org
    GRAM Authentication test successful
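    
    The proxy and authentication checks above could be combined in a small script. This is a minimal sketch, not part of the original session: proxy_has_time is a hypothetical helper name, the 3600-second threshold is an arbitrary choice, and the commented usage assumes that grid-proxy-info -timeleft prints the remaining proxy lifetime in seconds.

    ```shell
    # proxy_has_time SECONDS_LEFT MIN_SECONDS   (hypothetical helper)
    # Succeeds only when SECONDS_LEFT is a whole number >= MIN_SECONDS.
    proxy_has_time() {
        left=$1
        min=$2
        case $left in
            ''|*[!0-9]*) return 1 ;;   # empty, negative or non-numeric: no usable proxy
        esac
        [ "$left" -ge "$min" ]
    }

    # Possible usage on a login node (assumes the Globus client tools are on PATH):
    #
    #   if proxy_has_time "$(grid-proxy-info -timeleft 2>/dev/null)" 3600; then
    #       globusrun -a -r tg-login.sdsc.teragrid.org
    #   else
    #       grid-proxy-init
    #   fi
    ```

    Keeping the comparison in a separate function means the threshold logic can be checked without a live proxy.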
    
  • HelloWorld!: Run a HelloWorld job at SDSC:

    agopu/tmp> more helloworld.rsl
    &
    (executable=/bin/echo)
    (arguments="Hello World!")
    
    agopu/tmp> globusrun -s -r tg-login.sdsc.teragrid.org -f /home/ncsa/agopu/tmp/helloworld.rsl
    Hello World!
          
  • Job on remote site: Run a printDate job and check that an output file was created at the remote end:

    agopu/tmp> more printDate.rsl
    &
    (executable=/bin/date)
    (stdout=remotePrintDate.out)
    
    agopu/tmp> globusrun -s -r tg-login.sdsc.teragrid.org -f /home/ncsa/agopu/tmp/printDate.rsl
    
    agopu/tmp> gsissh tg-login.sdsc.teragrid.org
    Welcome to the TeraGrid Itanium2 Linux Cluster
    ...
    ...
    Directory: /users/ux455224 Thu Dec 16 13:28:05 PST 2004
    
    /users/ux455224> ls -l remotePrintDate.out
    -rw-------    1 ux455224 tgu221         29 2004-12-16 13:27 remotePrintDate.out
    
    /users/ux455224> more remotePrintDate.out
    Thu Dec 16 13:27:38 PST 2004
    
    /users/ux455224> logout
    Connection to tg-login.sdsc.teragrid.org closed.
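    
    The two single-site RSL files used so far (helloworld.rsl and printDate.rsl) share the same shape, so they could be generated rather than typed by hand. The following is a rough sketch under stated assumptions: make_rsl is a hypothetical function name, it only emits the three attributes that appear above (executable, arguments, stdout), it writes the arguments as one quoted string as in helloworld.rsl, and it makes no attempt to escape double quotes inside the argument text.

    ```shell
    # make_rsl EXECUTABLE [ARGUMENTS] [STDOUT_FILE]   (hypothetical helper)
    # Prints a single-job RSL request on standard output.
    make_rsl() {
        exe=$1
        args=${2:-}
        out=${3:-}
        printf '&\n(executable=%s)\n' "$exe"
        if [ -n "$args" ]; then
            # Whole argument text is emitted as one quoted string
            printf '(arguments="%s")\n' "$args"
        fi
        if [ -n "$out" ]; then
            # File name is taken as given; no path is added
            printf '(stdout=%s)\n' "$out"
        fi
    }

    # Possible usage, recreating the two files shown above:
    #
    #   make_rsl /bin/echo "Hello World!" > helloworld.rsl
    #   make_rsl /bin/date "" remotePrintDate.out > printDate.rsl
    ```

    Passing an empty string as the second parameter skips the arguments line, which is how the printDate form is produced.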
    
  • Multi-site job: Run a job that prints the hostname of the login node at each of several sites:

    agopu/tmp> more printHostname.rsl
    + 
    ( & (resourceManagerContact="tg-login.caltech.teragrid.org/jobmanager-fork")
        (executable=/bin/hostname)
        (arguments=-f)
    )
    ( & (resourceManagerContact="tg-login1.iu.teragrid.org")
        (executable=/bin/hostname)
        (arguments=-f)
    )
          
    agopu/tmp> globusrun -s -f /home/ncsa/agopu/tmp/printHostname.rsl
    
    tg-login1.caltech.teragrid.org
    th1.avidd.iu.edu
    
    Note: Jobs that use significant resources at multiple sites simultaneously are not easily scheduled.

    Also, problems have been reported with current versions of Globus/MPICH-G2. This thread on the Globus forum has relevant information about using mpirun in place of globusrun.
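
    The multi-request in printHostname.rsl repeats the same sub-request once per site, so it could be built from a list of resource manager contacts. This is an illustrative sketch only: make_multirequest is a hypothetical function name, and it assumes every site runs the same executable with the same unquoted arguments, as in the file above.

    ```shell
    # make_multirequest EXECUTABLE ARGUMENTS CONTACT...   (hypothetical helper)
    # Prints a "+" multi-request with one sub-request per contact string.
    make_multirequest() {
        exe=$1
        args=$2
        shift 2
        printf '+\n'
        for contact in "$@"; do
            printf '( & (resourceManagerContact="%s")\n' "$contact"
            printf '    (executable=%s)\n' "$exe"
            printf '    (arguments=%s)\n' "$args"
            printf ')\n'
        done
    }

    # Possible usage, recreating printHostname.rsl:
    #
    #   make_multirequest /bin/hostname -f \
    #       "tg-login.caltech.teragrid.org/jobmanager-fork" \
    #       "tg-login1.iu.teragrid.org" > printHostname.rsl
    #   globusrun -s -f printHostname.rsl
    ```

    Adding a site then only means appending one more contact string to the call.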