Using WorkQueue

The CMS Dashboard implementation does not currently work with WorkQueue [1], so you need to disable it. You can do this by exporting CONDOR_CMS_DASHBOARD=False before starting your workers. You will also need to define the sites you want to run at via DESIRED_Sites; if this attribute is not set, US T2 and T3 sites are used by default. Here is an example:


bash
# Use US sites at all tiers
source /etc/ciconnect/set_condor_sites.sh T?_US_*

# Disable built-in Dashboard reporting
export CONDOR_CMS_DASHBOARD=False
 
# Start your workers
work_queue_factory -T condor -C <config.json> >& logfile.log &

where <config.json> provides the project name (master-name) and the resource configuration you would like for your workers:

wq_config.json example
{
  "master-name": "lobster_tau_v29",
  "max-workers": 1000,
  "min-workers": 0,
  "cores": 4,
  "memory": 7200,
  "disk": 8000,
  "tasks-per-worker": 2.0
}
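
Once the factory is running, a quick way to confirm that worker jobs are actually being submitted is to inspect your local HTCondor queue. This is just a minimal sanity check, not part of the WorkQueue tooling itself, and the exact output format depends on your HTCondor version:

bash
# Worker jobs submitted by the factory show up as
# work_queue_worker entries in your HTCondor queue
condor_q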

Connecting your workers to a Master in a remote network behind a firewall

If you have a WorkQueue master on a machine in a remote network sitting behind a firewall, but you can still ssh to it from login-el7.uscms.org, you can connect your workers to it through an ssh tunnel and foremen.

  1. First, you need to find out your Master's port number. You can do so by querying work_queue_status from any machine with cctools set up; use -M to select the project name.

    bash
    $ work_queue_status -M lobster_tau_v29

    PROJECT            HOST                   PORT WAITING RUNNING COMPLETE WORKERS
    lobster_tau_v29    earth.crc.nd.edu       9000       0       0        0       0
  2. Now, you can create an ssh tunnel from login-el7.uscms.org (a quick way to verify the tunnel is sketched after this list):

    bash
    [login-el7.uscms.org]$ ssh -L 9666:localhost:9000 <user>@<remote-machine> -N &
  3. To start a foreman:

    bash
    #! /bin/bash
    nohup work_queue_worker -d all --foreman-name tau_v29-1 -s /home/<user>/wq-foremen \
        --specify-log foreman1.log -o foreman1.debug localhost 9666 > nohup_foreman1.log 2>&1 &

    You can start more foremen by increasing the foreman-name counter from tau_v29-1 to tau_v29-2, tau_v29-3, etc. (see the loop sketch after this list).

  4. Your foremen should connect to your Master as workers. You can verify this with work_queue_status, as in the first step.

    bash
    $ work_queue_status -M lobster_tau_v29

    PROJECT            HOST                   PORT WAITING RUNNING COMPLETE WORKERS
    lobster_tau_v29    earth.crc.nd.edu       9000       0       0        0       1
  5. Now, you can run your factory as usual. For example:

    bash
    #! /bin/bash
    # Use US sites at all tiers
    source /etc/ciconnect/set_condor_sites.sh T?_US_*
    # Disable built-in Dashboard reporting
    export CONDOR_CMS_DASHBOARD=False
     
    # Start your workers
    work_queue_factory -T condor -C <config.json> >& logfile.log &


    Note: you need to define your foremen in the factory configuration. Here is an example for illustration purposes; refer to [1] for more information.

    bash
    $ cat wq_config.json 
    {
      "master-name": "lobster_tau_v29",
      "foremen-name": "tau_v29.*"
      "max-workers": 1000,
      "min-workers": 0,
      "cores": 4,
      "memory": 7200,
      "disk": 8000,
      "tasks-per-worker": 2.0
    }
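
As a sanity check for the tunnel created in step 2, you can point work_queue_status directly at the local end of the tunnel; it should report the same project as the catalog query in step 1. This assumes the tunnel is up on port 9666 of login-el7.uscms.org:

bash
# Query the master through the ssh tunnel (local port 9666)
$ work_queue_status localhost 9666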

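If you need several foremen (step 3), the same work_queue_worker invocation can be wrapped in a small loop instead of being retyped with a new counter each time. This is only a sketch: NUM_FOREMEN is an illustrative variable, and the <user> placeholder, paths, and names should be adjusted to your setup:

bash
#! /bin/bash
# Sketch: start NUM_FOREMEN foremen named tau_v29-1, tau_v29-2, ...,
# each writing to its own log files
NUM_FOREMEN=3
for i in $(seq 1 $NUM_FOREMEN); do
  nohup work_queue_worker -d all --foreman-name tau_v29-$i -s /home/<user>/wq-foremen \
      --specify-log foreman$i.log -o foreman$i.debug localhost 9666 \
      > nohup_foreman$i.log 2>&1 &
done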

[1] http://ccl.cse.nd.edu/software/manuals/workqueue.html