Condor jobs submitted via CMS-Connect are automatically reported to the CMS Dashboard, much as CRAB jobs are. A basic report that requires no action from the user is sent by default, but users are encouraged to provide a few parameters in their submission workflows in order to handle e.g. stage-in, stage-out and full error code management in the report.
The reporting procedure is done in 2 steps:
- Report from the Submission Machine:
The whole task is registered and sent to Dashboard from the submission machine while using condor_submit.
- Report from the Worker Node:
Each job is reported once it is assigned to an available machine and executed from it.
As opposed to regular CRAB workflows, users define their own submission scripts in CMS-Connect (as in any regular condor workflow). As a result, tasks like stage-in, stage-out and error code management are implemented and handled by each user, so only a few parameters are reported by default without any further action from the user.
Basic Report (Default)
The basic report is handled by the CMS-Connect wrappers and requires no user-side action. This report includes the following:
- Start and End time of report
- Executable CPU and WallClock time
- Executable exit code
Please note that if the user submits a wrapper on top of the executable, the wrapper's exit code and times will be reported, unless the user specifies these values (see Advanced Report).
- Hostname of machine where the job was executed
- Computing Element Name
Please see the Advanced Report to report stage-in/stage-out times and exit codes or the number of events in the job, or to override some of the default parameters.
Advanced Report
The user can specify the following parameters to report additional information from the worker node to the CMS Dashboard. The only requirement is to print them out in the format:
PARAMETER = VALUE
# Example: Print this out at the end of your job to report the number of events in it.
CMS_DASHBOARD_N_EVENTS = 5000
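As a minimal sketch of the format above, the following Python helper builds and prints such lines. The function names `dashboard_line` and `report` are hypothetical, and writing to standard output is an assumption, since the text only says to print the parameters out.

```python
def dashboard_line(parameter, value):
    """Format one report line as 'PARAMETER = VALUE'."""
    return f"{parameter} = {value}"


def report(parameters):
    """Print one 'PARAMETER = VALUE' line per entry (standard output is assumed)."""
    for name, value in parameters.items():
        print(dashboard_line(name, value), flush=True)


if __name__ == "__main__":
    # The event count is simply the number used in the example above.
    report({"CMS_DASHBOARD_N_EVENTS": 5000})
```

Calling `report` once at the end of the job keeps all reported parameters in one place, which makes them easier to find in the job's output.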
The following table lists the parameters that can be reported from the user side, together with the default values used in the basic report.
|CMS_DASHBOARD_N_EVENTS||Number of events in the job. Default: 0|
|CMS_DASHBOARD_EXE_WC_TIME||Executable wall clock time. Default: Condor executable WC time.|
|CMS_DASHBOARD_EXE_CPU_TIME||Executable CPU time. Default: Condor executable CPU time.|
|CMS_DASHBOARD_EXE_EXIT_CODE||Executable exit code. Default: Condor executable exit code.|
|Note: The user might want to override the default values for EXE_WC_TIME, EXE_CPU_TIME and EXE_EXIT_CODE in cases where, e.g., the Condor executable is just a user wrapper running the actual executable.|
|CMS_DASHBOARD_STAGEOUT_SE||Storage Element name. Default: unknown.|
|CMS_DASHBOARD_STAGEOUT_EXIT_CODE||Stage out exit code.|
|CMS_DASHBOARD_STAGEOUT_TIME||Stage out time.|
|CMS_DASHBOARD_JOB_EXIT_CODE||Job Exit code. Default: Executable exit code.|
|Note: Users can report their own job exit codes to handle the overall completion state of the job.|
|CMS_DASHBOARD_JOB_EXIT_REASON||Job Exit Reason. Default: Empty|
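The case described in the note, where the Condor executable is only a wrapper around the actual executable, could be handled along the following lines. This is a sketch under several assumptions rather than the CMS-Connect implementation: the helper names `run_timed` and `build_report` are hypothetical, times are assumed to be reported in whole seconds, the full name CMS_DASHBOARD_EXE_EXIT_CODE is inferred from the EXE_EXIT_CODE shorthand in the note, and the rule "the job fails if either the executable or the stage-out fails" is just one possible policy. The two commands in the usage part are stand-ins for a real executable and a real stage-out command.

```python
import os
import subprocess
import sys
import time


def run_timed(command):
    """Run a command and return (exit_code, wall_seconds, cpu_seconds)."""
    cpu_before = os.times()
    wall_before = time.monotonic()
    exit_code = subprocess.call(command)
    wall = time.monotonic() - wall_before
    cpu_after = os.times()
    # CPU time (user + system) consumed by child processes that have finished.
    cpu = (cpu_after.children_user - cpu_before.children_user) + (
        cpu_after.children_system - cpu_before.children_system
    )
    return exit_code, wall, cpu


def build_report(exe, stageout, storage_element, n_events):
    """Map measured values onto the parameter names from the table."""
    exe_code, exe_wall, exe_cpu = exe
    out_code, out_wall, _ = stageout
    # Assumed policy: the job fails if the executable or the stage-out fails.
    job_code = exe_code if exe_code != 0 else out_code
    report = {
        "CMS_DASHBOARD_N_EVENTS": n_events,
        # Whole seconds are an assumption; the expected unit is not stated.
        "CMS_DASHBOARD_EXE_WC_TIME": round(exe_wall),
        "CMS_DASHBOARD_EXE_CPU_TIME": round(exe_cpu),
        "CMS_DASHBOARD_EXE_EXIT_CODE": exe_code,
        "CMS_DASHBOARD_STAGEOUT_SE": storage_element,
        "CMS_DASHBOARD_STAGEOUT_EXIT_CODE": out_code,
        "CMS_DASHBOARD_STAGEOUT_TIME": round(out_wall),
        "CMS_DASHBOARD_JOB_EXIT_CODE": job_code,
    }
    # Only report a reason when something went wrong (illustrative wording).
    if exe_code != 0:
        report["CMS_DASHBOARD_JOB_EXIT_REASON"] = "executable failed"
    elif out_code != 0:
        report["CMS_DASHBOARD_JOB_EXIT_REASON"] = "stage-out failed"
    return report


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere; replace them with the
    # actual executable and stage-out commands of the job.
    exe = run_timed([sys.executable, "-c", "print('running analysis')"])
    stageout = run_timed([sys.executable, "-c", "print('copying output')"])
    for name, value in build_report(exe, stageout, "unknown", 5000).items():
        print(f"{name} = {value}", flush=True)
```

Measuring the executable and the stage-out separately is what lets the wrapper report the times and exit code of the actual executable instead of its own. A real wrapper would probably also exit with the job exit code it reports.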
You can follow this Twiki link to find more information about job monitoring with CMS Dashboard.
Historical View Example
Example CMS-Connect jobs reported to Dashboard.