Automatically Killing a Runaway Job

Jobs that go into a Runaway status (that is, Jobs that have been executing for too long) can be handled automatically by adding a Runaway Element:

  1. Right-click the Job or Folder and select Properties.
  2. Navigate to the Elements tab and select Add.
  3. In the dialog window, expand the EventHandler menu and choose the Runaway Element.

An Elapsed Time can be entered as a Windows System Time value, or a Runaway Elapsed Percent (a percentage of the Job's normal completion time) can be set. From here, an Action can be chosen, such as cancelling the Job with any Completion Severity, or simply sending a Notification and allowing the Job to continue executing.

For example, a Job can be configured to cancel with Error severity if it runs for 5 hours, or for 500% of its normal execution time.
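
The way the two thresholds combine can be sketched in PowerShell. This is only an illustration, not JAMS code: `Test-RunawayThreshold` and its parameter names are hypothetical, and it assumes that reaching either threshold is enough to flag the Job, that a percentage of 0 turns the percentage check off, and that a zero Elapsed Time means no fixed limit is set.

```powershell
# Hypothetical helper, not part of the JAMS module. It only illustrates how an
# elapsed-time limit and a percent-of-normal limit could combine.
function Test-RunawayThreshold {
    param(
        [Parameter(Mandatory)] [TimeSpan] $Elapsed,         # how long the job has run so far
        [Parameter(Mandatory)] [TimeSpan] $AverageElapsed,  # the job's normal execution time
        [TimeSpan] $RunawayElapsed = [TimeSpan]::Zero,      # fixed limit; zero is assumed to mean "not set"
        [int] $RunawayElapsedPercent = 0                    # percent of normal time; 0 disables this check
    )

    # Fixed-time check: only applies when a limit was supplied.
    $overFixed = ($RunawayElapsed -gt [TimeSpan]::Zero) -and ($Elapsed -ge $RunawayElapsed)

    # Percentage check: needs a non-zero percentage and a known average.
    $overPercent = $false
    if (($RunawayElapsedPercent -gt 0) -and ($AverageElapsed -gt [TimeSpan]::Zero)) {
        $limitTicks = [long]($AverageElapsed.Ticks * ($RunawayElapsedPercent / 100))
        $overPercent = $Elapsed.Ticks -ge $limitTicks
    }

    return ($overFixed -or $overPercent)
}

# Normal run time 30 minutes, limits of 5 hours or 500% (150 minutes), job at 3 hours.
$avg   = New-TimeSpan -Minutes 30
$limit = New-TimeSpan -Hours 5
Test-RunawayThreshold -Elapsed (New-TimeSpan -Hours 3) -AverageElapsed $avg -RunawayElapsed $limit -RunawayElapsedPercent 500
```

Under these assumptions, the sample call flags the Job because 3 hours exceeds 500% of a 30-minute average, even though the 5-hour limit has not been reached. With the percentage left at 0, only the 5-hour limit would apply.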

Comments

  • Jared Stroebele

    The RunawayElapsedPercent cannot be blank. If it is left at 0, does it then alert only off the RunawayElapsed time?

  • Gennaro Piccolo

    Hello Jared, if it is left at 0 then it will only alert off the Runaway Elapsed, that is correct.

  • Aye Sone

    If a job is scheduled to run every 15 min, will this resubmit the job?

  • Gennaro Piccolo

    Hello Aye, this element is designed to notify or to notify and cancel a job if it runs longer than a certain length of time, or a certain percent of time longer than average.

    A resubmit or a repeat element should be used to run a job on a specific interval.

  • Artur Brandys

    So if the job is running too long and we want to restart it (because some process will hang), what should we do?

  • Gennaro Piccolo

    Hello Artur, you can use a Notification Job to kill the runaway job and resubmit it immediately. Define the code below in a PowerShell job, then set that job as the Notification Job on the job you want resubmitted when it runs long.

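    # JAMS substitutes the <<...>> parameters before the script runs.
    $JAMS_NOTIFY_REASON = "<<JAMS_NOTIFY_REASON>>"
    # Act only on runaway notifications, not on other notification reasons.
    if ($JAMS_NOTIFY_REASON -eq 'RUNAWAY')
    {
        Write-Host "Killing runaway entry <<JAMS_NOTIFY_ENTRY>>"
        Import-Module JAMS
        # Cancel the running entry without prompting, then submit the job again.
        Stop-JAMSEntry <<JAMS_NOTIFY_ENTRY>> -Confirm:$false -Server Localhost
        Submit-JAMSEntry "<<JAMS_NOTIFY_JOB_NAME>>" -Server Localhost
    }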

  • Artur Brandys

    OK thank you, but I don't know exactly what the first sign <> means, maybe value of the parameter, could you explain? Because the second, third and fourth are the full path and the name of the first job I think.

  • Gennaro Piccolo

    My apologies Artur, the web page corrupted the JAMS variable symbols I was using. It should be correct now.

  • Gennaro Piccolo

    The <> in the first line is supposed to be <<JAMS_NOTIFY_REASON>>. The web page seems to be escaping everything within the double angle brackets. I will open a ticket and send you a copy of the script so you can see it more clearly.

  • Artur Brandys

    OK thank you Gennaro.
    I think that in the script it should be not <> but <> but with that everything works fine now :)

  • Gennaro Piccolo

    I created a ticket and sent you the code so there were no issues in the translation of the code. Glad to hear everything is working fine now.

  • Dan Fowle

    For clarity, there should be two < symbols before, and two > symbols after, each of the built-in JAMS_NOTIFY_* parameters.

  • Sri Chandra

    Hey Gennaro,

    In the above PowerShell code, where are we fetching the list of jobs that have been running for a long time, so that we can pass that list to the code above to kill them and resubmit them for another run?

  • Gennaro Piccolo

    Hello Sri, this job is a notification job, defined in the Notification Job element. It does not search through the entire list of running jobs; it is attached to only one job. It is triggered when that job sends a runaway notification, and it then kills that job.

  • Sri Chandra

    Hello Gennaro,
    So can we implement the same operation for all the jobs through a single job?

  • Gennaro Piccolo

    You can assign a notification job to a folder, so that any job in that folder runs the notification job when it goes into a runaway status and is then resubmitted. It is also possible to assign it directly to a single job.